[2024-03-20 22:45:08,732][00146] Saving configuration to /workspace/metta/train_dir/p2.objt_atn.4/config.json... [2024-03-20 22:45:08,847][00146] Rollout worker 0 uses device cpu [2024-03-20 22:45:08,847][00146] Rollout worker 1 uses device cpu [2024-03-20 22:45:08,847][00146] Rollout worker 2 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 3 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 4 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 5 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 6 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 7 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 8 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 9 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 10 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 11 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 12 uses device cpu [2024-03-20 22:45:08,848][00146] Rollout worker 13 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 14 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 15 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 16 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 17 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 18 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 19 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 20 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 21 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 22 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 23 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 24 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 25 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 26 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 27 uses device cpu [2024-03-20 
22:45:08,849][00146] Rollout worker 28 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 29 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 30 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 31 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 32 uses device cpu [2024-03-20 22:45:08,849][00146] Rollout worker 33 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 34 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 35 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 36 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 37 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 38 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 39 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 40 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 41 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 42 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 43 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 44 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 45 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 46 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 47 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 48 uses device cpu [2024-03-20 22:45:08,850][00146] Rollout worker 49 uses device cpu [2024-03-20 22:45:13,223][00146] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:45:13,223][00146] InferenceWorker_p0-w0: min num requests: 16 [2024-03-20 22:45:13,285][00146] Starting all processes... [2024-03-20 22:45:13,285][00146] Starting process learner_proc0 [2024-03-20 22:45:13,383][00146] Starting all processes... 
[2024-03-20 22:45:13,386][00146] Starting process inference_proc0-0 [2024-03-20 22:45:13,387][00146] Starting process rollout_proc0 [2024-03-20 22:45:13,387][00146] Starting process rollout_proc1 [2024-03-20 22:45:13,388][00146] Starting process rollout_proc2 [2024-03-20 22:45:13,388][00146] Starting process rollout_proc3 [2024-03-20 22:45:13,388][00146] Starting process rollout_proc4 [2024-03-20 22:45:13,388][00146] Starting process rollout_proc5 [2024-03-20 22:45:13,388][00146] Starting process rollout_proc6 [2024-03-20 22:45:13,389][00146] Starting process rollout_proc7 [2024-03-20 22:45:13,389][00146] Starting process rollout_proc8 [2024-03-20 22:45:13,389][00146] Starting process rollout_proc9 [2024-03-20 22:45:13,389][00146] Starting process rollout_proc10 [2024-03-20 22:45:13,390][00146] Starting process rollout_proc11 [2024-03-20 22:45:13,391][00146] Starting process rollout_proc12 [2024-03-20 22:45:13,391][00146] Starting process rollout_proc13 [2024-03-20 22:45:13,391][00146] Starting process rollout_proc14 [2024-03-20 22:45:13,391][00146] Starting process rollout_proc15 [2024-03-20 22:45:13,391][00146] Starting process rollout_proc16 [2024-03-20 22:45:13,391][00146] Starting process rollout_proc17 [2024-03-20 22:45:13,393][00146] Starting process rollout_proc18 [2024-03-20 22:45:13,394][00146] Starting process rollout_proc19 [2024-03-20 22:45:13,394][00146] Starting process rollout_proc20 [2024-03-20 22:45:13,396][00146] Starting process rollout_proc21 [2024-03-20 22:45:13,397][00146] Starting process rollout_proc22 [2024-03-20 22:45:13,398][00146] Starting process rollout_proc23 [2024-03-20 22:45:13,399][00146] Starting process rollout_proc24 [2024-03-20 22:45:13,402][00146] Starting process rollout_proc25 [2024-03-20 22:45:13,503][00146] Starting process rollout_proc30 [2024-03-20 22:45:13,404][00146] Starting process rollout_proc26 [2024-03-20 22:45:13,425][00146] Starting process rollout_proc28 [2024-03-20 22:45:13,426][00146] Starting process 
rollout_proc29 [2024-03-20 22:45:13,407][00146] Starting process rollout_proc27 [2024-03-20 22:45:13,503][00146] Starting process rollout_proc31 [2024-03-20 22:45:13,531][00146] Starting process rollout_proc32 [2024-03-20 22:45:13,532][00146] Starting process rollout_proc33 [2024-03-20 22:45:13,532][00146] Starting process rollout_proc34 [2024-03-20 22:45:13,532][00146] Starting process rollout_proc35 [2024-03-20 22:45:13,542][00146] Starting process rollout_proc36 [2024-03-20 22:45:13,542][00146] Starting process rollout_proc37 [2024-03-20 22:45:13,545][00146] Starting process rollout_proc38 [2024-03-20 22:45:13,592][00146] Starting process rollout_proc40 [2024-03-20 22:45:13,592][00146] Starting process rollout_proc39 [2024-03-20 22:45:13,593][00146] Starting process rollout_proc41 [2024-03-20 22:45:13,622][00146] Starting process rollout_proc42 [2024-03-20 22:45:13,646][00146] Starting process rollout_proc43 [2024-03-20 22:45:13,663][00146] Starting process rollout_proc44 [2024-03-20 22:45:13,687][00146] Starting process rollout_proc45 [2024-03-20 22:45:13,700][00146] Starting process rollout_proc46 [2024-03-20 22:45:13,721][00146] Starting process rollout_proc47 [2024-03-20 22:45:13,796][00146] Starting process rollout_proc48 [2024-03-20 22:45:13,797][00146] Starting process rollout_proc49 [2024-03-20 22:45:16,302][00386] Worker 4 uses CPU cores [4] [2024-03-20 22:45:16,566][00809] Worker 25 uses CPU cores [25] [2024-03-20 22:45:16,650][00388] Worker 0 uses CPU cores [0] [2024-03-20 22:45:16,738][01288] Worker 40 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:16,746][00426] Worker 12 uses CPU cores [12] [2024-03-20 22:45:16,794][00711] Worker 17 uses CPU cores [17] [2024-03-20 22:45:16,834][00776] Worker 21 uses CPU cores [21] [2024-03-20 22:45:16,891][01415] Worker 34 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:16,950][00808] Worker 22 uses CPU cores [22] [2024-03-20 22:45:16,976][01037] Worker 33 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:16,984][00382] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:45:16,984][00382] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-03-20 22:45:17,006][00382] Num visible devices: 1 [2024-03-20 22:45:17,082][00743] Worker 20 uses CPU cores [20] [2024-03-20 22:45:17,114][00390] Worker 7 uses CPU cores [7] [2024-03-20 22:45:17,134][01002] Worker 31 uses CPU cores [31] [2024-03-20 22:45:17,150][00645] Worker 14 uses CPU cores [14] [2024-03-20 22:45:17,177][01575] Worker 46 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,198][00362] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:45:17,198][00362] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-03-20 22:45:17,219][00362] Num visible devices: 1 [2024-03-20 22:45:17,287][00362] Starting seed is not provided [2024-03-20 22:45:17,287][00362] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:45:17,287][00362] Initializing actor-critic model on device cuda:0 [2024-03-20 22:45:17,287][00362] RunningMeanStd input shape: (20,) [2024-03-20 22:45:17,288][00362] RunningMeanStd input shape: (24, 11, 11) [2024-03-20 22:45:17,288][00362] RunningMeanStd input shape: (1, 11, 11) [2024-03-20 22:45:17,288][00362] RunningMeanStd input shape: (2,) [2024-03-20 22:45:17,288][00362] RunningMeanStd input shape: (1,) [2024-03-20 22:45:17,288][00362] RunningMeanStd input shape: (1,) [2024-03-20 22:45:17,306][00387] Worker 5 uses CPU cores [5] [2024-03-20 22:45:17,306][00389] Worker 6 
uses CPU cores [6] [2024-03-20 22:45:17,322][00487] Worker 13 uses CPU cores [13] [2024-03-20 22:45:17,324][01225] Worker 37 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,342][00938] Worker 29 uses CPU cores [29] [2024-03-20 22:45:17,391][00679] Worker 18 uses CPU cores [18] [2024-03-20 22:45:17,394][00393] Worker 11 uses CPU cores [11] [2024-03-20 22:45:17,422][00385] Worker 3 uses CPU cores [3] [2024-03-20 22:45:17,442][00391] Worker 8 uses CPU cores [8] [2024-03-20 22:45:17,475][00677] Worker 16 uses CPU cores [16] [2024-03-20 22:45:17,488][01379] Worker 39 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,496][01733] Worker 49 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,510][00970] Worker 27 uses CPU cores [27] [2024-03-20 22:45:17,542][01066] Worker 28 uses CPU cores [28] [2024-03-20 22:45:17,550][00384] Worker 2 uses CPU cores [2] [2024-03-20 22:45:17,558][00383] Worker 1 uses CPU cores [1] [2024-03-20 22:45:17,575][01098] Worker 32 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,619][01161] Worker 35 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,622][01670] Worker 45 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,627][01414] Worker 38 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:45:17,638][00678] Worker 15 uses CPU 
cores [15]
[2024-03-20 22:45:17,681][01193] Worker 36 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[2024-03-20 22:45:17,758][01765] Worker 48 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[2024-03-20 22:45:17,814][00873] Worker 30 uses CPU cores [30]
[2024-03-20 22:45:17,845][01480] Worker 43 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[2024-03-20 22:45:17,859][00401] Worker 10 uses CPU cores [10]
[2024-03-20 22:45:17,863][01512] Worker 44 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[2024-03-20 22:45:17,902][00841] Worker 24 uses CPU cores [24]
[2024-03-20 22:45:17,952][00906] Worker 26 uses CPU cores [26]
[2024-03-20 22:45:17,965][01471] Worker 42 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[2024-03-20 22:45:18,006][01576] Worker 47 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[2024-03-20 22:45:18,028][00874] Worker 23 uses CPU cores [23]
[2024-03-20 22:45:18,061][00362] Created Actor Critic model with architecture:
[2024-03-20 22:45:18,061][00362] PredictingActorCritic(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (global_vars): RunningMeanStdInPlace()
        (griddly_obs): RunningMeanStdInPlace()
        (kinship): RunningMeanStdInPlace()
        (last_action): RunningMeanStdInPlace()
        (last_reward): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): GriddlyEncoder(
    (object_embedding): Sequential(
      (0): Linear(in_features=52, out_features=64, bias=True)
      (1): ELU(alpha=1.0)
      (2): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ELU(alpha=1.0)
      )
      (3): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ELU(alpha=1.0)
      )
      (4): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ELU(alpha=1.0)
      )
    )
    (encoder_head): Sequential(
      (0): Linear(in_features=7767, out_features=512, bias=True)
      (1): ELU(alpha=1.0)
      (2): Sequential(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
      )
      (3): Sequential(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
      )
      (4): Sequential(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
      )
      (5): Sequential(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): GriddlyDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=16, bias=True)
  )
)
[2024-03-20 22:45:18,068][00775] Worker 19 uses CPU cores [19]
[2024-03-20 22:45:18,095][00392] Worker 9 uses CPU cores [9]
[2024-03-20 22:45:18,110][01439] Worker 41 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[2024-03-20 22:45:18,248][00362] Using optimizer
[2024-03-20 22:45:18,456][00362] No checkpoints found
[2024-03-20 22:45:18,457][00362] No checkpoints found, init from checkpoint workspace/metta/train_dir/p2.objt_atn.3/checkpoint_p0/checkpoint_000046241_1515225088.pth
[2024-03-20 22:45:18,457][00362] EvtLoop [learner_proc0_evt_loop, process=learner_proc0] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Runner_EvtLoop', signal_name='start'), args=()
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/workspace/metta/third_party/sample_factory/sample_factory/algo/learning/learner_worker.py", line 139, in init
    init_model_data = self.learner.init()
  File "/workspace/metta/third_party/sample_factory/sample_factory/algo/learning/learner.py", line 246, in init
    self.load_from_checkpoint(self.policy_id)
  File "/workspace/metta/third_party/sample_factory/sample_factory/algo/learning/learner.py", line 364, in load_from_checkpoint
    checkpoint_dict = torch.load(self.cfg.init_checkpoint_path, map_location=self.device)
  File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 998, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 445, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 426, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'workspace/metta/train_dir/p2.objt_atn.3/checkpoint_p0/checkpoint_000046241_1515225088.pth'
[2024-03-20 22:45:18,465][00362] Unhandled exception [Errno 2] No such file or directory: 'workspace/metta/train_dir/p2.objt_atn.3/checkpoint_p0/checkpoint_000046241_1515225088.pth' in evt loop learner_proc0_evt_loop
[2024-03-20 22:45:33,221][00146] Heartbeat connected on Batcher_0
[2024-03-20 22:45:33,224][00146] Heartbeat connected on InferenceWorker_p0-w0
[2024-03-20 22:45:33,225][00146] Heartbeat connected on RolloutWorker_w0
[2024-03-20 22:45:33,226][00146] Heartbeat connected on RolloutWorker_w1
[2024-03-20 22:45:33,228][00146] Heartbeat connected on RolloutWorker_w2
[2024-03-20 22:45:33,229][00146] Heartbeat connected on RolloutWorker_w3
[2024-03-20 22:45:33,230][00146] Heartbeat connected on RolloutWorker_w4
[2024-03-20 22:45:33,231][00146] Heartbeat connected on RolloutWorker_w5
[2024-03-20 22:45:33,232][00146] Heartbeat connected on RolloutWorker_w6 [2024-03-20 22:45:33,234][00146] Heartbeat connected on RolloutWorker_w7 [2024-03-20 22:45:33,235][00146] Heartbeat connected on RolloutWorker_w8 [2024-03-20 22:45:33,236][00146] Heartbeat connected on RolloutWorker_w9 [2024-03-20 22:45:33,237][00146] Heartbeat connected on RolloutWorker_w10 [2024-03-20 22:45:33,238][00146] Heartbeat connected on RolloutWorker_w11 [2024-03-20 22:45:33,240][00146] Heartbeat connected on RolloutWorker_w12 [2024-03-20 22:45:33,241][00146] Heartbeat connected on RolloutWorker_w13 [2024-03-20 22:45:33,243][00146] Heartbeat connected on RolloutWorker_w14 [2024-03-20 22:45:33,244][00146] Heartbeat connected on RolloutWorker_w15 [2024-03-20 22:45:33,245][00146] Heartbeat connected on RolloutWorker_w16 [2024-03-20 22:45:33,246][00146] Heartbeat connected on RolloutWorker_w17 [2024-03-20 22:45:33,247][00146] Heartbeat connected on RolloutWorker_w18 [2024-03-20 22:45:33,249][00146] Heartbeat connected on RolloutWorker_w19 [2024-03-20 22:45:33,250][00146] Heartbeat connected on RolloutWorker_w20 [2024-03-20 22:45:33,251][00146] Heartbeat connected on RolloutWorker_w21 [2024-03-20 22:45:33,252][00146] Heartbeat connected on RolloutWorker_w22 [2024-03-20 22:45:33,253][00146] Heartbeat connected on RolloutWorker_w23 [2024-03-20 22:45:33,255][00146] Heartbeat connected on RolloutWorker_w24 [2024-03-20 22:45:33,256][00146] Heartbeat connected on RolloutWorker_w25 [2024-03-20 22:45:33,257][00146] Heartbeat connected on RolloutWorker_w26 [2024-03-20 22:45:33,258][00146] Heartbeat connected on RolloutWorker_w27 [2024-03-20 22:45:33,259][00146] Heartbeat connected on RolloutWorker_w28 [2024-03-20 22:45:33,261][00146] Heartbeat connected on RolloutWorker_w29 [2024-03-20 22:45:33,262][00146] Heartbeat connected on RolloutWorker_w30 [2024-03-20 22:45:33,263][00146] Heartbeat connected on RolloutWorker_w31 [2024-03-20 22:45:33,264][00146] Heartbeat connected on RolloutWorker_w32 
[2024-03-20 22:45:33,265][00146] Heartbeat connected on RolloutWorker_w33 [2024-03-20 22:45:33,266][00146] Heartbeat connected on RolloutWorker_w34 [2024-03-20 22:45:33,267][00146] Heartbeat connected on RolloutWorker_w35 [2024-03-20 22:45:33,269][00146] Heartbeat connected on RolloutWorker_w36 [2024-03-20 22:45:33,270][00146] Heartbeat connected on RolloutWorker_w37 [2024-03-20 22:45:33,271][00146] Heartbeat connected on RolloutWorker_w38 [2024-03-20 22:45:33,272][00146] Heartbeat connected on RolloutWorker_w39 [2024-03-20 22:45:33,274][00146] Heartbeat connected on RolloutWorker_w40 [2024-03-20 22:45:33,275][00146] Heartbeat connected on RolloutWorker_w41 [2024-03-20 22:45:33,276][00146] Heartbeat connected on RolloutWorker_w42 [2024-03-20 22:45:33,277][00146] Heartbeat connected on RolloutWorker_w43 [2024-03-20 22:45:33,279][00146] Heartbeat connected on RolloutWorker_w44 [2024-03-20 22:45:33,280][00146] Heartbeat connected on RolloutWorker_w45 [2024-03-20 22:45:33,281][00146] Heartbeat connected on RolloutWorker_w46 [2024-03-20 22:45:33,282][00146] Heartbeat connected on RolloutWorker_w47 [2024-03-20 22:45:33,284][00146] Heartbeat connected on RolloutWorker_w48 [2024-03-20 22:45:33,285][00146] Heartbeat connected on RolloutWorker_w49 [2024-03-20 22:45:46,235][00146] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 146], exiting... [2024-03-20 22:45:46,237][00389] Stopping RolloutWorker_w6... [2024-03-20 22:45:46,237][01670] Stopping RolloutWorker_w45... [2024-03-20 22:45:46,237][01471] Stopping RolloutWorker_w42... [2024-03-20 22:45:46,237][00645] Stopping RolloutWorker_w14... [2024-03-20 22:45:46,237][00711] Stopping RolloutWorker_w17... [2024-03-20 22:45:46,237][00393] Stopping RolloutWorker_w11... [2024-03-20 22:45:46,237][01414] Stopping RolloutWorker_w38... [2024-03-20 22:45:46,237][01225] Stopping RolloutWorker_w37... [2024-03-20 22:45:46,237][00387] Stopping RolloutWorker_w5... 
[2024-03-20 22:45:46,237][00392] Stopping RolloutWorker_w9... [2024-03-20 22:45:46,237][01066] Stopping RolloutWorker_w28... [2024-03-20 22:45:46,237][00362] Stopping Batcher_0... [2024-03-20 22:45:46,237][01576] Stopping RolloutWorker_w47... [2024-03-20 22:45:46,238][01002] Stopping RolloutWorker_w31... [2024-03-20 22:45:46,238][01098] Stopping RolloutWorker_w32... [2024-03-20 22:45:46,238][00645] Loop rollout_proc14_evt_loop terminating... [2024-03-20 22:45:46,238][01480] Stopping RolloutWorker_w43... [2024-03-20 22:45:46,238][00391] Stopping RolloutWorker_w8... [2024-03-20 22:45:46,238][00146] Runner profile tree view: main_loop: 32.9524 [2024-03-20 22:45:46,238][00426] Stopping RolloutWorker_w12... [2024-03-20 22:45:46,238][00393] Loop rollout_proc11_evt_loop terminating... [2024-03-20 22:45:46,238][01225] Loop rollout_proc37_evt_loop terminating... [2024-03-20 22:45:46,238][00383] Stopping RolloutWorker_w1... [2024-03-20 22:45:46,238][01733] Stopping RolloutWorker_w49... [2024-03-20 22:45:46,238][00390] Stopping RolloutWorker_w7... [2024-03-20 22:45:46,238][00387] Loop rollout_proc5_evt_loop terminating... [2024-03-20 22:45:46,238][01193] Stopping RolloutWorker_w36... [2024-03-20 22:45:46,238][00392] Loop rollout_proc9_evt_loop terminating... [2024-03-20 22:45:46,238][00711] Loop rollout_proc17_evt_loop terminating... [2024-03-20 22:45:46,238][01037] Stopping RolloutWorker_w33... [2024-03-20 22:45:46,238][01414] Loop rollout_proc38_evt_loop terminating... [2024-03-20 22:45:46,238][00384] Stopping RolloutWorker_w2... [2024-03-20 22:45:46,238][01471] Loop rollout_proc42_evt_loop terminating... [2024-03-20 22:45:46,238][01066] Loop rollout_proc28_evt_loop terminating... [2024-03-20 22:45:46,238][01098] Loop rollout_proc32_evt_loop terminating... [2024-03-20 22:45:46,238][00362] Loop batcher_evt_loop terminating... [2024-03-20 22:45:46,238][01379] Stopping RolloutWorker_w39... 
[2024-03-20 22:45:46,238][00146] Collected {}, FPS: 0.0 [2024-03-20 22:45:46,239][01480] Loop rollout_proc43_evt_loop terminating... [2024-03-20 22:45:46,238][01002] Loop rollout_proc31_evt_loop terminating... [2024-03-20 22:45:46,239][00391] Loop rollout_proc8_evt_loop terminating... [2024-03-20 22:45:46,239][00426] Loop rollout_proc12_evt_loop terminating... [2024-03-20 22:45:46,239][00390] Loop rollout_proc7_evt_loop terminating... [2024-03-20 22:45:46,239][01733] Loop rollout_proc49_evt_loop terminating... [2024-03-20 22:45:46,238][00743] Stopping RolloutWorker_w20... [2024-03-20 22:45:46,239][00384] Loop rollout_proc2_evt_loop terminating... [2024-03-20 22:45:46,239][01193] Loop rollout_proc36_evt_loop terminating... [2024-03-20 22:45:46,239][01037] Loop rollout_proc33_evt_loop terminating... [2024-03-20 22:45:46,239][01379] Loop rollout_proc39_evt_loop terminating... [2024-03-20 22:45:46,239][00873] Stopping RolloutWorker_w30... [2024-03-20 22:45:46,239][00388] Stopping RolloutWorker_w0... [2024-03-20 22:45:46,239][00808] Stopping RolloutWorker_w22... [2024-03-20 22:45:46,240][00388] Loop rollout_proc0_evt_loop terminating... [2024-03-20 22:45:46,240][00743] Loop rollout_proc20_evt_loop terminating... [2024-03-20 22:45:46,240][00873] Loop rollout_proc30_evt_loop terminating... [2024-03-20 22:45:46,240][00808] Loop rollout_proc22_evt_loop terminating... [2024-03-20 22:45:46,237][00776] Stopping RolloutWorker_w21... [2024-03-20 22:45:46,242][01415] Stopping RolloutWorker_w34... [2024-03-20 22:45:46,243][01415] Loop rollout_proc34_evt_loop terminating... [2024-03-20 22:45:46,243][00385] Stopping RolloutWorker_w3... [2024-03-20 22:45:46,243][00776] Loop rollout_proc21_evt_loop terminating... [2024-03-20 22:45:46,243][00874] Stopping RolloutWorker_w23... [2024-03-20 22:45:46,243][01575] Stopping RolloutWorker_w46... [2024-03-20 22:45:46,243][01161] Stopping RolloutWorker_w35... [2024-03-20 22:45:46,244][00385] Loop rollout_proc3_evt_loop terminating... 
[2024-03-20 22:45:46,244][00775] Stopping RolloutWorker_w19... [2024-03-20 22:45:46,244][00874] Loop rollout_proc23_evt_loop terminating... [2024-03-20 22:45:46,244][01575] Loop rollout_proc46_evt_loop terminating... [2024-03-20 22:45:46,244][01161] Loop rollout_proc35_evt_loop terminating... [2024-03-20 22:45:46,244][00775] Loop rollout_proc19_evt_loop terminating... [2024-03-20 22:45:46,237][00841] Stopping RolloutWorker_w24... [2024-03-20 22:45:46,237][00487] Stopping RolloutWorker_w13... [2024-03-20 22:45:46,237][00678] Stopping RolloutWorker_w15... [2024-03-20 22:45:46,238][01576] Loop rollout_proc47_evt_loop terminating... [2024-03-20 22:45:46,247][00841] Loop rollout_proc24_evt_loop terminating... [2024-03-20 22:45:46,247][01288] Stopping RolloutWorker_w40... [2024-03-20 22:45:46,247][00487] Loop rollout_proc13_evt_loop terminating... [2024-03-20 22:45:46,247][00678] Loop rollout_proc15_evt_loop terminating... [2024-03-20 22:45:46,247][01288] Loop rollout_proc40_evt_loop terminating... [2024-03-20 22:45:46,237][00809] Stopping RolloutWorker_w25... [2024-03-20 22:45:46,237][00382] Stopping InferenceWorker_p0-w0... [2024-03-20 22:45:46,249][00809] Loop rollout_proc25_evt_loop terminating... [2024-03-20 22:45:46,249][00382] Loop inference_proc0-0_evt_loop terminating... [2024-03-20 22:45:46,237][00386] Stopping RolloutWorker_w4... [2024-03-20 22:45:46,251][00386] Loop rollout_proc4_evt_loop terminating... [2024-03-20 22:45:46,237][00677] Stopping RolloutWorker_w16... [2024-03-20 22:45:46,255][00677] Loop rollout_proc16_evt_loop terminating... [2024-03-20 22:45:46,237][00389] Loop rollout_proc6_evt_loop terminating... [2024-03-20 22:45:46,237][01439] Stopping RolloutWorker_w41... [2024-03-20 22:45:46,259][01439] Loop rollout_proc41_evt_loop terminating... [2024-03-20 22:45:46,237][01512] Stopping RolloutWorker_w44... [2024-03-20 22:45:46,273][01512] Loop rollout_proc44_evt_loop terminating... [2024-03-20 22:45:46,237][00938] Stopping RolloutWorker_w29... 
[2024-03-20 22:45:46,237][00401] Stopping RolloutWorker_w10... [2024-03-20 22:45:46,276][00401] Loop rollout_proc10_evt_loop terminating... [2024-03-20 22:45:46,237][01765] Stopping RolloutWorker_w48... [2024-03-20 22:45:46,277][01765] Loop rollout_proc48_evt_loop terminating... [2024-03-20 22:45:46,237][00970] Stopping RolloutWorker_w27... [2024-03-20 22:45:46,279][00970] Loop rollout_proc27_evt_loop terminating... [2024-03-20 22:45:46,237][01670] Loop rollout_proc45_evt_loop terminating... [2024-03-20 22:45:46,237][00906] Stopping RolloutWorker_w26... [2024-03-20 22:45:46,287][00906] Loop rollout_proc26_evt_loop terminating... [2024-03-20 22:45:46,237][00679] Stopping RolloutWorker_w18... [2024-03-20 22:45:46,291][00679] Loop rollout_proc18_evt_loop terminating... [2024-03-20 22:45:46,239][00383] Loop rollout_proc1_evt_loop terminating... [2024-03-20 22:45:46,275][00938] Loop rollout_proc29_evt_loop terminating... [2024-03-20 22:46:03,207][03784] Saving configuration to /workspace/metta/train_dir/p2.objt_atn.4/config.json... 
[2024-03-20 22:46:03,215][03784] Rollout worker 0 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 1 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 2 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 3 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 4 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 5 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 6 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 7 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 8 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 9 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 10 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 11 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 12 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 13 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 14 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 15 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 16 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 17 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 18 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 19 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 20 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 21 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 22 uses device cpu [2024-03-20 22:46:03,215][03784] Rollout worker 23 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 24 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 25 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 26 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 27 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 28 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 29 uses device cpu 
[2024-03-20 22:46:03,216][03784] Rollout worker 30 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 31 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 32 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 33 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 34 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 35 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 36 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 37 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 38 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 39 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 40 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 41 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 42 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 43 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 44 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 45 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 46 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 47 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 48 uses device cpu [2024-03-20 22:46:03,216][03784] Rollout worker 49 uses device cpu [2024-03-20 22:46:07,579][03784] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:46:07,579][03784] InferenceWorker_p0-w0: min num requests: 16 [2024-03-20 22:46:07,646][03784] Starting all processes... [2024-03-20 22:46:07,646][03784] Starting process learner_proc0 [2024-03-20 22:46:07,735][03784] Starting all processes... 
[2024-03-20 22:46:07,738][03784] Starting process inference_proc0-0 [2024-03-20 22:46:07,738][03784] Starting process rollout_proc0 [2024-03-20 22:46:07,738][03784] Starting process rollout_proc1 [2024-03-20 22:46:07,738][03784] Starting process rollout_proc2 [2024-03-20 22:46:07,738][03784] Starting process rollout_proc3 [2024-03-20 22:46:07,738][03784] Starting process rollout_proc4 [2024-03-20 22:46:07,738][03784] Starting process rollout_proc5 [2024-03-20 22:46:07,739][03784] Starting process rollout_proc6 [2024-03-20 22:46:07,740][03784] Starting process rollout_proc7 [2024-03-20 22:46:07,740][03784] Starting process rollout_proc8 [2024-03-20 22:46:07,741][03784] Starting process rollout_proc9 [2024-03-20 22:46:07,741][03784] Starting process rollout_proc10 [2024-03-20 22:46:07,741][03784] Starting process rollout_proc11 [2024-03-20 22:46:07,742][03784] Starting process rollout_proc12 [2024-03-20 22:46:07,742][03784] Starting process rollout_proc13 [2024-03-20 22:46:07,742][03784] Starting process rollout_proc14 [2024-03-20 22:46:07,743][03784] Starting process rollout_proc15 [2024-03-20 22:46:07,743][03784] Starting process rollout_proc16 [2024-03-20 22:46:07,743][03784] Starting process rollout_proc17 [2024-03-20 22:46:07,743][03784] Starting process rollout_proc18 [2024-03-20 22:46:07,743][03784] Starting process rollout_proc19 [2024-03-20 22:46:07,746][03784] Starting process rollout_proc20 [2024-03-20 22:46:07,746][03784] Starting process rollout_proc21 [2024-03-20 22:46:07,748][03784] Starting process rollout_proc22 [2024-03-20 22:46:07,752][03784] Starting process rollout_proc23 [2024-03-20 22:46:07,753][03784] Starting process rollout_proc24 [2024-03-20 22:46:07,753][03784] Starting process rollout_proc25 [2024-03-20 22:46:07,753][03784] Starting process rollout_proc26 [2024-03-20 22:46:07,756][03784] Starting process rollout_proc27 [2024-03-20 22:46:07,758][03784] Starting process rollout_proc28 [2024-03-20 22:46:07,758][03784] Starting process 
rollout_proc29 [2024-03-20 22:46:07,775][03784] Starting process rollout_proc30 [2024-03-20 22:46:07,787][03784] Starting process rollout_proc31 [2024-03-20 22:46:07,809][03784] Starting process rollout_proc32 [2024-03-20 22:46:07,810][03784] Starting process rollout_proc33 [2024-03-20 22:46:07,812][03784] Starting process rollout_proc34 [2024-03-20 22:46:07,812][03784] Starting process rollout_proc35 [2024-03-20 22:46:07,875][03784] Starting process rollout_proc36 [2024-03-20 22:46:07,875][03784] Starting process rollout_proc37 [2024-03-20 22:46:07,875][03784] Starting process rollout_proc38 [2024-03-20 22:46:07,897][03784] Starting process rollout_proc39 [2024-03-20 22:46:07,897][03784] Starting process rollout_proc40 [2024-03-20 22:46:07,900][03784] Starting process rollout_proc41 [2024-03-20 22:46:07,943][03784] Starting process rollout_proc42 [2024-03-20 22:46:07,965][03784] Starting process rollout_proc43 [2024-03-20 22:46:07,965][03784] Starting process rollout_proc44 [2024-03-20 22:46:07,966][03784] Starting process rollout_proc45 [2024-03-20 22:46:08,010][03784] Starting process rollout_proc46 [2024-03-20 22:46:08,011][03784] Starting process rollout_proc47 [2024-03-20 22:46:08,011][03784] Starting process rollout_proc48 [2024-03-20 22:46:08,011][03784] Starting process rollout_proc49 [2024-03-20 22:46:10,792][04020] Worker 4 uses CPU cores [4] [2024-03-20 22:46:10,854][04475] Worker 25 uses CPU cores [25] [2024-03-20 22:46:10,943][05021] Worker 30 uses CPU cores [30] [2024-03-20 22:46:10,998][04924] Worker 42 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,094][04023] Worker 8 uses CPU cores [8] [2024-03-20 22:46:11,150][04412] Worker 24 uses CPU cores [24] [2024-03-20 22:46:11,170][04380] Worker 23 uses CPU cores [23] [2024-03-20 22:46:11,198][04284] Worker 21 uses CPU cores [21] [2024-03-20 22:46:11,229][04017] Using GPUs [0] for process 0 
(actually maps to GPUs [0]) [2024-03-20 22:46:11,229][04017] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-03-20 22:46:11,242][04017] Num visible devices: 1 [2024-03-20 22:46:11,298][04015] Worker 0 uses CPU cores [0] [2024-03-20 22:46:11,334][04317] Worker 20 uses CPU cores [20] [2024-03-20 22:46:11,338][04604] Worker 32 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,342][04061] Worker 15 uses CPU cores [15] [2024-03-20 22:46:11,391][05064] Worker 47 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,405][04668] Worker 41 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,410][04018] Worker 2 uses CPU cores [2] [2024-03-20 22:46:11,436][04019] Worker 3 uses CPU cores [3] [2024-03-20 22:46:11,501][05367] Worker 49 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,550][04280] Worker 16 uses CPU cores [16] [2024-03-20 22:46:11,562][04027] Worker 11 uses CPU cores [11] [2024-03-20 22:46:11,594][04016] Worker 1 uses CPU cores [1] [2024-03-20 22:46:11,600][04476] Worker 27 uses CPU cores [27] [2024-03-20 22:46:11,626][04282] Worker 18 uses CPU cores [18] [2024-03-20 22:46:11,647][04957] Worker 33 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,651][04734] Worker 39 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,742][04605] Worker 34 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,750][04022] Worker 6 uses CPU cores [6] [2024-03-20 22:46:11,758][04281] Worker 17 uses CPU cores [17] [2024-03-20 22:46:11,771][05180] Worker 48 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,800][04036] Worker 12 uses CPU cores [12] [2024-03-20 22:46:11,832][04959] Worker 44 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,840][04021] Worker 5 uses CPU cores [5] [2024-03-20 22:46:11,847][04602] Worker 28 uses CPU cores [28] [2024-03-20 22:46:11,847][04039] Worker 13 uses CPU cores [13] [2024-03-20 22:46:11,865][04732] Worker 37 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,875][03995] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:46:11,875][03995] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-03-20 22:46:11,883][03995] Num visible devices: 1 [2024-03-20 22:46:11,886][04731] Worker 35 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:11,959][03995] Starting seed is not provided [2024-03-20 22:46:11,960][03995] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:46:11,960][03995] Initializing actor-critic model on device cuda:0 [2024-03-20 22:46:11,960][03995] RunningMeanStd input shape: (20,) [2024-03-20 22:46:11,961][03995] RunningMeanStd input shape: (24, 11, 11) [2024-03-20 22:46:11,961][03995] RunningMeanStd input shape: (1, 11, 11) [2024-03-20 22:46:11,961][03995] RunningMeanStd input shape: (2,) [2024-03-20 22:46:11,961][03995] RunningMeanStd input shape: (1,) 
[2024-03-20 22:46:11,961][03995] RunningMeanStd input shape: (1,) [2024-03-20 22:46:12,030][04026] Worker 10 uses CPU cores [10] [2024-03-20 22:46:12,034][04477] Worker 26 uses CPU cores [26] [2024-03-20 22:46:12,035][04956] Worker 43 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:12,048][04735] Worker 40 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:12,062][04283] Worker 19 uses CPU cores [19] [2024-03-20 22:46:12,074][04025] Worker 7 uses CPU cores [7] [2024-03-20 22:46:12,106][04024] Worker 9 uses CPU cores [9] [2024-03-20 22:46:12,124][04892] Worker 31 uses CPU cores [31] [2024-03-20 22:46:12,125][04181] Worker 14 uses CPU cores [14] [2024-03-20 22:46:12,137][04603] Worker 29 uses CPU cores [29] [2024-03-20 22:46:12,150][04733] Worker 38 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:12,163][04891] Worker 36 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:12,196][05030] Worker 46 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:12,238][04285] Worker 22 uses CPU cores [22] [2024-03-20 22:46:12,262][05029] Worker 45 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-20 22:46:12,302][03995] Created Actor Critic model with architecture: [2024-03-20 22:46:12,302][03995] PredictingActorCritic( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (global_vars): RunningMeanStdInPlace() (griddly_obs): 
RunningMeanStdInPlace() (kinship): RunningMeanStdInPlace() (last_action): RunningMeanStdInPlace() (last_reward): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): GriddlyEncoder( (object_embedding): Sequential( (0): Linear(in_features=52, out_features=64, bias=True) (1): ELU(alpha=1.0) (2): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) (3): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) (4): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) ) (encoder_head): Sequential( (0): Linear(in_features=7767, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (3): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (4): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (5): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): GriddlyDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) [2024-03-20 22:46:12,460][03995] Using optimizer [2024-03-20 22:46:12,671][03995] No checkpoints found [2024-03-20 22:46:12,671][03995] No checkpoints found, init from checkpoint train_dir/p2.objt_atn.3/checkpoint_p0/checkpoint_000046241_1515225088.pth [2024-03-20 22:46:12,694][03995] Could not load action_parameterization.distribution_linear.weight from the checkpoint, replacing with random data [2024-03-20 22:46:12,735][03995] Could not load action_parameterization.distribution_linear.bias from the checkpoint, replacing with random data [2024-03-20 
22:46:12,736][03995] Not restoring optimizer state from the checkpoint [2024-03-20 22:46:12,736][03995] Loaded experiment state at self.train_step=0, self.env_steps=0 [2024-03-20 22:46:12,736][03995] Initialized policy 0 weights for model version 0 [2024-03-20 22:46:12,737][03995] LearnerWorker_p0 finished initialization! [2024-03-20 22:46:12,737][03995] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-20 22:46:12,781][04017] RunningMeanStd input shape: (20,) [2024-03-20 22:46:12,782][04017] RunningMeanStd input shape: (24, 11, 11) [2024-03-20 22:46:12,782][04017] RunningMeanStd input shape: (1, 11, 11) [2024-03-20 22:46:12,782][04017] RunningMeanStd input shape: (2,) [2024-03-20 22:46:12,782][04017] RunningMeanStd input shape: (1,) [2024-03-20 22:46:12,782][04017] RunningMeanStd input shape: (1,) [2024-03-20 22:46:13,056][03784] Inference worker 0-0 is ready! [2024-03-20 22:46:13,056][03784] All inference workers are ready! Signal rollout workers to start! [2024-03-20 22:46:15,521][03784] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:20,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:21,460][04602] Decorrelating experience for 0 frames... [2024-03-20 22:46:21,488][04603] Decorrelating experience for 0 frames... [2024-03-20 22:46:21,563][04476] Decorrelating experience for 0 frames... [2024-03-20 22:46:23,154][04061] Decorrelating experience for 0 frames... [2024-03-20 22:46:24,066][04021] Decorrelating experience for 0 frames... [2024-03-20 22:46:24,483][04892] Decorrelating experience for 0 frames... [2024-03-20 22:46:24,990][04282] Decorrelating experience for 0 frames... [2024-03-20 22:46:25,040][04280] Decorrelating experience for 0 frames... 
[2024-03-20 22:46:25,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:25,781][04026] Decorrelating experience for 0 frames... [2024-03-20 22:46:26,209][05021] Decorrelating experience for 0 frames... [2024-03-20 22:46:26,595][04027] Decorrelating experience for 0 frames... [2024-03-20 22:46:26,904][04020] Decorrelating experience for 0 frames... [2024-03-20 22:46:27,576][03784] Heartbeat connected on Batcher_0 [2024-03-20 22:46:27,577][03784] Heartbeat connected on LearnerWorker_p0 [2024-03-20 22:46:27,617][03784] Heartbeat connected on InferenceWorker_p0-w0 [2024-03-20 22:46:28,374][04181] Decorrelating experience for 0 frames... [2024-03-20 22:46:28,827][04317] Decorrelating experience for 0 frames... [2024-03-20 22:46:29,324][04412] Decorrelating experience for 0 frames... [2024-03-20 22:46:29,409][04023] Decorrelating experience for 0 frames... [2024-03-20 22:46:29,449][04036] Decorrelating experience for 0 frames... [2024-03-20 22:46:29,529][04024] Decorrelating experience for 0 frames... [2024-03-20 22:46:29,848][04039] Decorrelating experience for 0 frames... [2024-03-20 22:46:30,400][04018] Decorrelating experience for 0 frames... [2024-03-20 22:46:30,502][04283] Decorrelating experience for 0 frames... [2024-03-20 22:46:30,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:30,615][04019] Decorrelating experience for 0 frames... [2024-03-20 22:46:31,454][04476] Decorrelating experience for 256 frames... [2024-03-20 22:46:31,472][04603] Decorrelating experience for 256 frames... [2024-03-20 22:46:33,468][04477] Decorrelating experience for 0 frames... [2024-03-20 22:46:33,617][04475] Decorrelating experience for 0 frames... 
[2024-03-20 22:46:35,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:36,059][04022] Decorrelating experience for 0 frames... [2024-03-20 22:46:36,080][04025] Decorrelating experience for 0 frames... [2024-03-20 22:46:36,193][04015] Decorrelating experience for 0 frames... [2024-03-20 22:46:36,244][04285] Decorrelating experience for 0 frames... [2024-03-20 22:46:37,374][04281] Decorrelating experience for 0 frames... [2024-03-20 22:46:37,579][05367] Decorrelating experience for 0 frames... [2024-03-20 22:46:37,838][04284] Decorrelating experience for 0 frames... [2024-03-20 22:46:38,642][05021] Decorrelating experience for 256 frames... [2024-03-20 22:46:38,648][04016] Decorrelating experience for 0 frames... [2024-03-20 22:46:39,002][04734] Decorrelating experience for 0 frames... [2024-03-20 22:46:39,073][04380] Decorrelating experience for 0 frames... [2024-03-20 22:46:39,162][05029] Decorrelating experience for 0 frames... [2024-03-20 22:46:39,300][04280] Decorrelating experience for 256 frames... [2024-03-20 22:46:39,357][04956] Decorrelating experience for 0 frames... [2024-03-20 22:46:39,790][04959] Decorrelating experience for 0 frames... [2024-03-20 22:46:39,827][05030] Decorrelating experience for 0 frames... [2024-03-20 22:46:39,878][04732] Decorrelating experience for 0 frames... [2024-03-20 22:46:40,328][03784] Heartbeat connected on RolloutWorker_w29 [2024-03-20 22:46:40,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:40,792][04924] Decorrelating experience for 0 frames... [2024-03-20 22:46:41,070][04021] Decorrelating experience for 256 frames... [2024-03-20 22:46:41,484][04412] Decorrelating experience for 256 frames... 
[2024-03-20 22:46:41,532][04602] Decorrelating experience for 256 frames... [2024-03-20 22:46:42,167][04317] Decorrelating experience for 256 frames... [2024-03-20 22:46:42,614][05064] Decorrelating experience for 0 frames... [2024-03-20 22:46:42,697][04735] Decorrelating experience for 0 frames... [2024-03-20 22:46:42,934][04024] Decorrelating experience for 256 frames... [2024-03-20 22:46:43,030][04605] Decorrelating experience for 0 frames... [2024-03-20 22:46:43,552][04604] Decorrelating experience for 0 frames... [2024-03-20 22:46:43,814][04020] Decorrelating experience for 256 frames... [2024-03-20 22:46:43,861][04957] Decorrelating experience for 0 frames... [2024-03-20 22:46:43,910][04733] Decorrelating experience for 0 frames... [2024-03-20 22:46:43,999][04018] Decorrelating experience for 256 frames... [2024-03-20 22:46:44,140][04023] Decorrelating experience for 256 frames... [2024-03-20 22:46:44,363][03784] Heartbeat connected on RolloutWorker_w27 [2024-03-20 22:46:44,517][04731] Decorrelating experience for 0 frames... [2024-03-20 22:46:44,652][04283] Decorrelating experience for 256 frames... [2024-03-20 22:46:45,164][04477] Decorrelating experience for 256 frames... [2024-03-20 22:46:45,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 183.3. Samples: 5500. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:45,724][04668] Decorrelating experience for 0 frames... [2024-03-20 22:46:45,868][04891] Decorrelating experience for 0 frames... [2024-03-20 22:46:45,950][04282] Decorrelating experience for 256 frames... [2024-03-20 22:46:46,890][04019] Decorrelating experience for 256 frames... [2024-03-20 22:46:47,188][04892] Decorrelating experience for 256 frames... [2024-03-20 22:46:47,614][05180] Decorrelating experience for 0 frames... [2024-03-20 22:46:48,121][04061] Decorrelating experience for 256 frames... [2024-03-20 22:46:48,929][04026] Decorrelating experience for 256 frames... 
[2024-03-20 22:46:49,217][04015] Decorrelating experience for 256 frames... [2024-03-20 22:46:49,813][04027] Decorrelating experience for 256 frames... [2024-03-20 22:46:50,384][04016] Decorrelating experience for 256 frames... [2024-03-20 22:46:50,442][03784] Heartbeat connected on RolloutWorker_w5 [2024-03-20 22:46:50,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 800.0. Samples: 28000. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:50,636][04022] Decorrelating experience for 256 frames... [2024-03-20 22:46:51,438][04025] Decorrelating experience for 256 frames... [2024-03-20 22:46:51,620][04475] Decorrelating experience for 256 frames... [2024-03-20 22:46:52,825][03784] Heartbeat connected on RolloutWorker_w9 [2024-03-20 22:46:53,091][03784] Heartbeat connected on RolloutWorker_w20 [2024-03-20 22:46:53,258][03784] Heartbeat connected on RolloutWorker_w2 [2024-03-20 22:46:53,516][04181] Decorrelating experience for 256 frames... [2024-03-20 22:46:53,749][03784] Heartbeat connected on RolloutWorker_w8 [2024-03-20 22:46:54,094][03784] Heartbeat connected on RolloutWorker_w4 [2024-03-20 22:46:54,789][03784] Heartbeat connected on RolloutWorker_w24 [2024-03-20 22:46:54,790][03784] Heartbeat connected on RolloutWorker_w16 [2024-03-20 22:46:55,277][04284] Decorrelating experience for 256 frames... [2024-03-20 22:46:55,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 2007.5. Samples: 80300. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:46:55,536][03784] Heartbeat connected on RolloutWorker_w3 [2024-03-20 22:46:56,024][03784] Heartbeat connected on RolloutWorker_w26 [2024-03-20 22:46:56,787][03784] Heartbeat connected on RolloutWorker_w30 [2024-03-20 22:46:58,177][03784] Heartbeat connected on RolloutWorker_w28 [2024-03-20 22:46:58,476][03784] Heartbeat connected on RolloutWorker_w18 [2024-03-20 22:46:59,042][03784] Heartbeat connected on RolloutWorker_w11 [2024-03-20 22:46:59,106][03784] Heartbeat connected on RolloutWorker_w10 [2024-03-20 22:46:59,131][03784] Heartbeat connected on RolloutWorker_w6 [2024-03-20 22:46:59,315][04281] Decorrelating experience for 256 frames... [2024-03-20 22:46:59,912][03784] Heartbeat connected on RolloutWorker_w19 [2024-03-20 22:47:00,334][03784] Heartbeat connected on RolloutWorker_w7 [2024-03-20 22:47:00,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 3262.2. Samples: 146800. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:47:00,652][04285] Decorrelating experience for 256 frames... [2024-03-20 22:47:00,858][04380] Decorrelating experience for 256 frames... [2024-03-20 22:47:00,899][03784] Heartbeat connected on RolloutWorker_w31 [2024-03-20 22:47:01,623][03784] Heartbeat connected on RolloutWorker_w25 [2024-03-20 22:47:02,922][03784] Heartbeat connected on RolloutWorker_w15 [2024-03-20 22:47:03,018][03784] Heartbeat connected on RolloutWorker_w0 [2024-03-20 22:47:04,986][04036] Decorrelating experience for 256 frames... [2024-03-20 22:47:05,272][03784] Heartbeat connected on RolloutWorker_w1 [2024-03-20 22:47:05,274][04039] Decorrelating experience for 256 frames... [2024-03-20 22:47:05,441][04956] Decorrelating experience for 256 frames... [2024-03-20 22:47:05,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 7580.0. Samples: 341100. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:47:05,650][04732] Decorrelating experience for 256 frames... [2024-03-20 22:47:05,798][04735] Decorrelating experience for 256 frames... [2024-03-20 22:47:06,025][03784] Heartbeat connected on RolloutWorker_w21 [2024-03-20 22:47:06,427][04604] Decorrelating experience for 256 frames... [2024-03-20 22:47:06,799][04924] Decorrelating experience for 256 frames... [2024-03-20 22:47:06,982][03784] Heartbeat connected on RolloutWorker_w17 [2024-03-20 22:47:07,406][03784] Heartbeat connected on RolloutWorker_w14 [2024-03-20 22:47:07,620][04603] Worker 29, sleep for 87.000 sec to decorrelate experience collection [2024-03-20 22:47:08,385][05030] Decorrelating experience for 256 frames... [2024-03-20 22:47:08,790][04891] Decorrelating experience for 256 frames... [2024-03-20 22:47:09,226][03784] Heartbeat connected on RolloutWorker_w23 [2024-03-20 22:47:09,419][05367] Decorrelating experience for 256 frames... [2024-03-20 22:47:09,497][03784] Heartbeat connected on RolloutWorker_w22 [2024-03-20 22:47:09,867][04734] Decorrelating experience for 256 frames... [2024-03-20 22:47:10,215][05029] Decorrelating experience for 256 frames... [2024-03-20 22:47:10,521][03784] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 12986.7. Samples: 584400. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-20 22:47:10,702][04476] Worker 27, sleep for 81.000 sec to decorrelate experience collection [2024-03-20 22:47:11,242][04959] Decorrelating experience for 256 frames... [2024-03-20 22:47:12,097][04605] Decorrelating experience for 256 frames... [2024-03-20 22:47:13,483][04731] Decorrelating experience for 256 frames... [2024-03-20 22:47:13,638][05064] Decorrelating experience for 256 frames... [2024-03-20 22:47:14,888][05180] Decorrelating experience for 256 frames... 
[2024-03-20 22:47:15,220][03784] Heartbeat connected on RolloutWorker_w12 [2024-03-20 22:47:15,365][03784] Heartbeat connected on RolloutWorker_w13 [2024-03-20 22:47:15,521][03784] Fps is (10 sec: 3276.8, 60 sec: 546.1, 300 sec: 546.1). Total num frames: 32768. Throughput: 0: 16080.0. Samples: 723600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-20 22:47:16,128][03784] Heartbeat connected on RolloutWorker_w40 [2024-03-20 22:47:16,375][04733] Decorrelating experience for 256 frames... [2024-03-20 22:47:16,892][04668] Decorrelating experience for 256 frames... [2024-03-20 22:47:17,506][04957] Decorrelating experience for 256 frames... [2024-03-20 22:47:20,521][03784] Fps is (10 sec: 3276.8, 60 sec: 546.1, 300 sec: 504.1). Total num frames: 32768. Throughput: 0: 22488.9. Samples: 1012000. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-20 22:47:20,619][03784] Heartbeat connected on RolloutWorker_w43 [2024-03-20 22:47:21,255][03784] Heartbeat connected on RolloutWorker_w37 [2024-03-20 22:47:21,847][04317] Worker 20, sleep for 60.000 sec to decorrelate experience collection [2024-03-20 22:47:22,408][04412] Worker 24, sleep for 72.000 sec to decorrelate experience collection [2024-03-20 22:47:22,441][03784] Heartbeat connected on RolloutWorker_w32 [2024-03-20 22:47:22,535][03995] Signal inference workers to stop experience collection... [2024-03-20 22:47:22,535][03995] Signal inference workers to resume experience collection... 
[2024-03-20 22:47:22,618][04017] InferenceWorker_p0-w0: stopping experience collection [2024-03-20 22:47:22,618][04017] InferenceWorker_p0-w0: resuming experience collection [2024-03-20 22:47:22,890][04280] Worker 16, sleep for 48.000 sec to decorrelate experience collection [2024-03-20 22:47:23,099][04477] Worker 26, sleep for 78.000 sec to decorrelate experience collection [2024-03-20 22:47:23,157][03784] Heartbeat connected on RolloutWorker_w42 [2024-03-20 22:47:23,388][03784] Heartbeat connected on RolloutWorker_w45 [2024-03-20 22:47:23,498][04021] Worker 5, sleep for 15.000 sec to decorrelate experience collection [2024-03-20 22:47:24,560][03784] Heartbeat connected on RolloutWorker_w36 [2024-03-20 22:47:24,597][03784] Heartbeat connected on RolloutWorker_w34 [2024-03-20 22:47:24,897][03784] Heartbeat connected on RolloutWorker_w49 [2024-03-20 22:47:25,241][03784] Heartbeat connected on RolloutWorker_w46 [2024-03-20 22:47:25,521][03784] Fps is (10 sec: 13107.3, 60 sec: 2730.7, 300 sec: 2340.6). Total num frames: 163840. Throughput: 0: 29095.6. Samples: 1309300. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 1.0) [2024-03-20 22:47:25,535][03784] Heartbeat connected on RolloutWorker_w44 [2024-03-20 22:47:25,789][03784] Heartbeat connected on RolloutWorker_w39 [2024-03-20 22:47:26,143][04602] Worker 28, sleep for 84.000 sec to decorrelate experience collection [2024-03-20 22:47:26,310][03784] Heartbeat connected on RolloutWorker_w35 [2024-03-20 22:47:26,887][04024] Worker 9, sleep for 27.000 sec to decorrelate experience collection [2024-03-20 22:47:27,001][04023] Worker 8, sleep for 24.000 sec to decorrelate experience collection [2024-03-20 22:47:27,281][03784] Heartbeat connected on RolloutWorker_w48 [2024-03-20 22:47:27,623][03784] Heartbeat connected on RolloutWorker_w47 [2024-03-20 22:47:28,050][04018] Worker 2, sleep for 6.000 sec to decorrelate experience collection [2024-03-20 22:47:28,144][05021] Worker 30, sleep for 90.000 sec to decorrelate experience collection [2024-03-20 22:47:28,822][04020] Worker 4, sleep for 12.000 sec to decorrelate experience collection [2024-03-20 22:47:28,915][03784] Heartbeat connected on RolloutWorker_w38 [2024-03-20 22:47:29,151][04892] Worker 31, sleep for 93.000 sec to decorrelate experience collection [2024-03-20 22:47:29,162][04019] Worker 3, sleep for 9.000 sec to decorrelate experience collection [2024-03-20 22:47:29,212][03784] Heartbeat connected on RolloutWorker_w33 [2024-03-20 22:47:29,438][04017] Updated weights for policy 0, policy_version 10 (0.0015) [2024-03-20 22:47:29,643][03784] Heartbeat connected on RolloutWorker_w41 [2024-03-20 22:47:29,998][04475] Worker 25, sleep for 75.000 sec to decorrelate experience collection [2024-03-20 22:47:30,406][04283] Worker 19, sleep for 57.000 sec to decorrelate experience collection [2024-03-20 22:47:30,521][03784] Fps is (10 sec: 36045.0, 60 sec: 6553.6, 300 sec: 5242.9). Total num frames: 393216. Throughput: 0: 32864.4. Samples: 1484400. 
Policy #0 lag: (min: 1.0, avg: 6.8, max: 8.0) [2024-03-20 22:47:31,141][04282] Worker 18, sleep for 54.000 sec to decorrelate experience collection [2024-03-20 22:47:31,965][04027] Worker 11, sleep for 33.000 sec to decorrelate experience collection [2024-03-20 22:47:32,740][04061] Worker 15, sleep for 45.000 sec to decorrelate experience collection [2024-03-20 22:47:33,319][04022] Worker 6, sleep for 18.000 sec to decorrelate experience collection [2024-03-20 22:47:33,430][04026] Worker 10, sleep for 30.000 sec to decorrelate experience collection [2024-03-20 22:47:34,078][04018] Worker 2 awakens! [2024-03-20 22:47:34,315][04025] Worker 7, sleep for 21.000 sec to decorrelate experience collection [2024-03-20 22:47:34,374][04284] Worker 21, sleep for 63.000 sec to decorrelate experience collection [2024-03-20 22:47:35,521][03784] Fps is (10 sec: 45874.5, 60 sec: 10376.5, 300 sec: 7782.4). Total num frames: 622592. Throughput: 0: 40571.0. Samples: 1853700. Policy #0 lag: (min: 1.0, avg: 12.6, max: 16.0) [2024-03-20 22:47:36,709][04017] Updated weights for policy 0, policy_version 20 (0.0010) [2024-03-20 22:47:37,236][04016] Worker 1, sleep for 3.000 sec to decorrelate experience collection [2024-03-20 22:47:37,527][04181] Worker 14, sleep for 42.000 sec to decorrelate experience collection [2024-03-20 22:47:37,621][04380] Worker 23, sleep for 69.000 sec to decorrelate experience collection [2024-03-20 22:47:37,803][04281] Worker 17, sleep for 51.000 sec to decorrelate experience collection [2024-03-20 22:47:38,166][04019] Worker 3 awakens! [2024-03-20 22:47:38,534][04021] Worker 5 awakens! [2024-03-20 22:47:39,377][04285] Worker 22, sleep for 66.000 sec to decorrelate experience collection [2024-03-20 22:47:40,250][04016] Worker 1 awakens! [2024-03-20 22:47:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 14199.5, 300 sec: 10023.2). Total num frames: 851968. Throughput: 0: 46671.1. Samples: 2180500. 
Policy #0 lag: (min: 0.0, avg: 23.0, max: 25.0) [2024-03-20 22:47:40,886][04020] Worker 4 awakens! [2024-03-20 22:47:41,812][04039] Worker 13, sleep for 39.000 sec to decorrelate experience collection [2024-03-20 22:47:42,779][04036] Worker 12, sleep for 36.000 sec to decorrelate experience collection [2024-03-20 22:47:44,418][04735] Worker 40, sleep for 120.000 sec to decorrelate experience collection [2024-03-20 22:47:45,521][03784] Fps is (10 sec: 36045.6, 60 sec: 16384.0, 300 sec: 10922.7). Total num frames: 983040. Throughput: 0: 49313.4. Samples: 2365900. Policy #0 lag: (min: 0.0, avg: 17.7, max: 28.0) [2024-03-20 22:47:46,192][04732] Worker 37, sleep for 111.000 sec to decorrelate experience collection [2024-03-20 22:47:46,255][04956] Worker 43, sleep for 129.000 sec to decorrelate experience collection [2024-03-20 22:47:46,503][04017] Updated weights for policy 0, policy_version 31 (0.0024) [2024-03-20 22:47:46,912][04924] Worker 42, sleep for 126.000 sec to decorrelate experience collection [2024-03-20 22:47:47,538][04604] Worker 32, sleep for 96.000 sec to decorrelate experience collection [2024-03-20 22:47:48,052][05029] Worker 45, sleep for 135.000 sec to decorrelate experience collection [2024-03-20 22:47:48,099][05367] Worker 49, sleep for 147.000 sec to decorrelate experience collection [2024-03-20 22:47:48,426][04891] Worker 36, sleep for 108.000 sec to decorrelate experience collection [2024-03-20 22:47:48,792][04605] Worker 34, sleep for 102.000 sec to decorrelate experience collection [2024-03-20 22:47:48,980][04731] Worker 35, sleep for 105.000 sec to decorrelate experience collection [2024-03-20 22:47:49,153][04959] Worker 44, sleep for 132.000 sec to decorrelate experience collection [2024-03-20 22:47:49,246][05030] Worker 46, sleep for 138.000 sec to decorrelate experience collection [2024-03-20 22:47:50,022][04734] Worker 39, sleep for 117.000 sec to decorrelate experience collection [2024-03-20 22:47:50,116][05180] Worker 48, sleep for 
144.000 sec to decorrelate experience collection [2024-03-20 22:47:50,521][03784] Fps is (10 sec: 45875.4, 60 sec: 21845.4, 300 sec: 13797.1). Total num frames: 1310720. Throughput: 0: 52493.4. Samples: 2703300. Policy #0 lag: (min: 0.0, avg: 14.3, max: 30.0) [2024-03-20 22:47:50,554][05064] Worker 47, sleep for 141.000 sec to decorrelate experience collection [2024-03-20 22:47:50,832][04017] Updated weights for policy 0, policy_version 41 (0.0011) [2024-03-20 22:47:50,894][04668] Worker 41, sleep for 123.000 sec to decorrelate experience collection [2024-03-20 22:47:50,949][04733] Worker 38, sleep for 114.000 sec to decorrelate experience collection [2024-03-20 22:47:50,989][04957] Worker 33, sleep for 99.000 sec to decorrelate experience collection [2024-03-20 22:47:51,021][04023] Worker 8 awakens! [2024-03-20 22:47:51,409][04022] Worker 6 awakens! [2024-03-20 22:47:53,987][04024] Worker 9 awakens! [2024-03-20 22:47:55,415][04025] Worker 7 awakens! [2024-03-20 22:47:55,521][03784] Fps is (10 sec: 55706.1, 60 sec: 25668.3, 300 sec: 15401.0). Total num frames: 1540096. Throughput: 0: 51055.7. Samples: 2881900. Policy #0 lag: (min: 0.0, avg: 22.5, max: 45.0) [2024-03-20 22:47:56,428][04017] Updated weights for policy 0, policy_version 51 (0.0007) [2024-03-20 22:48:00,521][03784] Fps is (10 sec: 52429.4, 60 sec: 30583.5, 300 sec: 17476.3). Total num frames: 1835008. Throughput: 0: 49951.4. Samples: 2971400. Policy #0 lag: (min: 0.0, avg: 14.9, max: 28.0) [2024-03-20 22:48:00,522][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000056_1835008.pth... [2024-03-20 22:48:03,526][04026] Worker 10 awakens! [2024-03-20 22:48:05,062][04027] Worker 11 awakens! [2024-03-20 22:48:05,521][03784] Fps is (10 sec: 32768.2, 60 sec: 31129.7, 300 sec: 16979.8). Total num frames: 1867776. Throughput: 0: 48433.6. Samples: 3191500. 
Policy #0 lag: (min: 2.0, avg: 43.5, max: 56.0) [2024-03-20 22:48:09,244][04017] Updated weights for policy 0, policy_version 61 (0.0009) [2024-03-20 22:48:10,521][03784] Fps is (10 sec: 19660.7, 60 sec: 33860.3, 300 sec: 17666.3). Total num frames: 2031616. Throughput: 0: 46757.9. Samples: 3413400. Policy #0 lag: (min: 0.0, avg: 7.7, max: 17.0) [2024-03-20 22:48:10,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:10,530][03995] Saving new best policy, reward=0.000! [2024-03-20 22:48:10,990][04280] Worker 16 awakens! [2024-03-20 22:48:14,472][04017] Updated weights for policy 0, policy_version 71 (0.0007) [2024-03-20 22:48:15,521][03784] Fps is (10 sec: 45874.9, 60 sec: 38229.5, 300 sec: 19387.8). Total num frames: 2326528. Throughput: 0: 44893.5. Samples: 3504600. Policy #0 lag: (min: 2.0, avg: 10.5, max: 18.0) [2024-03-20 22:48:15,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:17,840][04061] Worker 15 awakens! [2024-03-20 22:48:18,884][04036] Worker 12 awakens! [2024-03-20 22:48:19,626][04181] Worker 14 awakens! [2024-03-20 22:48:20,521][03784] Fps is (10 sec: 45874.6, 60 sec: 40960.0, 300 sec: 19923.0). Total num frames: 2490368. Throughput: 0: 41529.0. Samples: 3722500. Policy #0 lag: (min: 47.0, avg: 63.7, max: 75.0) [2024-03-20 22:48:20,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:20,913][04039] Worker 13 awakens! [2024-03-20 22:48:21,946][04317] Worker 20 awakens! [2024-03-20 22:48:22,808][04017] Updated weights for policy 0, policy_version 81 (0.0010) [2024-03-20 22:48:25,159][04282] Worker 18 awakens! [2024-03-20 22:48:25,521][03784] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 21173.2). Total num frames: 2752512. Throughput: 0: 39726.7. Samples: 3968200. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-03-20 22:48:25,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:27,441][04283] Worker 19 awakens! [2024-03-20 22:48:28,910][04281] Worker 17 awakens! 
[2024-03-20 22:48:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 40960.0, 300 sec: 21117.2). Total num frames: 2850816. Throughput: 0: 39286.6. Samples: 4133800. Policy #0 lag: (min: 64.0, avg: 82.7, max: 85.0) [2024-03-20 22:48:30,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:31,802][04476] Worker 27 awakens! [2024-03-20 22:48:33,508][04017] Updated weights for policy 0, policy_version 91 (0.0011) [2024-03-20 22:48:34,508][04412] Worker 24 awakens! [2024-03-20 22:48:34,726][04603] Worker 29 awakens! [2024-03-20 22:48:35,521][03784] Fps is (10 sec: 36045.3, 60 sec: 41506.4, 300 sec: 22235.5). Total num frames: 3112960. Throughput: 0: 38164.6. Samples: 4420700. Policy #0 lag: (min: 0.0, avg: 14.1, max: 30.0) [2024-03-20 22:48:35,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:37,478][04284] Worker 21 awakens! [2024-03-20 22:48:38,559][03995] Signal inference workers to stop experience collection... (50 times) [2024-03-20 22:48:38,635][04017] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-03-20 22:48:38,834][03995] Signal inference workers to resume experience collection... (50 times) [2024-03-20 22:48:38,834][04017] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-03-20 22:48:38,856][04017] Updated weights for policy 0, policy_version 101 (0.0031) [2024-03-20 22:48:40,521][03784] Fps is (10 sec: 65535.9, 60 sec: 44236.7, 300 sec: 24180.5). Total num frames: 3506176. Throughput: 0: 39377.6. Samples: 4653900. Policy #0 lag: (min: 0.0, avg: 78.7, max: 100.0) [2024-03-20 22:48:40,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:41,200][04477] Worker 26 awakens! [2024-03-20 22:48:43,657][04017] Updated weights for policy 0, policy_version 111 (0.0022) [2024-03-20 22:48:45,099][04475] Worker 25 awakens! [2024-03-20 22:48:45,464][04285] Worker 22 awakens! [2024-03-20 22:48:45,521][03784] Fps is (10 sec: 55705.4, 60 sec: 44783.0, 300 sec: 24466.8). Total num frames: 3670016. 
Throughput: 0: 40737.8. Samples: 4804600. Policy #0 lag: (min: 0.0, avg: 87.2, max: 107.0) [2024-03-20 22:48:45,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:46,721][04380] Worker 23 awakens! [2024-03-20 22:48:50,242][04602] Worker 28 awakens! [2024-03-20 22:48:50,255][04017] Updated weights for policy 0, policy_version 121 (0.0014) [2024-03-20 22:48:50,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44236.7, 300 sec: 25580.2). Total num frames: 3964928. Throughput: 0: 42813.1. Samples: 5118100. Policy #0 lag: (min: 0.0, avg: 57.2, max: 116.0) [2024-03-20 22:48:50,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:55,521][03784] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 25804.8). Total num frames: 4128768. Throughput: 0: 45586.6. Samples: 5464800. Policy #0 lag: (min: 0.0, avg: 21.6, max: 39.0) [2024-03-20 22:48:55,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:48:58,243][05021] Worker 30 awakens! [2024-03-20 22:48:59,261][04017] Updated weights for policy 0, policy_version 131 (0.0008) [2024-03-20 22:49:00,523][03784] Fps is (10 sec: 36037.9, 60 sec: 41504.7, 300 sec: 26214.1). Total num frames: 4325376. Throughput: 0: 47740.0. Samples: 5653000. Policy #0 lag: (min: 0.0, avg: 18.6, max: 41.0) [2024-03-20 22:49:00,525][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:02,262][04892] Worker 31 awakens! [2024-03-20 22:49:04,092][04017] Updated weights for policy 0, policy_version 141 (0.0048) [2024-03-20 22:49:05,521][03784] Fps is (10 sec: 55705.7, 60 sec: 46967.3, 300 sec: 27563.7). Total num frames: 4685824. Throughput: 0: 48915.6. Samples: 5923700. Policy #0 lag: (min: 4.0, avg: 81.2, max: 139.0) [2024-03-20 22:49:05,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:09,395][04017] Updated weights for policy 0, policy_version 151 (0.0028) [2024-03-20 22:49:10,521][03784] Fps is (10 sec: 65548.0, 60 sec: 49151.8, 300 sec: 28461.4). Total num frames: 4980736. Throughput: 0: 50688.7. 
Samples: 6249200. Policy #0 lag: (min: 2.0, avg: 24.0, max: 46.0) [2024-03-20 22:49:10,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:15,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48605.7, 300 sec: 29127.1). Total num frames: 5242880. Throughput: 0: 50717.9. Samples: 6416100. Policy #0 lag: (min: 0.0, avg: 25.5, max: 47.0) [2024-03-20 22:49:15,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:16,377][04017] Updated weights for policy 0, policy_version 161 (0.0013) [2024-03-20 22:49:20,521][03784] Fps is (10 sec: 49152.6, 60 sec: 49698.2, 300 sec: 29579.8). Total num frames: 5472256. Throughput: 0: 51437.6. Samples: 6735400. Policy #0 lag: (min: 2.0, avg: 27.1, max: 52.0) [2024-03-20 22:49:20,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:23,290][04017] Updated weights for policy 0, policy_version 171 (0.0010) [2024-03-20 22:49:23,634][04604] Worker 32 awakens! [2024-03-20 22:49:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 30008.6). Total num frames: 5701632. Throughput: 0: 54466.7. Samples: 7104900. Policy #0 lag: (min: 1.0, avg: 42.7, max: 170.0) [2024-03-20 22:49:25,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:28,953][03995] Signal inference workers to stop experience collection... (100 times) [2024-03-20 22:49:29,004][04017] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-03-20 22:49:29,027][03995] Signal inference workers to resume experience collection... (100 times) [2024-03-20 22:49:29,046][04017] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-03-20 22:49:29,713][04017] Updated weights for policy 0, policy_version 181 (0.0009) [2024-03-20 22:49:30,090][04957] Worker 33 awakens! [2024-03-20 22:49:30,521][03784] Fps is (10 sec: 49152.3, 60 sec: 51882.8, 300 sec: 30583.5). Total num frames: 5963776. Throughput: 0: 54968.8. Samples: 7278200. 
Policy #0 lag: (min: 1.0, avg: 22.8, max: 54.0) [2024-03-20 22:49:30,529][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:30,891][04605] Worker 34 awakens! [2024-03-20 22:49:33,729][04017] Updated weights for policy 0, policy_version 191 (0.0021) [2024-03-20 22:49:34,078][04731] Worker 35 awakens! [2024-03-20 22:49:35,521][03784] Fps is (10 sec: 68814.0, 60 sec: 54613.3, 300 sec: 31948.8). Total num frames: 6389760. Throughput: 0: 54029.1. Samples: 7549400. Policy #0 lag: (min: 1.0, avg: 32.2, max: 61.0) [2024-03-20 22:49:35,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:36,526][04891] Worker 36 awakens! [2024-03-20 22:49:37,294][04732] Worker 37 awakens! [2024-03-20 22:49:37,983][04017] Updated weights for policy 0, policy_version 201 (0.0009) [2024-03-20 22:49:40,521][03784] Fps is (10 sec: 72088.9, 60 sec: 52975.0, 300 sec: 32608.2). Total num frames: 6684672. Throughput: 0: 53188.8. Samples: 7858300. Policy #0 lag: (min: 1.0, avg: 64.3, max: 194.0) [2024-03-20 22:49:40,523][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:44,522][04735] Worker 40 awakens! [2024-03-20 22:49:44,973][04733] Worker 38 awakens! [2024-03-20 22:49:45,521][03784] Fps is (10 sec: 49151.3, 60 sec: 53520.9, 300 sec: 32768.0). Total num frames: 6881280. Throughput: 0: 53091.2. Samples: 8042000. Policy #0 lag: (min: 0.0, avg: 32.7, max: 58.0) [2024-03-20 22:49:45,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:45,683][04017] Updated weights for policy 0, policy_version 211 (0.0012) [2024-03-20 22:49:47,122][04734] Worker 39 awakens! [2024-03-20 22:49:49,770][04017] Updated weights for policy 0, policy_version 221 (0.0016) [2024-03-20 22:49:50,521][03784] Fps is (10 sec: 58982.8, 60 sec: 55159.5, 300 sec: 33834.9). Total num frames: 7274496. Throughput: 0: 53231.1. Samples: 8319100. 
Policy #0 lag: (min: 0.0, avg: 100.0, max: 211.0) [2024-03-20 22:49:50,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:49:53,014][04924] Worker 42 awakens! [2024-03-20 22:49:53,998][04668] Worker 41 awakens! [2024-03-20 22:49:55,358][04956] Worker 43 awakens! [2024-03-20 22:49:55,521][03784] Fps is (10 sec: 49152.1, 60 sec: 54067.2, 300 sec: 33512.8). Total num frames: 7372800. Throughput: 0: 54184.6. Samples: 8687500. Policy #0 lag: (min: 3.0, avg: 167.8, max: 221.0) [2024-03-20 22:49:55,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:00,521][03784] Fps is (10 sec: 29490.8, 60 sec: 54068.9, 300 sec: 33641.8). Total num frames: 7569408. Throughput: 0: 54662.1. Samples: 8875900. Policy #0 lag: (min: 0.0, avg: 130.4, max: 225.0) [2024-03-20 22:50:00,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000231_7569408.pth... [2024-03-20 22:50:01,250][04959] Worker 44 awakens! [2024-03-20 22:50:01,819][04017] Updated weights for policy 0, policy_version 232 (0.0019) [2024-03-20 22:50:03,082][05029] Worker 45 awakens! [2024-03-20 22:50:05,521][03784] Fps is (10 sec: 45875.4, 60 sec: 52428.8, 300 sec: 34050.3). Total num frames: 7831552. Throughput: 0: 54015.6. Samples: 9166100. Policy #0 lag: (min: 3.0, avg: 162.8, max: 226.0) [2024-03-20 22:50:05,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:06,966][04017] Updated weights for policy 0, policy_version 242 (0.0015) [2024-03-20 22:50:07,259][05030] Worker 46 awakens! [2024-03-20 22:50:10,521][03784] Fps is (10 sec: 62260.1, 60 sec: 53521.2, 300 sec: 34859.6). Total num frames: 8192000. Throughput: 0: 52997.9. Samples: 9489800. Policy #0 lag: (min: 4.0, avg: 164.8, max: 240.0) [2024-03-20 22:50:10,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:11,406][03995] Signal inference workers to stop experience collection... 
(150 times) [2024-03-20 22:50:11,476][03995] Signal inference workers to resume experience collection... (150 times) [2024-03-20 22:50:11,496][04017] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-03-20 22:50:11,504][04017] Updated weights for policy 0, policy_version 252 (0.0020) [2024-03-20 22:50:11,554][04017] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-03-20 22:50:11,658][05064] Worker 47 awakens! [2024-03-20 22:50:14,214][05180] Worker 48 awakens! [2024-03-20 22:50:15,198][05367] Worker 49 awakens! [2024-03-20 22:50:15,521][03784] Fps is (10 sec: 62258.6, 60 sec: 53521.0, 300 sec: 35225.6). Total num frames: 8454144. Throughput: 0: 52437.7. Samples: 9637900. Policy #0 lag: (min: 0.0, avg: 35.2, max: 70.0) [2024-03-20 22:50:15,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:16,647][04017] Updated weights for policy 0, policy_version 262 (0.0024) [2024-03-20 22:50:20,521][03784] Fps is (10 sec: 65535.7, 60 sec: 56251.7, 300 sec: 36111.7). Total num frames: 8847360. Throughput: 0: 54593.2. Samples: 10006100. Policy #0 lag: (min: 0.0, avg: 40.3, max: 83.0) [2024-03-20 22:50:20,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:21,912][04017] Updated weights for policy 0, policy_version 272 (0.0029) [2024-03-20 22:50:25,521][03784] Fps is (10 sec: 65537.0, 60 sec: 56798.0, 300 sec: 36438.1). Total num frames: 9109504. Throughput: 0: 54382.4. Samples: 10305500. Policy #0 lag: (min: 1.0, avg: 43.3, max: 78.0) [2024-03-20 22:50:25,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:29,351][04017] Updated weights for policy 0, policy_version 282 (0.0033) [2024-03-20 22:50:30,521][03784] Fps is (10 sec: 49152.2, 60 sec: 56251.7, 300 sec: 36623.1). Total num frames: 9338880. Throughput: 0: 54740.0. Samples: 10505300. 
Policy #0 lag: (min: 4.0, avg: 39.4, max: 67.0) [2024-03-20 22:50:30,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:50:30,535][03995] Saving new best policy, reward=0.001! [2024-03-20 22:50:32,995][04017] Updated weights for policy 0, policy_version 292 (0.0053) [2024-03-20 22:50:35,521][03784] Fps is (10 sec: 49152.0, 60 sec: 53521.0, 300 sec: 36927.1). Total num frames: 9601024. Throughput: 0: 56002.3. Samples: 10839200. Policy #0 lag: (min: 2.0, avg: 43.1, max: 81.0) [2024-03-20 22:50:35,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:50:40,521][03784] Fps is (10 sec: 45875.7, 60 sec: 51882.8, 300 sec: 36972.2). Total num frames: 9797632. Throughput: 0: 56235.7. Samples: 11218100. Policy #0 lag: (min: 0.0, avg: 36.6, max: 72.0) [2024-03-20 22:50:40,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:42,579][04017] Updated weights for policy 0, policy_version 302 (0.0012) [2024-03-20 22:50:45,522][03784] Fps is (10 sec: 45872.5, 60 sec: 52974.5, 300 sec: 37258.4). Total num frames: 10059776. Throughput: 0: 55550.7. Samples: 11375700. Policy #0 lag: (min: 0.0, avg: 33.6, max: 74.0) [2024-03-20 22:50:45,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:48,881][04017] Updated weights for policy 0, policy_version 312 (0.0010) [2024-03-20 22:50:50,521][03784] Fps is (10 sec: 52427.9, 60 sec: 50790.3, 300 sec: 37534.3). Total num frames: 10321920. Throughput: 0: 56268.8. Samples: 11698200. Policy #0 lag: (min: 0.0, avg: 39.0, max: 78.0) [2024-03-20 22:50:50,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:54,659][04017] Updated weights for policy 0, policy_version 322 (0.0014) [2024-03-20 22:50:55,521][03784] Fps is (10 sec: 55709.3, 60 sec: 54067.4, 300 sec: 37917.3). Total num frames: 10616832. Throughput: 0: 56549.1. Samples: 12034500. 
Policy #0 lag: (min: 1.0, avg: 32.0, max: 67.0) [2024-03-20 22:50:55,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:50:58,610][04017] Updated weights for policy 0, policy_version 332 (0.0052) [2024-03-20 22:50:59,031][03995] Signal inference workers to stop experience collection... (200 times) [2024-03-20 22:50:59,106][04017] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-03-20 22:50:59,326][03995] Signal inference workers to resume experience collection... (200 times) [2024-03-20 22:50:59,327][04017] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-03-20 22:51:00,521][03784] Fps is (10 sec: 65536.6, 60 sec: 56798.0, 300 sec: 38516.8). Total num frames: 10977280. Throughput: 0: 56451.2. Samples: 12178200. Policy #0 lag: (min: 3.0, avg: 36.2, max: 69.0) [2024-03-20 22:51:00,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:03,387][04017] Updated weights for policy 0, policy_version 342 (0.0008) [2024-03-20 22:51:05,521][03784] Fps is (10 sec: 65534.7, 60 sec: 57343.9, 300 sec: 38869.6). Total num frames: 11272192. Throughput: 0: 55360.0. Samples: 12497300. Policy #0 lag: (min: 0.0, avg: 44.2, max: 84.0) [2024-03-20 22:51:05,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:10,521][03784] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 38988.4). Total num frames: 11501568. Throughput: 0: 56191.0. Samples: 12834100. Policy #0 lag: (min: 0.0, avg: 43.2, max: 78.0) [2024-03-20 22:51:10,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:10,617][04017] Updated weights for policy 0, policy_version 352 (0.0016) [2024-03-20 22:51:15,521][03784] Fps is (10 sec: 49151.6, 60 sec: 55159.4, 300 sec: 39877.0). Total num frames: 11763712. Throughput: 0: 55086.5. Samples: 12984200. 
Policy #0 lag: (min: 0.0, avg: 48.1, max: 90.0) [2024-03-20 22:51:15,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:16,492][04017] Updated weights for policy 0, policy_version 362 (0.0029) [2024-03-20 22:51:20,521][03784] Fps is (10 sec: 49152.9, 60 sec: 52429.0, 300 sec: 40654.6). Total num frames: 11993088. Throughput: 0: 55306.8. Samples: 13328000. Policy #0 lag: (min: 0.0, avg: 40.5, max: 81.0) [2024-03-20 22:51:20,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:25,333][04017] Updated weights for policy 0, policy_version 372 (0.0022) [2024-03-20 22:51:25,521][03784] Fps is (10 sec: 42598.9, 60 sec: 51336.5, 300 sec: 41321.0). Total num frames: 12189696. Throughput: 0: 54759.8. Samples: 13682300. Policy #0 lag: (min: 0.0, avg: 42.1, max: 80.0) [2024-03-20 22:51:25,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:29,560][04017] Updated weights for policy 0, policy_version 382 (0.0021) [2024-03-20 22:51:30,521][03784] Fps is (10 sec: 52427.7, 60 sec: 52974.9, 300 sec: 42431.8). Total num frames: 12517376. Throughput: 0: 54329.5. Samples: 13820500. Policy #0 lag: (min: 1.0, avg: 31.6, max: 68.0) [2024-03-20 22:51:30,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:35,521][03784] Fps is (10 sec: 58983.4, 60 sec: 52975.0, 300 sec: 43320.4). Total num frames: 12779520. Throughput: 0: 54075.8. Samples: 14131600. Policy #0 lag: (min: 0.0, avg: 48.1, max: 94.0) [2024-03-20 22:51:35,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:35,902][04017] Updated weights for policy 0, policy_version 392 (0.0009) [2024-03-20 22:51:40,521][03784] Fps is (10 sec: 62259.0, 60 sec: 55705.5, 300 sec: 44542.3). Total num frames: 13139968. Throughput: 0: 53655.3. Samples: 14449000. 
Policy #0 lag: (min: 1.0, avg: 35.9, max: 71.0) [2024-03-20 22:51:40,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:41,418][04017] Updated weights for policy 0, policy_version 402 (0.0028) [2024-03-20 22:51:45,521][03784] Fps is (10 sec: 55704.9, 60 sec: 54613.8, 300 sec: 45208.7). Total num frames: 13336576. Throughput: 0: 54282.2. Samples: 14620900. Policy #0 lag: (min: 0.0, avg: 40.7, max: 85.0) [2024-03-20 22:51:45,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:46,789][03995] Signal inference workers to stop experience collection... (250 times) [2024-03-20 22:51:46,793][04017] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-03-20 22:51:47,056][03995] Signal inference workers to resume experience collection... (250 times) [2024-03-20 22:51:47,056][04017] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-03-20 22:51:47,414][04017] Updated weights for policy 0, policy_version 412 (0.0022) [2024-03-20 22:51:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 46319.5). Total num frames: 13664256. Throughput: 0: 54084.4. Samples: 14931100. Policy #0 lag: (min: 0.0, avg: 33.4, max: 73.0) [2024-03-20 22:51:50,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:52,584][04017] Updated weights for policy 0, policy_version 422 (0.0020) [2024-03-20 22:51:55,521][03784] Fps is (10 sec: 65536.1, 60 sec: 56251.6, 300 sec: 47430.3). Total num frames: 13991936. Throughput: 0: 53609.0. Samples: 15246500. Policy #0 lag: (min: 0.0, avg: 47.4, max: 87.0) [2024-03-20 22:51:55,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:51:57,174][04017] Updated weights for policy 0, policy_version 432 (0.0010) [2024-03-20 22:52:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 52974.8, 300 sec: 47985.7). Total num frames: 14155776. Throughput: 0: 53815.6. Samples: 15405900. 
Policy #0 lag: (min: 0.0, avg: 47.4, max: 87.0) [2024-03-20 22:52:00,531][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:52:00,544][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000432_14155776.pth... [2024-03-20 22:52:00,676][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000056_1835008.pth [2024-03-20 22:52:05,521][03784] Fps is (10 sec: 36044.2, 60 sec: 51336.5, 300 sec: 48652.1). Total num frames: 14352384. Throughput: 0: 54335.2. Samples: 15773100. Policy #0 lag: (min: 0.0, avg: 40.2, max: 76.0) [2024-03-20 22:52:05,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:08,460][04017] Updated weights for policy 0, policy_version 442 (0.0014) [2024-03-20 22:52:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 51336.5, 300 sec: 49318.6). Total num frames: 14581760. Throughput: 0: 53224.4. Samples: 16077400. Policy #0 lag: (min: 0.0, avg: 32.5, max: 75.0) [2024-03-20 22:52:10,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:13,550][04017] Updated weights for policy 0, policy_version 452 (0.0014) [2024-03-20 22:52:15,521][03784] Fps is (10 sec: 58982.8, 60 sec: 52975.0, 300 sec: 50540.5). Total num frames: 14942208. Throughput: 0: 53586.6. Samples: 16231900. Policy #0 lag: (min: 2.0, avg: 50.9, max: 95.0) [2024-03-20 22:52:15,523][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:20,521][03784] Fps is (10 sec: 52429.3, 60 sec: 51882.6, 300 sec: 50651.6). Total num frames: 15106048. Throughput: 0: 54008.8. Samples: 16562000. Policy #0 lag: (min: 0.0, avg: 42.1, max: 89.0) [2024-03-20 22:52:20,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:20,587][04017] Updated weights for policy 0, policy_version 462 (0.0020) [2024-03-20 22:52:25,521][03784] Fps is (10 sec: 49152.3, 60 sec: 54067.2, 300 sec: 50984.8). Total num frames: 15433728. Throughput: 0: 53491.2. Samples: 16856100. 
Policy #0 lag: (min: 0.0, avg: 37.3, max: 72.0) [2024-03-20 22:52:25,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:26,057][04017] Updated weights for policy 0, policy_version 472 (0.0012) [2024-03-20 22:52:30,176][04017] Updated weights for policy 0, policy_version 482 (0.0023) [2024-03-20 22:52:30,289][03995] Signal inference workers to stop experience collection... (300 times) [2024-03-20 22:52:30,421][04017] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-03-20 22:52:30,527][03784] Fps is (10 sec: 68774.0, 60 sec: 54608.3, 300 sec: 51428.2). Total num frames: 15794176. Throughput: 0: 52833.4. Samples: 16998700. Policy #0 lag: (min: 1.0, avg: 49.8, max: 92.0) [2024-03-20 22:52:30,527][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:30,608][03995] Signal inference workers to resume experience collection... (300 times) [2024-03-20 22:52:30,609][04017] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-03-20 22:52:35,157][04017] Updated weights for policy 0, policy_version 492 (0.0014) [2024-03-20 22:52:35,521][03784] Fps is (10 sec: 68813.1, 60 sec: 55705.5, 300 sec: 51762.3). Total num frames: 16121856. Throughput: 0: 53546.8. Samples: 17340700. Policy #0 lag: (min: 2.0, avg: 42.4, max: 76.0) [2024-03-20 22:52:35,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:40,521][03784] Fps is (10 sec: 55736.6, 60 sec: 53521.1, 300 sec: 52095.6). Total num frames: 16351232. Throughput: 0: 53426.6. Samples: 17650700. Policy #0 lag: (min: 0.0, avg: 44.5, max: 92.0) [2024-03-20 22:52:40,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:42,232][04017] Updated weights for policy 0, policy_version 502 (0.0011) [2024-03-20 22:52:45,521][03784] Fps is (10 sec: 36044.3, 60 sec: 52428.7, 300 sec: 51429.1). Total num frames: 16482304. Throughput: 0: 53115.5. Samples: 17796100. 
Policy #0 lag: (min: 2.0, avg: 44.7, max: 77.0) [2024-03-20 22:52:45,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:50,113][04017] Updated weights for policy 0, policy_version 512 (0.0023) [2024-03-20 22:52:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 52428.7, 300 sec: 51762.3). Total num frames: 16809984. Throughput: 0: 52571.1. Samples: 18138800. Policy #0 lag: (min: 2.0, avg: 48.9, max: 86.0) [2024-03-20 22:52:50,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:55,521][03784] Fps is (10 sec: 58982.7, 60 sec: 51336.5, 300 sec: 51651.2). Total num frames: 17072128. Throughput: 0: 52913.3. Samples: 18458500. Policy #0 lag: (min: 0.0, avg: 48.4, max: 92.0) [2024-03-20 22:52:55,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:52:56,183][04017] Updated weights for policy 0, policy_version 522 (0.0009) [2024-03-20 22:53:00,521][03784] Fps is (10 sec: 52429.3, 60 sec: 52975.0, 300 sec: 52428.8). Total num frames: 17334272. Throughput: 0: 52631.2. Samples: 18600300. Policy #0 lag: (min: 0.0, avg: 40.2, max: 86.0) [2024-03-20 22:53:00,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:53:01,837][04017] Updated weights for policy 0, policy_version 532 (0.0019) [2024-03-20 22:53:05,521][03784] Fps is (10 sec: 62259.6, 60 sec: 55705.7, 300 sec: 53095.3). Total num frames: 17694720. Throughput: 0: 52851.1. Samples: 18940300. Policy #0 lag: (min: 0.0, avg: 42.7, max: 87.0) [2024-03-20 22:53:05,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:53:07,256][04017] Updated weights for policy 0, policy_version 542 (0.0016) [2024-03-20 22:53:10,521][03784] Fps is (10 sec: 58983.0, 60 sec: 55705.7, 300 sec: 52873.1). Total num frames: 17924096. Throughput: 0: 53684.5. Samples: 19271900. 
Policy #0 lag: (min: 1.0, avg: 39.6, max: 78.0) [2024-03-20 22:53:10,521][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:53:11,930][04017] Updated weights for policy 0, policy_version 552 (0.0053) [2024-03-20 22:53:15,521][03784] Fps is (10 sec: 49151.2, 60 sec: 54067.1, 300 sec: 53206.3). Total num frames: 18186240. Throughput: 0: 53568.7. Samples: 19409000. Policy #0 lag: (min: 0.0, avg: 39.5, max: 85.0) [2024-03-20 22:53:15,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:53:15,716][03995] Signal inference workers to stop experience collection... (350 times) [2024-03-20 22:53:15,793][04017] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-03-20 22:53:15,993][03995] Signal inference workers to resume experience collection... (350 times) [2024-03-20 22:53:15,994][04017] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-03-20 22:53:17,506][04017] Updated weights for policy 0, policy_version 562 (0.0009) [2024-03-20 22:53:20,521][03784] Fps is (10 sec: 65535.5, 60 sec: 57890.1, 300 sec: 53650.7). Total num frames: 18579456. Throughput: 0: 53246.6. Samples: 19736800. Policy #0 lag: (min: 0.0, avg: 41.1, max: 87.0) [2024-03-20 22:53:20,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:53:24,752][04017] Updated weights for policy 0, policy_version 572 (0.0009) [2024-03-20 22:53:25,521][03784] Fps is (10 sec: 62260.0, 60 sec: 56251.7, 300 sec: 54095.0). Total num frames: 18808832. Throughput: 0: 54335.6. Samples: 20095800. Policy #0 lag: (min: 1.0, avg: 43.4, max: 85.0) [2024-03-20 22:53:25,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:53:30,521][03784] Fps is (10 sec: 45874.7, 60 sec: 54072.2, 300 sec: 53983.8). Total num frames: 19038208. Throughput: 0: 54968.9. Samples: 20269700. 
Policy #0 lag: (min: 0.0, avg: 40.1, max: 79.0) [2024-03-20 22:53:30,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:53:32,196][04017] Updated weights for policy 0, policy_version 582 (0.0010) [2024-03-20 22:53:35,521][03784] Fps is (10 sec: 39322.0, 60 sec: 51336.6, 300 sec: 53206.4). Total num frames: 19202048. Throughput: 0: 55158.0. Samples: 20620900. Policy #0 lag: (min: 0.0, avg: 30.3, max: 75.0) [2024-03-20 22:53:35,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:53:37,279][04017] Updated weights for policy 0, policy_version 592 (0.0008) [2024-03-20 22:53:40,521][03784] Fps is (10 sec: 55706.1, 60 sec: 54067.2, 300 sec: 53983.9). Total num frames: 19595264. Throughput: 0: 54622.3. Samples: 20916500. Policy #0 lag: (min: 2.0, avg: 35.7, max: 88.0) [2024-03-20 22:53:40,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:53:42,537][04017] Updated weights for policy 0, policy_version 602 (0.0008) [2024-03-20 22:53:45,521][03784] Fps is (10 sec: 65536.3, 60 sec: 56252.0, 300 sec: 53872.9). Total num frames: 19857408. Throughput: 0: 54589.1. Samples: 21056800. Policy #0 lag: (min: 0.0, avg: 42.4, max: 87.0) [2024-03-20 22:53:45,521][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:53:49,045][04017] Updated weights for policy 0, policy_version 612 (0.0010) [2024-03-20 22:53:50,521][03784] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 54206.1). Total num frames: 20119552. Throughput: 0: 55113.4. Samples: 21420400. Policy #0 lag: (min: 0.0, avg: 36.3, max: 79.0) [2024-03-20 22:53:50,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:53:54,757][04017] Updated weights for policy 0, policy_version 622 (0.0015) [2024-03-20 22:53:55,521][03784] Fps is (10 sec: 58981.7, 60 sec: 56251.8, 300 sec: 54650.7). Total num frames: 20447232. Throughput: 0: 55135.5. Samples: 21753000. 
Policy #0 lag: (min: 1.0, avg: 41.5, max: 84.0) [2024-03-20 22:53:55,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:53:58,908][04017] Updated weights for policy 0, policy_version 632 (0.0021) [2024-03-20 22:54:00,521][03784] Fps is (10 sec: 65535.2, 60 sec: 57344.0, 300 sec: 54539.3). Total num frames: 20774912. Throughput: 0: 55173.5. Samples: 21891800. Policy #0 lag: (min: 0.0, avg: 40.5, max: 88.0) [2024-03-20 22:54:00,531][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:54:00,544][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000634_20774912.pth... [2024-03-20 22:54:00,684][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000231_7569408.pth [2024-03-20 22:54:00,694][03995] Saving new best policy, reward=0.003! [2024-03-20 22:54:00,886][03995] Signal inference workers to stop experience collection... (400 times) [2024-03-20 22:54:00,957][04017] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-03-20 22:54:01,152][03995] Signal inference workers to resume experience collection... (400 times) [2024-03-20 22:54:01,152][04017] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-03-20 22:54:05,521][03784] Fps is (10 sec: 52428.5, 60 sec: 54613.3, 300 sec: 54206.1). Total num frames: 20971520. Throughput: 0: 55451.1. Samples: 22232100. Policy #0 lag: (min: 2.0, avg: 48.8, max: 93.0) [2024-03-20 22:54:05,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:54:06,261][04017] Updated weights for policy 0, policy_version 642 (0.0008) [2024-03-20 22:54:10,521][03784] Fps is (10 sec: 49151.6, 60 sec: 55705.4, 300 sec: 54317.1). Total num frames: 21266432. Throughput: 0: 55377.6. Samples: 22587800. 
Policy #0 lag: (min: 1.0, avg: 43.5, max: 84.0) [2024-03-20 22:54:10,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:54:12,478][04017] Updated weights for policy 0, policy_version 652 (0.0013) [2024-03-20 22:54:15,521][03784] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 54428.2). Total num frames: 21528576. Throughput: 0: 55246.8. Samples: 22755800. Policy #0 lag: (min: 0.0, avg: 40.3, max: 85.0) [2024-03-20 22:54:15,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:54:18,242][04017] Updated weights for policy 0, policy_version 662 (0.0013) [2024-03-20 22:54:20,521][03784] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 54539.3). Total num frames: 21790720. Throughput: 0: 55151.0. Samples: 23102700. Policy #0 lag: (min: 1.0, avg: 34.2, max: 66.0) [2024-03-20 22:54:20,522][03784] Avg episode reward: [(0, '0.006')] [2024-03-20 22:54:20,535][03995] Saving new best policy, reward=0.006! [2024-03-20 22:54:25,521][03784] Fps is (10 sec: 42598.6, 60 sec: 52428.8, 300 sec: 54206.0). Total num frames: 21954560. Throughput: 0: 55828.9. Samples: 23428800. Policy #0 lag: (min: 0.0, avg: 40.4, max: 87.0) [2024-03-20 22:54:25,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:54:26,533][04017] Updated weights for policy 0, policy_version 672 (0.0024) [2024-03-20 22:54:30,521][03784] Fps is (10 sec: 52428.4, 60 sec: 54613.3, 300 sec: 53983.9). Total num frames: 22315008. Throughput: 0: 55821.9. Samples: 23568800. Policy #0 lag: (min: 0.0, avg: 36.7, max: 86.0) [2024-03-20 22:54:30,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:54:31,892][04017] Updated weights for policy 0, policy_version 682 (0.0033) [2024-03-20 22:54:35,521][03784] Fps is (10 sec: 65536.1, 60 sec: 56797.8, 300 sec: 53983.9). Total num frames: 22609920. Throughput: 0: 54724.4. Samples: 23883000. 
Policy #0 lag: (min: 0.0, avg: 40.3, max: 103.0) [2024-03-20 22:54:35,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:54:36,149][04017] Updated weights for policy 0, policy_version 692 (0.0010) [2024-03-20 22:54:40,521][03784] Fps is (10 sec: 55705.5, 60 sec: 54613.2, 300 sec: 54206.0). Total num frames: 22872064. Throughput: 0: 54624.3. Samples: 24211100. Policy #0 lag: (min: 2.0, avg: 34.9, max: 65.0) [2024-03-20 22:54:40,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:54:41,662][04017] Updated weights for policy 0, policy_version 702 (0.0014) [2024-03-20 22:54:45,521][03784] Fps is (10 sec: 62258.7, 60 sec: 56251.5, 300 sec: 54095.0). Total num frames: 23232512. Throughput: 0: 54582.2. Samples: 24348000. Policy #0 lag: (min: 0.0, avg: 45.2, max: 92.0) [2024-03-20 22:54:45,522][03784] Avg episode reward: [(0, '0.004')] [2024-03-20 22:54:47,288][04017] Updated weights for policy 0, policy_version 712 (0.0010) [2024-03-20 22:54:47,557][03995] Signal inference workers to stop experience collection... (450 times) [2024-03-20 22:54:47,624][04017] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-03-20 22:54:47,685][03995] Signal inference workers to resume experience collection... (450 times) [2024-03-20 22:54:47,685][04017] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-03-20 22:54:50,521][03784] Fps is (10 sec: 62260.1, 60 sec: 56251.7, 300 sec: 54650.4). Total num frames: 23494656. Throughput: 0: 54049.0. Samples: 24664300. Policy #0 lag: (min: 0.0, avg: 43.6, max: 97.0) [2024-03-20 22:54:50,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:54:55,053][04017] Updated weights for policy 0, policy_version 722 (0.0011) [2024-03-20 22:54:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 54067.1, 300 sec: 54650.4). Total num frames: 23691264. Throughput: 0: 53671.1. Samples: 25003000. 
Policy #0 lag: (min: 0.0, avg: 47.2, max: 98.0) [2024-03-20 22:54:55,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:55:00,521][03784] Fps is (10 sec: 36044.3, 60 sec: 51336.5, 300 sec: 54317.1). Total num frames: 23855104. Throughput: 0: 53079.9. Samples: 25144400. Policy #0 lag: (min: 0.0, avg: 35.9, max: 73.0) [2024-03-20 22:55:00,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:55:05,521][03784] Fps is (10 sec: 26214.5, 60 sec: 49698.1, 300 sec: 53428.5). Total num frames: 23953408. Throughput: 0: 53108.8. Samples: 25492600. Policy #0 lag: (min: 0.0, avg: 35.0, max: 72.0) [2024-03-20 22:55:05,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:55:06,658][04017] Updated weights for policy 0, policy_version 732 (0.0023) [2024-03-20 22:55:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 49152.1, 300 sec: 53428.5). Total num frames: 24215552. Throughput: 0: 51986.6. Samples: 25768200. Policy #0 lag: (min: 0.0, avg: 44.5, max: 92.0) [2024-03-20 22:55:10,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:55:11,725][04017] Updated weights for policy 0, policy_version 742 (0.0020) [2024-03-20 22:55:15,521][03784] Fps is (10 sec: 55705.4, 60 sec: 49698.1, 300 sec: 53095.3). Total num frames: 24510464. Throughput: 0: 52093.3. Samples: 25913000. Policy #0 lag: (min: 0.0, avg: 37.6, max: 78.0) [2024-03-20 22:55:15,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:55:19,370][04017] Updated weights for policy 0, policy_version 752 (0.0023) [2024-03-20 22:55:20,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 52984.2). Total num frames: 24739840. Throughput: 0: 52266.6. Samples: 26235000. 
Policy #0 lag: (min: 0.0, avg: 49.5, max: 94.0) [2024-03-20 22:55:20,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:55:22,428][04017] Updated weights for policy 0, policy_version 762 (0.0031) [2024-03-20 22:55:25,407][04017] Updated weights for policy 0, policy_version 772 (0.0023) [2024-03-20 22:55:25,521][03784] Fps is (10 sec: 78642.9, 60 sec: 55705.5, 300 sec: 54094.9). Total num frames: 25296896. Throughput: 0: 50102.2. Samples: 26465700. Policy #0 lag: (min: 3.0, avg: 38.9, max: 71.0) [2024-03-20 22:55:25,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:55:30,521][03784] Fps is (10 sec: 85196.4, 60 sec: 54613.3, 300 sec: 54206.0). Total num frames: 25591808. Throughput: 0: 50440.0. Samples: 26617800. Policy #0 lag: (min: 0.0, avg: 49.4, max: 98.0) [2024-03-20 22:55:30,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:55:31,236][03995] Signal inference workers to stop experience collection... (500 times) [2024-03-20 22:55:31,237][03995] Signal inference workers to resume experience collection... (500 times) [2024-03-20 22:55:31,278][04017] Updated weights for policy 0, policy_version 782 (0.0029) [2024-03-20 22:55:31,314][04017] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-03-20 22:55:31,315][04017] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-03-20 22:55:35,521][03784] Fps is (10 sec: 65537.7, 60 sec: 55705.7, 300 sec: 54761.4). Total num frames: 25952256. Throughput: 0: 50137.9. Samples: 26920500. Policy #0 lag: (min: 2.0, avg: 42.8, max: 70.0) [2024-03-20 22:55:35,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:55:35,540][04017] Updated weights for policy 0, policy_version 792 (0.0014) [2024-03-20 22:55:40,521][03784] Fps is (10 sec: 39321.9, 60 sec: 51882.7, 300 sec: 53984.0). Total num frames: 25985024. Throughput: 0: 50651.2. Samples: 27282300. 
Policy #0 lag: (min: 0.0, avg: 42.4, max: 88.0) [2024-03-20 22:55:40,522][03784] Avg episode reward: [(0, '0.004')] [2024-03-20 22:55:45,521][03784] Fps is (10 sec: 29490.6, 60 sec: 50244.2, 300 sec: 53983.9). Total num frames: 26247168. Throughput: 0: 51408.9. Samples: 27457800. Policy #0 lag: (min: 0.0, avg: 40.1, max: 84.0) [2024-03-20 22:55:45,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:55:49,036][04017] Updated weights for policy 0, policy_version 802 (0.0009) [2024-03-20 22:55:50,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48605.8, 300 sec: 53539.5). Total num frames: 26411008. Throughput: 0: 51317.8. Samples: 27801900. Policy #0 lag: (min: 0.0, avg: 45.1, max: 86.0) [2024-03-20 22:55:50,531][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:55:55,101][04017] Updated weights for policy 0, policy_version 812 (0.0018) [2024-03-20 22:55:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.9, 300 sec: 52984.2). Total num frames: 26607616. Throughput: 0: 51837.7. Samples: 28100900. Policy #0 lag: (min: 0.0, avg: 38.5, max: 78.0) [2024-03-20 22:55:55,531][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:56:00,521][03784] Fps is (10 sec: 39321.5, 60 sec: 49152.0, 300 sec: 52651.0). Total num frames: 26804224. Throughput: 0: 51940.1. Samples: 28250300. Policy #0 lag: (min: 0.0, avg: 31.8, max: 77.0) [2024-03-20 22:56:00,522][03784] Avg episode reward: [(0, '0.004')] [2024-03-20 22:56:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000818_26804224.pth... [2024-03-20 22:56:00,644][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000432_14155776.pth [2024-03-20 22:56:04,205][04017] Updated weights for policy 0, policy_version 822 (0.0015) [2024-03-20 22:56:05,521][03784] Fps is (10 sec: 42599.1, 60 sec: 51336.6, 300 sec: 52651.0). Total num frames: 27033600. Throughput: 0: 51795.7. Samples: 28565800. 
Policy #0 lag: (min: 0.0, avg: 23.5, max: 62.0) [2024-03-20 22:56:05,521][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:56:08,293][04017] Updated weights for policy 0, policy_version 832 (0.0044) [2024-03-20 22:56:10,521][03784] Fps is (10 sec: 65534.9, 60 sec: 54067.0, 300 sec: 53206.3). Total num frames: 27459584. Throughput: 0: 52271.0. Samples: 28817900. Policy #0 lag: (min: 1.0, avg: 34.0, max: 101.0) [2024-03-20 22:56:10,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:56:13,257][04017] Updated weights for policy 0, policy_version 842 (0.0013) [2024-03-20 22:56:15,521][03784] Fps is (10 sec: 65535.8, 60 sec: 52975.1, 300 sec: 53206.3). Total num frames: 27688960. Throughput: 0: 52549.1. Samples: 28982500. Policy #0 lag: (min: 0.0, avg: 38.3, max: 76.0) [2024-03-20 22:56:15,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:56:18,219][04017] Updated weights for policy 0, policy_version 852 (0.0021) [2024-03-20 22:56:19,429][03995] Signal inference workers to stop experience collection... (550 times) [2024-03-20 22:56:19,502][04017] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-03-20 22:56:19,708][03995] Signal inference workers to resume experience collection... (550 times) [2024-03-20 22:56:19,708][04017] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-03-20 22:56:20,521][03784] Fps is (10 sec: 68814.2, 60 sec: 56797.9, 300 sec: 54095.0). Total num frames: 28147712. Throughput: 0: 52564.3. Samples: 29285900. Policy #0 lag: (min: 2.0, avg: 50.9, max: 102.0) [2024-03-20 22:56:20,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:56:21,211][04017] Updated weights for policy 0, policy_version 862 (0.0009) [2024-03-20 22:56:25,521][03784] Fps is (10 sec: 85195.9, 60 sec: 54067.3, 300 sec: 54317.1). Total num frames: 28540928. Throughput: 0: 51082.2. Samples: 29581000. 
Policy #0 lag: (min: 3.0, avg: 46.9, max: 85.0) [2024-03-20 22:56:25,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:56:27,899][04017] Updated weights for policy 0, policy_version 872 (0.0015) [2024-03-20 22:56:30,521][03784] Fps is (10 sec: 55705.2, 60 sec: 51882.7, 300 sec: 53983.8). Total num frames: 28704768. Throughput: 0: 51331.1. Samples: 29767700. Policy #0 lag: (min: 0.0, avg: 46.3, max: 86.0) [2024-03-20 22:56:30,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:56:35,521][03784] Fps is (10 sec: 32768.2, 60 sec: 48605.8, 300 sec: 53317.4). Total num frames: 28868608. Throughput: 0: 51720.0. Samples: 30129300. Policy #0 lag: (min: 1.0, avg: 40.9, max: 83.0) [2024-03-20 22:56:35,522][03784] Avg episode reward: [(0, '0.000')] [2024-03-20 22:56:35,896][04017] Updated weights for policy 0, policy_version 882 (0.0011) [2024-03-20 22:56:40,521][03784] Fps is (10 sec: 42598.6, 60 sec: 52428.8, 300 sec: 53539.6). Total num frames: 29130752. Throughput: 0: 52380.1. Samples: 30458000. Policy #0 lag: (min: 2.0, avg: 41.1, max: 85.0) [2024-03-20 22:56:40,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:56:44,342][04017] Updated weights for policy 0, policy_version 892 (0.0009) [2024-03-20 22:56:45,521][03784] Fps is (10 sec: 45875.6, 60 sec: 51336.7, 300 sec: 53095.3). Total num frames: 29327360. Throughput: 0: 53382.4. Samples: 30652500. Policy #0 lag: (min: 0.0, avg: 40.4, max: 88.0) [2024-03-20 22:56:45,521][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:56:48,617][04017] Updated weights for policy 0, policy_version 902 (0.0020) [2024-03-20 22:56:50,521][03784] Fps is (10 sec: 49152.4, 60 sec: 53521.1, 300 sec: 52984.2). Total num frames: 29622272. Throughput: 0: 53295.5. Samples: 30964100. 
Policy #0 lag: (min: 0.0, avg: 36.7, max: 84.0) [2024-03-20 22:56:50,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:56:53,757][04017] Updated weights for policy 0, policy_version 912 (0.0013) [2024-03-20 22:56:55,521][03784] Fps is (10 sec: 68812.4, 60 sec: 56798.0, 300 sec: 53761.8). Total num frames: 30015488. Throughput: 0: 54653.6. Samples: 31277300. Policy #0 lag: (min: 3.0, avg: 38.6, max: 92.0) [2024-03-20 22:56:55,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:57:00,521][03784] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 53650.7). Total num frames: 30179328. Throughput: 0: 54644.4. Samples: 31441500. Policy #0 lag: (min: 0.0, avg: 34.9, max: 72.0) [2024-03-20 22:57:00,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:57:01,354][04017] Updated weights for policy 0, policy_version 922 (0.0008) [2024-03-20 22:57:05,521][03784] Fps is (10 sec: 42598.7, 60 sec: 56797.9, 300 sec: 53761.8). Total num frames: 30441472. Throughput: 0: 55560.1. Samples: 31786100. Policy #0 lag: (min: 0.0, avg: 32.3, max: 87.0) [2024-03-20 22:57:05,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:57:06,186][04017] Updated weights for policy 0, policy_version 932 (0.0010) [2024-03-20 22:57:06,212][03995] Signal inference workers to stop experience collection... (600 times) [2024-03-20 22:57:06,283][04017] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-03-20 22:57:06,491][03995] Signal inference workers to resume experience collection... (600 times) [2024-03-20 22:57:06,492][04017] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-03-20 22:57:10,241][04017] Updated weights for policy 0, policy_version 942 (0.0011) [2024-03-20 22:57:10,521][03784] Fps is (10 sec: 68813.4, 60 sec: 56798.1, 300 sec: 53983.9). Total num frames: 30867456. Throughput: 0: 55624.6. Samples: 32084100. 
Policy #0 lag: (min: 1.0, avg: 40.6, max: 82.0) [2024-03-20 22:57:10,521][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:57:15,521][03784] Fps is (10 sec: 72088.8, 60 sec: 57890.1, 300 sec: 54428.2). Total num frames: 31162368. Throughput: 0: 54604.5. Samples: 32224900. Policy #0 lag: (min: 0.0, avg: 46.9, max: 83.0) [2024-03-20 22:57:15,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:57:16,938][04017] Updated weights for policy 0, policy_version 952 (0.0009) [2024-03-20 22:57:20,521][03784] Fps is (10 sec: 49151.6, 60 sec: 53521.1, 300 sec: 53983.9). Total num frames: 31358976. Throughput: 0: 54148.9. Samples: 32566000. Policy #0 lag: (min: 0.0, avg: 47.5, max: 87.0) [2024-03-20 22:57:20,521][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:57:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 48605.9, 300 sec: 53096.3). Total num frames: 31457280. Throughput: 0: 54360.1. Samples: 32904200. Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-20 22:57:25,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:57:25,955][04017] Updated weights for policy 0, policy_version 962 (0.0013) [2024-03-20 22:57:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 51336.6, 300 sec: 53095.3). Total num frames: 31784960. Throughput: 0: 53733.2. Samples: 33070500. Policy #0 lag: (min: 0.0, avg: 37.1, max: 84.0) [2024-03-20 22:57:30,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:57:31,264][04017] Updated weights for policy 0, policy_version 972 (0.0024) [2024-03-20 22:57:35,521][03784] Fps is (10 sec: 62258.8, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 32079872. Throughput: 0: 53439.9. Samples: 33368900. Policy #0 lag: (min: 3.0, avg: 32.7, max: 69.0) [2024-03-20 22:57:35,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:57:36,321][04017] Updated weights for policy 0, policy_version 982 (0.0009) [2024-03-20 22:57:40,521][03784] Fps is (10 sec: 65535.6, 60 sec: 55159.4, 300 sec: 54095.0). 
Total num frames: 32440320. Throughput: 0: 53628.7. Samples: 33690600. Policy #0 lag: (min: 1.0, avg: 38.8, max: 80.0) [2024-03-20 22:57:40,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:57:41,277][04017] Updated weights for policy 0, policy_version 992 (0.0013) [2024-03-20 22:57:45,521][03784] Fps is (10 sec: 65535.4, 60 sec: 56797.7, 300 sec: 53983.9). Total num frames: 32735232. Throughput: 0: 53784.3. Samples: 33861800. Policy #0 lag: (min: 1.0, avg: 37.7, max: 80.0) [2024-03-20 22:57:45,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:57:47,939][04017] Updated weights for policy 0, policy_version 1002 (0.0008) [2024-03-20 22:57:50,521][03784] Fps is (10 sec: 49152.5, 60 sec: 55159.4, 300 sec: 53761.7). Total num frames: 32931840. Throughput: 0: 53806.5. Samples: 34207400. Policy #0 lag: (min: 1.0, avg: 45.5, max: 88.0) [2024-03-20 22:57:50,522][03784] Avg episode reward: [(0, '0.006')] [2024-03-20 22:57:54,596][04017] Updated weights for policy 0, policy_version 1012 (0.0010) [2024-03-20 22:57:55,521][03784] Fps is (10 sec: 42598.9, 60 sec: 52428.8, 300 sec: 53650.7). Total num frames: 33161216. Throughput: 0: 54537.7. Samples: 34538300. Policy #0 lag: (min: 0.0, avg: 39.4, max: 85.0) [2024-03-20 22:57:55,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:57:55,656][03995] Signal inference workers to stop experience collection... (650 times) [2024-03-20 22:57:55,668][03995] Signal inference workers to resume experience collection... (650 times) [2024-03-20 22:57:55,751][04017] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-03-20 22:57:55,752][04017] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-03-20 22:58:00,521][03784] Fps is (10 sec: 52428.0, 60 sec: 54613.2, 300 sec: 53428.5). Total num frames: 33456128. Throughput: 0: 54790.9. Samples: 34690500. 
Policy #0 lag: (min: 0.0, avg: 41.0, max: 84.0) [2024-03-20 22:58:00,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:58:00,558][04017] Updated weights for policy 0, policy_version 1022 (0.0014) [2024-03-20 22:58:00,851][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001023_33521664.pth... [2024-03-20 22:58:00,968][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000634_20774912.pth [2024-03-20 22:58:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 54067.1, 300 sec: 53428.5). Total num frames: 33685504. Throughput: 0: 53662.2. Samples: 34980800. Policy #0 lag: (min: 0.0, avg: 40.8, max: 83.0) [2024-03-20 22:58:05,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 22:58:08,807][04017] Updated weights for policy 0, policy_version 1032 (0.0014) [2024-03-20 22:58:10,521][03784] Fps is (10 sec: 42599.1, 60 sec: 50244.2, 300 sec: 53206.4). Total num frames: 33882112. Throughput: 0: 53275.5. Samples: 35301600. Policy #0 lag: (min: 0.0, avg: 34.3, max: 78.0) [2024-03-20 22:58:10,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:58:15,521][03784] Fps is (10 sec: 42597.9, 60 sec: 49151.9, 300 sec: 52650.9). Total num frames: 34111488. Throughput: 0: 53179.9. Samples: 35463600. Policy #0 lag: (min: 2.0, avg: 35.4, max: 90.0) [2024-03-20 22:58:15,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:58:15,719][04017] Updated weights for policy 0, policy_version 1042 (0.0015) [2024-03-20 22:58:20,521][03784] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 52873.1). Total num frames: 34406400. Throughput: 0: 52924.6. Samples: 35750500. Policy #0 lag: (min: 0.0, avg: 35.1, max: 73.0) [2024-03-20 22:58:20,521][03784] Avg episode reward: [(0, '0.009')] [2024-03-20 22:58:20,866][03995] Saving new best policy, reward=0.009! 
[2024-03-20 22:58:20,868][04017] Updated weights for policy 0, policy_version 1052 (0.0009) [2024-03-20 22:58:25,521][03784] Fps is (10 sec: 62259.8, 60 sec: 54613.3, 300 sec: 53206.4). Total num frames: 34734080. Throughput: 0: 52915.7. Samples: 36071800. Policy #0 lag: (min: 5.0, avg: 40.6, max: 80.0) [2024-03-20 22:58:25,522][03784] Avg episode reward: [(0, '0.011')] [2024-03-20 22:58:25,522][03995] Saving new best policy, reward=0.011! [2024-03-20 22:58:26,656][04017] Updated weights for policy 0, policy_version 1062 (0.0013) [2024-03-20 22:58:30,521][03784] Fps is (10 sec: 62258.9, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 35028992. Throughput: 0: 52915.8. Samples: 36243000. Policy #0 lag: (min: 0.0, avg: 37.9, max: 75.0) [2024-03-20 22:58:30,521][03784] Avg episode reward: [(0, '0.011')] [2024-03-20 22:58:31,150][04017] Updated weights for policy 0, policy_version 1072 (0.0017) [2024-03-20 22:58:35,521][03784] Fps is (10 sec: 62259.0, 60 sec: 54613.3, 300 sec: 53428.5). Total num frames: 35356672. Throughput: 0: 51702.2. Samples: 36534000. Policy #0 lag: (min: 1.0, avg: 43.8, max: 78.0) [2024-03-20 22:58:35,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:58:36,205][04017] Updated weights for policy 0, policy_version 1082 (0.0009) [2024-03-20 22:58:40,521][03784] Fps is (10 sec: 52428.0, 60 sec: 51882.7, 300 sec: 53206.3). Total num frames: 35553280. Throughput: 0: 51333.3. Samples: 36848300. Policy #0 lag: (min: 1.0, avg: 48.0, max: 97.0) [2024-03-20 22:58:40,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:58:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 49698.2, 300 sec: 52873.1). Total num frames: 35717120. Throughput: 0: 51206.7. Samples: 36994800. Policy #0 lag: (min: 0.0, avg: 46.4, max: 87.0) [2024-03-20 22:58:45,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 22:58:47,920][03995] Signal inference workers to stop experience collection... 
(700 times) [2024-03-20 22:58:48,005][04017] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-03-20 22:58:48,164][03995] Signal inference workers to resume experience collection... (700 times) [2024-03-20 22:58:48,164][04017] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-03-20 22:58:48,167][04017] Updated weights for policy 0, policy_version 1092 (0.0015) [2024-03-20 22:58:50,521][03784] Fps is (10 sec: 36045.0, 60 sec: 49698.1, 300 sec: 52428.8). Total num frames: 35913728. Throughput: 0: 51291.1. Samples: 37288900. Policy #0 lag: (min: 0.0, avg: 41.3, max: 88.0) [2024-03-20 22:58:50,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:58:55,521][03784] Fps is (10 sec: 19661.0, 60 sec: 45875.2, 300 sec: 51318.0). Total num frames: 35913728. Throughput: 0: 50964.5. Samples: 37595000. Policy #0 lag: (min: 0.0, avg: 41.3, max: 88.0) [2024-03-20 22:58:55,522][03784] Avg episode reward: [(0, '0.006')] [2024-03-20 22:58:58,551][04017] Updated weights for policy 0, policy_version 1102 (0.0011) [2024-03-20 22:59:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46421.5, 300 sec: 51762.3). Total num frames: 36241408. Throughput: 0: 50282.4. Samples: 37726300. Policy #0 lag: (min: 2.0, avg: 46.3, max: 90.0) [2024-03-20 22:59:00,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 22:59:02,693][04017] Updated weights for policy 0, policy_version 1112 (0.0016) [2024-03-20 22:59:05,521][03784] Fps is (10 sec: 68812.3, 60 sec: 48605.8, 300 sec: 51984.5). Total num frames: 36601856. Throughput: 0: 50688.7. Samples: 38031500. Policy #0 lag: (min: 2.0, avg: 33.4, max: 73.0) [2024-03-20 22:59:05,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:59:06,833][04017] Updated weights for policy 0, policy_version 1122 (0.0016) [2024-03-20 22:59:10,521][03784] Fps is (10 sec: 75366.7, 60 sec: 51882.7, 300 sec: 52428.8). Total num frames: 36995072. Throughput: 0: 50055.6. Samples: 38324300. 
Policy #0 lag: (min: 0.0, avg: 60.4, max: 104.0) [2024-03-20 22:59:10,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 22:59:12,195][04017] Updated weights for policy 0, policy_version 1132 (0.0015) [2024-03-20 22:59:15,432][04017] Updated weights for policy 0, policy_version 1142 (0.0018) [2024-03-20 22:59:15,521][03784] Fps is (10 sec: 81919.9, 60 sec: 55159.5, 300 sec: 52984.2). Total num frames: 37421056. Throughput: 0: 49366.5. Samples: 38464500. Policy #0 lag: (min: 5.0, avg: 45.6, max: 84.0) [2024-03-20 22:59:15,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 22:59:20,521][03784] Fps is (10 sec: 72089.1, 60 sec: 55159.3, 300 sec: 53428.5). Total num frames: 37715968. Throughput: 0: 49500.0. Samples: 38761500. Policy #0 lag: (min: 0.0, avg: 59.1, max: 111.0) [2024-03-20 22:59:20,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 22:59:21,164][04017] Updated weights for policy 0, policy_version 1152 (0.0009) [2024-03-20 22:59:25,521][03784] Fps is (10 sec: 45875.4, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 37879808. Throughput: 0: 50308.9. Samples: 39112200. Policy #0 lag: (min: 0.0, avg: 52.9, max: 104.0) [2024-03-20 22:59:25,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 22:59:28,633][03995] Signal inference workers to stop experience collection... (750 times) [2024-03-20 22:59:28,706][04017] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-03-20 22:59:28,914][03995] Signal inference workers to resume experience collection... (750 times) [2024-03-20 22:59:28,915][04017] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-03-20 22:59:28,916][04017] Updated weights for policy 0, policy_version 1162 (0.0009) [2024-03-20 22:59:30,521][03784] Fps is (10 sec: 49151.2, 60 sec: 52974.7, 300 sec: 52873.1). Total num frames: 38207488. Throughput: 0: 50902.1. Samples: 39285400. 
Policy #0 lag: (min: 1.0, avg: 48.9, max: 88.0) [2024-03-20 22:59:30,523][03784] Avg episode reward: [(0, '0.006')] [2024-03-20 22:59:35,521][03784] Fps is (10 sec: 45875.2, 60 sec: 49698.2, 300 sec: 52428.8). Total num frames: 38338560. Throughput: 0: 51482.2. Samples: 39605600. Policy #0 lag: (min: 0.0, avg: 49.8, max: 89.0) [2024-03-20 22:59:35,522][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 22:59:35,686][03995] Saving new best policy, reward=0.013! [2024-03-20 22:59:36,171][04017] Updated weights for policy 0, policy_version 1172 (0.0010) [2024-03-20 22:59:40,521][03784] Fps is (10 sec: 39322.1, 60 sec: 50790.4, 300 sec: 52095.6). Total num frames: 38600704. Throughput: 0: 52035.5. Samples: 39936600. Policy #0 lag: (min: 0.0, avg: 32.3, max: 74.0) [2024-03-20 22:59:40,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 22:59:44,945][04017] Updated weights for policy 0, policy_version 1182 (0.0014) [2024-03-20 22:59:45,521][03784] Fps is (10 sec: 39321.3, 60 sec: 50244.3, 300 sec: 51651.2). Total num frames: 38731776. Throughput: 0: 52657.7. Samples: 40095900. Policy #0 lag: (min: 0.0, avg: 37.1, max: 90.0) [2024-03-20 22:59:45,522][03784] Avg episode reward: [(0, '0.015')] [2024-03-20 22:59:45,523][03995] Saving new best policy, reward=0.015! [2024-03-20 22:59:50,521][03784] Fps is (10 sec: 36044.0, 60 sec: 50790.2, 300 sec: 51762.3). Total num frames: 38961152. Throughput: 0: 53204.2. Samples: 40425700. Policy #0 lag: (min: 1.0, avg: 32.5, max: 93.0) [2024-03-20 22:59:50,523][03784] Avg episode reward: [(0, '0.012')] [2024-03-20 22:59:51,489][04017] Updated weights for policy 0, policy_version 1192 (0.0027) [2024-03-20 22:59:55,521][03784] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 52206.7). Total num frames: 39256064. Throughput: 0: 52506.6. Samples: 40687100. 
Policy #0 lag: (min: 0.0, avg: 35.0, max: 84.0) [2024-03-20 22:59:55,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 22:59:56,731][04017] Updated weights for policy 0, policy_version 1202 (0.0022) [2024-03-20 23:00:00,521][03784] Fps is (10 sec: 65537.2, 60 sec: 56251.6, 300 sec: 53095.3). Total num frames: 39616512. Throughput: 0: 52904.4. Samples: 40845200. Policy #0 lag: (min: 0.0, avg: 38.9, max: 105.0) [2024-03-20 23:00:00,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 23:00:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001209_39616512.pth... [2024-03-20 23:00:00,680][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000000818_26804224.pth [2024-03-20 23:00:03,043][04017] Updated weights for policy 0, policy_version 1212 (0.0013) [2024-03-20 23:00:05,524][03784] Fps is (10 sec: 52413.3, 60 sec: 52972.3, 300 sec: 52761.5). Total num frames: 39780352. Throughput: 0: 53749.8. Samples: 41180400. Policy #0 lag: (min: 2.0, avg: 36.4, max: 68.0) [2024-03-20 23:00:05,525][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 23:00:08,253][04017] Updated weights for policy 0, policy_version 1222 (0.0030) [2024-03-20 23:00:10,521][03784] Fps is (10 sec: 62259.0, 60 sec: 54067.0, 300 sec: 53317.4). Total num frames: 40239104. Throughput: 0: 52599.9. Samples: 41479200. Policy #0 lag: (min: 4.0, avg: 43.1, max: 78.0) [2024-03-20 23:00:10,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:00:11,310][03995] Signal inference workers to stop experience collection... (800 times) [2024-03-20 23:00:11,381][03995] Signal inference workers to resume experience collection... 
(800 times) [2024-03-20 23:00:11,396][04017] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-03-20 23:00:11,466][04017] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-03-20 23:00:11,759][04017] Updated weights for policy 0, policy_version 1232 (0.0019) [2024-03-20 23:00:15,521][03784] Fps is (10 sec: 81944.6, 60 sec: 52975.0, 300 sec: 53761.7). Total num frames: 40599552. Throughput: 0: 51829.1. Samples: 41617700. Policy #0 lag: (min: 3.0, avg: 47.8, max: 82.0) [2024-03-20 23:00:15,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:00:19,421][04017] Updated weights for policy 0, policy_version 1242 (0.0017) [2024-03-20 23:00:20,521][03784] Fps is (10 sec: 55705.5, 60 sec: 51336.4, 300 sec: 52539.9). Total num frames: 40796160. Throughput: 0: 52579.9. Samples: 41971700. Policy #0 lag: (min: 2.0, avg: 43.0, max: 85.0) [2024-03-20 23:00:20,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:00:25,521][03784] Fps is (10 sec: 39321.1, 60 sec: 51882.6, 300 sec: 52206.6). Total num frames: 40992768. Throughput: 0: 52291.1. Samples: 42289700. Policy #0 lag: (min: 0.0, avg: 46.1, max: 83.0) [2024-03-20 23:00:25,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 23:00:25,798][04017] Updated weights for policy 0, policy_version 1252 (0.0019) [2024-03-20 23:00:30,521][03784] Fps is (10 sec: 49153.0, 60 sec: 51336.8, 300 sec: 51984.5). Total num frames: 41287680. Throughput: 0: 51611.3. Samples: 42418400. Policy #0 lag: (min: 0.0, avg: 40.2, max: 81.0) [2024-03-20 23:00:30,522][03784] Avg episode reward: [(0, '0.009')] [2024-03-20 23:00:33,896][04017] Updated weights for policy 0, policy_version 1262 (0.0017) [2024-03-20 23:00:35,521][03784] Fps is (10 sec: 49152.3, 60 sec: 52428.8, 300 sec: 52539.9). Total num frames: 41484288. Throughput: 0: 52104.7. Samples: 42770400. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 78.0) [2024-03-20 23:00:35,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:00:37,373][04017] Updated weights for policy 0, policy_version 1272 (0.0022) [2024-03-20 23:00:40,521][03784] Fps is (10 sec: 49151.2, 60 sec: 52974.9, 300 sec: 52651.0). Total num frames: 41779200. Throughput: 0: 53351.0. Samples: 43087900. Policy #0 lag: (min: 2.0, avg: 39.5, max: 84.0) [2024-03-20 23:00:40,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 23:00:44,852][04017] Updated weights for policy 0, policy_version 1282 (0.0009) [2024-03-20 23:00:45,521][03784] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 52984.2). Total num frames: 42041344. Throughput: 0: 53431.0. Samples: 43249600. Policy #0 lag: (min: 0.0, avg: 40.0, max: 94.0) [2024-03-20 23:00:45,523][03784] Avg episode reward: [(0, '0.008')] [2024-03-20 23:00:50,521][03784] Fps is (10 sec: 42598.9, 60 sec: 54067.5, 300 sec: 52873.1). Total num frames: 42205184. Throughput: 0: 53636.9. Samples: 43593900. Policy #0 lag: (min: 0.0, avg: 37.6, max: 83.0) [2024-03-20 23:00:50,522][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 23:00:52,561][04017] Updated weights for policy 0, policy_version 1292 (0.0010) [2024-03-20 23:00:55,521][03784] Fps is (10 sec: 42599.5, 60 sec: 53521.2, 300 sec: 53095.3). Total num frames: 42467328. Throughput: 0: 54382.5. Samples: 43926400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-20 23:00:55,522][03784] Avg episode reward: [(0, '0.008')] [2024-03-20 23:00:58,752][04017] Updated weights for policy 0, policy_version 1302 (0.0018) [2024-03-20 23:01:00,449][03995] Signal inference workers to stop experience collection... (850 times) [2024-03-20 23:01:00,521][03784] Fps is (10 sec: 58981.7, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 42795008. Throughput: 0: 54459.9. Samples: 44068400. 
Policy #0 lag: (min: 1.0, avg: 40.3, max: 87.0) [2024-03-20 23:01:00,522][03784] Avg episode reward: [(0, '0.008')] [2024-03-20 23:01:00,523][04017] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-03-20 23:01:00,738][03995] Signal inference workers to resume experience collection... (850 times) [2024-03-20 23:01:00,738][04017] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-03-20 23:01:02,229][04017] Updated weights for policy 0, policy_version 1312 (0.0019) [2024-03-20 23:01:05,521][03784] Fps is (10 sec: 81918.9, 60 sec: 58439.1, 300 sec: 53650.7). Total num frames: 43286528. Throughput: 0: 52591.2. Samples: 44338300. Policy #0 lag: (min: 1.0, avg: 40.3, max: 74.0) [2024-03-20 23:01:05,522][03784] Avg episode reward: [(0, '0.011')] [2024-03-20 23:01:09,605][04017] Updated weights for policy 0, policy_version 1322 (0.0010) [2024-03-20 23:01:10,521][03784] Fps is (10 sec: 55705.8, 60 sec: 51882.7, 300 sec: 53095.3). Total num frames: 43352064. Throughput: 0: 52777.8. Samples: 44664700. Policy #0 lag: (min: 0.0, avg: 43.6, max: 87.0) [2024-03-20 23:01:10,522][03784] Avg episode reward: [(0, '0.008')] [2024-03-20 23:01:14,921][04017] Updated weights for policy 0, policy_version 1332 (0.0023) [2024-03-20 23:01:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 51882.6, 300 sec: 52762.0). Total num frames: 43712512. Throughput: 0: 53617.6. Samples: 44831200. Policy #0 lag: (min: 1.0, avg: 43.9, max: 87.0) [2024-03-20 23:01:15,522][03784] Avg episode reward: [(0, '0.008')] [2024-03-20 23:01:20,332][04017] Updated weights for policy 0, policy_version 1342 (0.0021) [2024-03-20 23:01:20,521][03784] Fps is (10 sec: 62259.3, 60 sec: 52975.0, 300 sec: 52317.7). Total num frames: 43974656. Throughput: 0: 52475.6. Samples: 45131800. 
Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-20 23:01:20,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 23:01:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 52975.0, 300 sec: 52428.8). Total num frames: 44171264. Throughput: 0: 52904.5. Samples: 45468600. Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-20 23:01:25,531][03784] Avg episode reward: [(0, '0.016')] [2024-03-20 23:01:27,063][04017] Updated weights for policy 0, policy_version 1352 (0.0022) [2024-03-20 23:01:30,521][03784] Fps is (10 sec: 45874.8, 60 sec: 52428.6, 300 sec: 52762.0). Total num frames: 44433408. Throughput: 0: 52957.8. Samples: 45632700. Policy #0 lag: (min: 1.0, avg: 41.5, max: 82.0) [2024-03-20 23:01:30,531][03784] Avg episode reward: [(0, '0.021')] [2024-03-20 23:01:30,544][03995] Saving new best policy, reward=0.021! [2024-03-20 23:01:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 51882.7, 300 sec: 52428.8). Total num frames: 44597248. Throughput: 0: 53366.6. Samples: 45995400. Policy #0 lag: (min: 0.0, avg: 38.7, max: 82.0) [2024-03-20 23:01:35,531][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 23:01:35,612][04017] Updated weights for policy 0, policy_version 1362 (0.0011) [2024-03-20 23:01:40,521][03784] Fps is (10 sec: 49152.8, 60 sec: 52428.9, 300 sec: 52873.1). Total num frames: 44924928. Throughput: 0: 52815.5. Samples: 46303100. Policy #0 lag: (min: 0.0, avg: 41.5, max: 86.0) [2024-03-20 23:01:40,522][03784] Avg episode reward: [(0, '0.009')] [2024-03-20 23:01:42,543][04017] Updated weights for policy 0, policy_version 1372 (0.0019) [2024-03-20 23:01:45,521][03784] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 52428.8). Total num frames: 45088768. Throughput: 0: 53229.0. Samples: 46463700. 
Policy #0 lag: (min: 0.0, avg: 36.9, max: 82.0) [2024-03-20 23:01:45,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 23:01:48,200][04017] Updated weights for policy 0, policy_version 1382 (0.0011) [2024-03-20 23:01:49,825][03995] Signal inference workers to stop experience collection... (900 times) [2024-03-20 23:01:49,882][04017] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-03-20 23:01:50,127][03995] Signal inference workers to resume experience collection... (900 times) [2024-03-20 23:01:50,128][04017] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-03-20 23:01:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 54067.2, 300 sec: 52317.7). Total num frames: 45449216. Throughput: 0: 53851.2. Samples: 46761600. Policy #0 lag: (min: 1.0, avg: 38.1, max: 87.0) [2024-03-20 23:01:50,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:01:52,108][04017] Updated weights for policy 0, policy_version 1392 (0.0015) [2024-03-20 23:01:55,521][03784] Fps is (10 sec: 75366.5, 60 sec: 56251.7, 300 sec: 53095.3). Total num frames: 45842432. Throughput: 0: 52271.2. Samples: 47016900. Policy #0 lag: (min: 3.0, avg: 44.6, max: 88.0) [2024-03-20 23:01:55,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 23:01:57,715][04017] Updated weights for policy 0, policy_version 1402 (0.0021) [2024-03-20 23:02:00,521][03784] Fps is (10 sec: 62259.0, 60 sec: 54613.4, 300 sec: 52984.2). Total num frames: 46071808. Throughput: 0: 52551.2. Samples: 47196000. Policy #0 lag: (min: 0.0, avg: 35.7, max: 71.0) [2024-03-20 23:02:00,530][03784] Avg episode reward: [(0, '0.015')] [2024-03-20 23:02:00,545][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001406_46071808.pth... [2024-03-20 23:02:00,669][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001023_33521664.pth [2024-03-20 23:02:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.9, 300 sec: 51984.5). 
Total num frames: 46202880. Throughput: 0: 53437.9. Samples: 47536500. Policy #0 lag: (min: 0.0, avg: 47.1, max: 102.0) [2024-03-20 23:02:05,530][03784] Avg episode reward: [(0, '0.029')] [2024-03-20 23:02:05,957][03995] Saving new best policy, reward=0.029! [2024-03-20 23:02:05,960][04017] Updated weights for policy 0, policy_version 1412 (0.0010) [2024-03-20 23:02:10,521][03784] Fps is (10 sec: 42598.0, 60 sec: 52428.8, 300 sec: 51984.5). Total num frames: 46497792. Throughput: 0: 53197.7. Samples: 47862500. Policy #0 lag: (min: 0.0, avg: 36.2, max: 75.0) [2024-03-20 23:02:10,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:02:11,358][04017] Updated weights for policy 0, policy_version 1422 (0.0011) [2024-03-20 23:02:14,407][04017] Updated weights for policy 0, policy_version 1432 (0.0014) [2024-03-20 23:02:15,521][03784] Fps is (10 sec: 72089.2, 60 sec: 53521.1, 300 sec: 52762.0). Total num frames: 46923776. Throughput: 0: 52146.8. Samples: 47979300. Policy #0 lag: (min: 4.0, avg: 47.1, max: 96.0) [2024-03-20 23:02:15,522][03784] Avg episode reward: [(0, '0.020')] [2024-03-20 23:02:20,521][03784] Fps is (10 sec: 58982.6, 60 sec: 51882.7, 300 sec: 52984.2). Total num frames: 47087616. Throughput: 0: 51557.8. Samples: 48315500. Policy #0 lag: (min: 0.0, avg: 51.5, max: 98.0) [2024-03-20 23:02:20,522][03784] Avg episode reward: [(0, '0.020')] [2024-03-20 23:02:25,521][03784] Fps is (10 sec: 26214.6, 60 sec: 50244.4, 300 sec: 52206.7). Total num frames: 47185920. Throughput: 0: 52533.4. Samples: 48667100. Policy #0 lag: (min: 0.0, avg: 34.6, max: 71.0) [2024-03-20 23:02:25,521][03784] Avg episode reward: [(0, '0.023')] [2024-03-20 23:02:25,992][04017] Updated weights for policy 0, policy_version 1442 (0.0011) [2024-03-20 23:02:30,521][03784] Fps is (10 sec: 45875.8, 60 sec: 51882.8, 300 sec: 52428.8). Total num frames: 47546368. Throughput: 0: 51084.5. Samples: 48762500. 
Policy #0 lag: (min: 4.0, avg: 43.3, max: 87.0) [2024-03-20 23:02:30,522][03784] Avg episode reward: [(0, '0.001')] [2024-03-20 23:02:31,270][04017] Updated weights for policy 0, policy_version 1452 (0.0016) [2024-03-20 23:02:33,196][03995] Signal inference workers to stop experience collection... (950 times) [2024-03-20 23:02:33,281][03995] Signal inference workers to resume experience collection... (950 times) [2024-03-20 23:02:33,316][04017] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-03-20 23:02:33,373][04017] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-03-20 23:02:35,521][03784] Fps is (10 sec: 62258.8, 60 sec: 53521.1, 300 sec: 52095.6). Total num frames: 47808512. Throughput: 0: 51282.2. Samples: 49069300. Policy #0 lag: (min: 0.0, avg: 49.4, max: 101.0) [2024-03-20 23:02:35,531][03784] Avg episode reward: [(0, '0.016')] [2024-03-20 23:02:37,786][04017] Updated weights for policy 0, policy_version 1462 (0.0009) [2024-03-20 23:02:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 51336.5, 300 sec: 51762.4). Total num frames: 48005120. Throughput: 0: 52828.9. Samples: 49394200. Policy #0 lag: (min: 0.0, avg: 38.0, max: 83.0) [2024-03-20 23:02:40,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:02:45,521][03784] Fps is (10 sec: 26214.4, 60 sec: 49698.1, 300 sec: 51318.0). Total num frames: 48070656. Throughput: 0: 52837.8. Samples: 49573700. Policy #0 lag: (min: 0.0, avg: 31.6, max: 65.0) [2024-03-20 23:02:45,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:02:47,717][04017] Updated weights for policy 0, policy_version 1472 (0.0038) [2024-03-20 23:02:50,521][03784] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 51984.5). Total num frames: 48496640. Throughput: 0: 51235.5. Samples: 49842100. 
Policy #0 lag: (min: 5.0, avg: 43.5, max: 93.0) [2024-03-20 23:02:50,522][03784] Avg episode reward: [(0, '0.017')] [2024-03-20 23:02:50,898][04017] Updated weights for policy 0, policy_version 1482 (0.0022) [2024-03-20 23:02:54,146][04017] Updated weights for policy 0, policy_version 1492 (0.0014) [2024-03-20 23:02:55,521][03784] Fps is (10 sec: 81919.5, 60 sec: 50790.3, 300 sec: 52317.7). Total num frames: 48889856. Throughput: 0: 49737.8. Samples: 50100700. Policy #0 lag: (min: 3.0, avg: 51.9, max: 103.0) [2024-03-20 23:02:55,531][03784] Avg episode reward: [(0, '0.014')] [2024-03-20 23:03:00,521][03784] Fps is (10 sec: 62258.8, 60 sec: 50790.3, 300 sec: 52317.7). Total num frames: 49119232. Throughput: 0: 50755.5. Samples: 50263300. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-20 23:03:00,522][03784] Avg episode reward: [(0, '0.021')] [2024-03-20 23:03:03,502][04017] Updated weights for policy 0, policy_version 1502 (0.0021) [2024-03-20 23:03:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 51336.4, 300 sec: 52206.6). Total num frames: 49283072. Throughput: 0: 50457.7. Samples: 50586100. Policy #0 lag: (min: 1.0, avg: 40.0, max: 82.0) [2024-03-20 23:03:05,522][03784] Avg episode reward: [(0, '0.010')] [2024-03-20 23:03:09,318][04017] Updated weights for policy 0, policy_version 1512 (0.0022) [2024-03-20 23:03:10,521][03784] Fps is (10 sec: 55705.4, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 49676288. Throughput: 0: 49457.6. Samples: 50892700. Policy #0 lag: (min: 3.0, avg: 48.0, max: 95.0) [2024-03-20 23:03:10,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 23:03:11,539][03995] Signal inference workers to stop experience collection... (1000 times) [2024-03-20 23:03:11,609][04017] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-03-20 23:03:11,808][03995] Signal inference workers to resume experience collection... 
(1000 times) [2024-03-20 23:03:11,809][04017] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-03-20 23:03:13,276][04017] Updated weights for policy 0, policy_version 1522 (0.0020) [2024-03-20 23:03:15,521][03784] Fps is (10 sec: 68813.2, 60 sec: 50790.4, 300 sec: 52762.0). Total num frames: 49971200. Throughput: 0: 50255.4. Samples: 51024000. Policy #0 lag: (min: 0.0, avg: 40.7, max: 81.0) [2024-03-20 23:03:15,522][03784] Avg episode reward: [(0, '0.029')] [2024-03-20 23:03:20,521][03784] Fps is (10 sec: 49151.6, 60 sec: 51336.4, 300 sec: 52317.7). Total num frames: 50167808. Throughput: 0: 50415.3. Samples: 51338000. Policy #0 lag: (min: 1.0, avg: 44.3, max: 89.0) [2024-03-20 23:03:20,522][03784] Avg episode reward: [(0, '0.008')] [2024-03-20 23:03:20,553][04017] Updated weights for policy 0, policy_version 1532 (0.0009) [2024-03-20 23:03:25,521][03784] Fps is (10 sec: 32767.9, 60 sec: 51882.6, 300 sec: 51762.3). Total num frames: 50298880. Throughput: 0: 51282.1. Samples: 51701900. Policy #0 lag: (min: 1.0, avg: 35.0, max: 73.0) [2024-03-20 23:03:25,522][03784] Avg episode reward: [(0, '0.003')] [2024-03-20 23:03:29,260][04017] Updated weights for policy 0, policy_version 1542 (0.0020) [2024-03-20 23:03:30,521][03784] Fps is (10 sec: 45876.0, 60 sec: 51336.4, 300 sec: 51762.3). Total num frames: 50626560. Throughput: 0: 50584.4. Samples: 51850000. Policy #0 lag: (min: 2.0, avg: 46.8, max: 93.0) [2024-03-20 23:03:30,522][03784] Avg episode reward: [(0, '0.021')] [2024-03-20 23:03:34,238][04017] Updated weights for policy 0, policy_version 1552 (0.0012) [2024-03-20 23:03:35,521][03784] Fps is (10 sec: 58982.5, 60 sec: 51336.5, 300 sec: 51984.5). Total num frames: 50888704. Throughput: 0: 49928.9. Samples: 52088900. Policy #0 lag: (min: 0.0, avg: 43.8, max: 94.0) [2024-03-20 23:03:35,522][03784] Avg episode reward: [(0, '0.026')] [2024-03-20 23:03:40,521][03784] Fps is (10 sec: 39321.9, 60 sec: 50244.3, 300 sec: 51873.4). 
Total num frames: 51019776. Throughput: 0: 51666.8. Samples: 52425700. Policy #0 lag: (min: 0.0, avg: 33.6, max: 77.0) [2024-03-20 23:03:40,522][03784] Avg episode reward: [(0, '0.002')] [2024-03-20 23:03:43,880][04017] Updated weights for policy 0, policy_version 1562 (0.0019) [2024-03-20 23:03:45,521][03784] Fps is (10 sec: 29491.5, 60 sec: 51882.7, 300 sec: 51762.3). Total num frames: 51183616. Throughput: 0: 51784.6. Samples: 52593600. Policy #0 lag: (min: 0.0, avg: 31.6, max: 64.0) [2024-03-20 23:03:45,522][03784] Avg episode reward: [(0, '0.017')] [2024-03-20 23:03:50,521][03784] Fps is (10 sec: 32767.7, 60 sec: 47513.6, 300 sec: 52317.7). Total num frames: 51347456. Throughput: 0: 52004.5. Samples: 52926300. Policy #0 lag: (min: 0.0, avg: 26.5, max: 65.0) [2024-03-20 23:03:50,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 23:03:52,562][04017] Updated weights for policy 0, policy_version 1572 (0.0009) [2024-03-20 23:03:55,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48059.8, 300 sec: 52651.0). Total num frames: 51773440. Throughput: 0: 51360.1. Samples: 53203900. Policy #0 lag: (min: 3.0, avg: 45.3, max: 99.0) [2024-03-20 23:03:55,522][03784] Avg episode reward: [(0, '0.025')] [2024-03-20 23:03:56,085][04017] Updated weights for policy 0, policy_version 1582 (0.0019) [2024-03-20 23:04:00,422][03995] Signal inference workers to stop experience collection... (1050 times) [2024-03-20 23:04:00,489][04017] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-03-20 23:04:00,521][03784] Fps is (10 sec: 78641.8, 60 sec: 50244.2, 300 sec: 52650.9). Total num frames: 52133888. Throughput: 0: 51822.0. Samples: 53356000. Policy #0 lag: (min: 0.0, avg: 36.8, max: 85.0) [2024-03-20 23:04:00,522][03784] Avg episode reward: [(0, '0.006')] [2024-03-20 23:04:00,706][03995] Signal inference workers to resume experience collection... 
(1050 times) [2024-03-20 23:04:00,706][04017] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-03-20 23:04:00,708][04017] Updated weights for policy 0, policy_version 1592 (0.0009) [2024-03-20 23:04:01,007][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001593_52199424.pth... [2024-03-20 23:04:01,062][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001209_39616512.pth [2024-03-20 23:04:04,096][04017] Updated weights for policy 0, policy_version 1602 (0.0019) [2024-03-20 23:04:05,521][03784] Fps is (10 sec: 85196.5, 60 sec: 55705.7, 300 sec: 52984.2). Total num frames: 52625408. Throughput: 0: 50658.0. Samples: 53617600. Policy #0 lag: (min: 3.0, avg: 47.3, max: 91.0) [2024-03-20 23:04:05,522][03784] Avg episode reward: [(0, '0.034')] [2024-03-20 23:04:05,910][03995] Saving new best policy, reward=0.034! [2024-03-20 23:04:08,414][04017] Updated weights for policy 0, policy_version 1612 (0.0011) [2024-03-20 23:04:10,521][03784] Fps is (10 sec: 68813.4, 60 sec: 52428.8, 300 sec: 52206.6). Total num frames: 52822016. Throughput: 0: 49571.0. Samples: 53932600. Policy #0 lag: (min: 3.0, avg: 45.7, max: 76.0) [2024-03-20 23:04:10,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 23:04:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 50790.4, 300 sec: 51873.4). Total num frames: 53018624. Throughput: 0: 50226.7. Samples: 54110200. Policy #0 lag: (min: 0.0, avg: 35.0, max: 82.0) [2024-03-20 23:04:15,522][03784] Avg episode reward: [(0, '0.015')] [2024-03-20 23:04:17,375][04017] Updated weights for policy 0, policy_version 1622 (0.0014) [2024-03-20 23:04:20,521][03784] Fps is (10 sec: 52429.3, 60 sec: 52975.1, 300 sec: 52428.8). Total num frames: 53346304. Throughput: 0: 51557.8. Samples: 54409000. 
Policy #0 lag: (min: 2.0, avg: 42.4, max: 83.0) [2024-03-20 23:04:20,522][03784] Avg episode reward: [(0, '0.006')] [2024-03-20 23:04:23,635][04017] Updated weights for policy 0, policy_version 1632 (0.0035) [2024-03-20 23:04:25,521][03784] Fps is (10 sec: 52429.0, 60 sec: 54067.3, 300 sec: 51984.5). Total num frames: 53542912. Throughput: 0: 51435.5. Samples: 54740300. Policy #0 lag: (min: 0.0, avg: 47.1, max: 90.0) [2024-03-20 23:04:25,522][03784] Avg episode reward: [(0, '0.005')] [2024-03-20 23:04:30,521][03784] Fps is (10 sec: 36044.9, 60 sec: 51336.6, 300 sec: 52095.6). Total num frames: 53706752. Throughput: 0: 51768.8. Samples: 54923200. Policy #0 lag: (min: 0.0, avg: 41.5, max: 82.0) [2024-03-20 23:04:30,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 23:04:33,113][04017] Updated weights for policy 0, policy_version 1642 (0.0013) [2024-03-20 23:04:35,521][03784] Fps is (10 sec: 29491.4, 60 sec: 49152.1, 300 sec: 51651.3). Total num frames: 53837824. Throughput: 0: 51922.4. Samples: 55262800. Policy #0 lag: (min: 0.0, avg: 33.6, max: 81.0) [2024-03-20 23:04:35,522][03784] Avg episode reward: [(0, '0.012')] [2024-03-20 23:04:40,521][03784] Fps is (10 sec: 19660.8, 60 sec: 48059.7, 300 sec: 51429.1). Total num frames: 53903360. Throughput: 0: 53460.0. Samples: 55609600. Policy #0 lag: (min: 1.0, avg: 34.9, max: 82.0) [2024-03-20 23:04:40,522][03784] Avg episode reward: [(0, '0.006')] [2024-03-20 23:04:44,401][04017] Updated weights for policy 0, policy_version 1652 (0.0025) [2024-03-20 23:04:45,521][03784] Fps is (10 sec: 39321.4, 60 sec: 50790.4, 300 sec: 51762.4). Total num frames: 54231040. Throughput: 0: 53758.1. Samples: 55775100. Policy #0 lag: (min: 2.0, avg: 25.3, max: 68.0) [2024-03-20 23:04:45,522][03784] Avg episode reward: [(0, '0.019')] [2024-03-20 23:04:46,420][03995] Signal inference workers to stop experience collection... (1100 times) [2024-03-20 23:04:46,479][03995] Signal inference workers to resume experience collection... 
(1100 times) [2024-03-20 23:04:46,546][04017] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-03-20 23:04:46,546][04017] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-03-20 23:04:47,498][04017] Updated weights for policy 0, policy_version 1662 (0.0014) [2024-03-20 23:04:50,521][03784] Fps is (10 sec: 85197.1, 60 sec: 56797.9, 300 sec: 52539.9). Total num frames: 54755328. Throughput: 0: 53537.8. Samples: 56026800. Policy #0 lag: (min: 4.0, avg: 38.9, max: 89.0) [2024-03-20 23:04:50,522][03784] Avg episode reward: [(0, '0.025')] [2024-03-20 23:04:52,091][04017] Updated weights for policy 0, policy_version 1672 (0.0019) [2024-03-20 23:04:55,467][04017] Updated weights for policy 0, policy_version 1682 (0.0018) [2024-03-20 23:04:55,521][03784] Fps is (10 sec: 88473.3, 60 sec: 55705.6, 300 sec: 52539.9). Total num frames: 55115776. Throughput: 0: 53011.3. Samples: 56318100. Policy #0 lag: (min: 0.0, avg: 46.2, max: 90.0) [2024-03-20 23:04:55,522][03784] Avg episode reward: [(0, '0.028')] [2024-03-20 23:04:58,810][04017] Updated weights for policy 0, policy_version 1692 (0.0016) [2024-03-20 23:05:00,521][03784] Fps is (10 sec: 81919.8, 60 sec: 57344.2, 300 sec: 53540.1). Total num frames: 55574528. Throughput: 0: 52364.4. Samples: 56466600. Policy #0 lag: (min: 2.0, avg: 50.8, max: 94.0) [2024-03-20 23:05:00,522][03784] Avg episode reward: [(0, '0.028')] [2024-03-20 23:05:05,521][03784] Fps is (10 sec: 62259.0, 60 sec: 51882.7, 300 sec: 52539.9). Total num frames: 55738368. Throughput: 0: 52433.4. Samples: 56768500. Policy #0 lag: (min: 1.0, avg: 42.4, max: 68.0) [2024-03-20 23:05:05,522][03784] Avg episode reward: [(0, '0.009')] [2024-03-20 23:05:06,495][04017] Updated weights for policy 0, policy_version 1702 (0.0009) [2024-03-20 23:05:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 52975.0, 300 sec: 52206.6). Total num frames: 56000512. Throughput: 0: 52066.6. Samples: 57083300. 
Policy #0 lag: (min: 0.0, avg: 45.7, max: 81.0) [2024-03-20 23:05:10,531][03784] Avg episode reward: [(0, '0.028')] [2024-03-20 23:05:15,310][04017] Updated weights for policy 0, policy_version 1712 (0.0015) [2024-03-20 23:05:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 51336.6, 300 sec: 51873.4). Total num frames: 56098816. Throughput: 0: 52051.1. Samples: 57265500. Policy #0 lag: (min: 0.0, avg: 48.3, max: 92.0) [2024-03-20 23:05:15,522][03784] Avg episode reward: [(0, '0.033')] [2024-03-20 23:05:20,269][04017] Updated weights for policy 0, policy_version 1722 (0.0018) [2024-03-20 23:05:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 51882.6, 300 sec: 52428.8). Total num frames: 56459264. Throughput: 0: 51104.3. Samples: 57562500. Policy #0 lag: (min: 2.0, avg: 37.6, max: 87.0) [2024-03-20 23:05:20,522][03784] Avg episode reward: [(0, '0.068')] [2024-03-20 23:05:20,539][03995] Saving new best policy, reward=0.068! [2024-03-20 23:05:22,625][03995] Signal inference workers to stop experience collection... (1150 times) [2024-03-20 23:05:22,714][04017] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-03-20 23:05:22,829][03995] Signal inference workers to resume experience collection... (1150 times) [2024-03-20 23:05:22,829][04017] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-03-20 23:05:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 50244.3, 300 sec: 51762.3). Total num frames: 56557568. Throughput: 0: 51484.5. Samples: 57926400. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-20 23:05:25,522][03784] Avg episode reward: [(0, '0.068')] [2024-03-20 23:05:30,521][03784] Fps is (10 sec: 16384.1, 60 sec: 48605.9, 300 sec: 51318.0). Total num frames: 56623104. Throughput: 0: 51506.6. Samples: 58092900. 
Policy #0 lag: (min: 0.0, avg: 27.2, max: 77.0) [2024-03-20 23:05:30,522][03784] Avg episode reward: [(0, '0.068')] [2024-03-20 23:05:34,576][04017] Updated weights for policy 0, policy_version 1732 (0.0027) [2024-03-20 23:05:35,521][03784] Fps is (10 sec: 26214.3, 60 sec: 49698.1, 300 sec: 50984.8). Total num frames: 56819712. Throughput: 0: 53293.3. Samples: 58425000. Policy #0 lag: (min: 0.0, avg: 26.2, max: 80.0) [2024-03-20 23:05:35,522][03784] Avg episode reward: [(0, '0.068')] [2024-03-20 23:05:39,513][04017] Updated weights for policy 0, policy_version 1742 (0.0021) [2024-03-20 23:05:40,521][03784] Fps is (10 sec: 55705.2, 60 sec: 54613.3, 300 sec: 51318.0). Total num frames: 57180160. Throughput: 0: 52504.3. Samples: 58680800. Policy #0 lag: (min: 4.0, avg: 27.3, max: 89.0) [2024-03-20 23:05:40,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 23:05:42,502][04017] Updated weights for policy 0, policy_version 1752 (0.0023) [2024-03-20 23:05:45,521][03784] Fps is (10 sec: 81920.8, 60 sec: 56797.9, 300 sec: 52317.7). Total num frames: 57638912. Throughput: 0: 52020.1. Samples: 58807500. Policy #0 lag: (min: 7.0, avg: 42.7, max: 93.0) [2024-03-20 23:05:45,521][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 23:05:46,092][04017] Updated weights for policy 0, policy_version 1762 (0.0015) [2024-03-20 23:05:50,521][03784] Fps is (10 sec: 85196.5, 60 sec: 54613.2, 300 sec: 52762.0). Total num frames: 58032128. Throughput: 0: 51584.3. Samples: 59089800. Policy #0 lag: (min: 1.0, avg: 40.4, max: 65.0) [2024-03-20 23:05:50,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 23:05:50,684][04017] Updated weights for policy 0, policy_version 1772 (0.0009) [2024-03-20 23:05:55,521][03784] Fps is (10 sec: 62258.8, 60 sec: 52428.8, 300 sec: 52428.8). Total num frames: 58261504. Throughput: 0: 51406.8. Samples: 59396600. 
Policy #0 lag: (min: 2.0, avg: 44.8, max: 74.0) [2024-03-20 23:05:55,522][03784] Avg episode reward: [(0, '0.007')] [2024-03-20 23:06:00,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45329.0, 300 sec: 50873.7). Total num frames: 58294272. Throughput: 0: 51055.4. Samples: 59563000. Policy #0 lag: (min: 0.0, avg: 37.1, max: 68.0) [2024-03-20 23:06:00,522][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 23:06:00,948][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001781_58359808.pth... [2024-03-20 23:06:01,073][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001406_46071808.pth [2024-03-20 23:06:01,593][04017] Updated weights for policy 0, policy_version 1782 (0.0010) [2024-03-20 23:06:04,994][03995] Signal inference workers to stop experience collection... (1200 times) [2024-03-20 23:06:05,072][04017] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-03-20 23:06:05,248][03995] Signal inference workers to resume experience collection... (1200 times) [2024-03-20 23:06:05,248][04017] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-03-20 23:06:05,496][04017] Updated weights for policy 0, policy_version 1792 (0.0020) [2024-03-20 23:06:05,521][03784] Fps is (10 sec: 45874.6, 60 sec: 49698.1, 300 sec: 52095.6). Total num frames: 58720256. Throughput: 0: 51226.6. Samples: 59867700. Policy #0 lag: (min: 0.0, avg: 53.0, max: 111.0) [2024-03-20 23:06:05,523][03784] Avg episode reward: [(0, '0.023')] [2024-03-20 23:06:10,521][03784] Fps is (10 sec: 72090.6, 60 sec: 50244.3, 300 sec: 51873.4). Total num frames: 59015168. Throughput: 0: 49528.9. Samples: 60155200. 
Policy #0 lag: (min: 0.0, avg: 43.0, max: 84.0) [2024-03-20 23:06:10,527][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 23:06:15,038][04017] Updated weights for policy 0, policy_version 1802 (0.0021) [2024-03-20 23:06:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 49698.1, 300 sec: 51206.9). Total num frames: 59080704. Throughput: 0: 49680.0. Samples: 60328500. Policy #0 lag: (min: 0.0, avg: 40.8, max: 86.0) [2024-03-20 23:06:15,530][03784] Avg episode reward: [(0, '0.051')] [2024-03-20 23:06:20,521][03784] Fps is (10 sec: 19660.5, 60 sec: 45875.2, 300 sec: 50984.8). Total num frames: 59211776. Throughput: 0: 49233.2. Samples: 60640500. Policy #0 lag: (min: 2.0, avg: 23.3, max: 68.0) [2024-03-20 23:06:20,531][03784] Avg episode reward: [(0, '0.016')] [2024-03-20 23:06:22,420][04017] Updated weights for policy 0, policy_version 1812 (0.0014) [2024-03-20 23:06:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 49698.1, 300 sec: 51207.0). Total num frames: 59539456. Throughput: 0: 49562.3. Samples: 60911100. Policy #0 lag: (min: 0.0, avg: 26.8, max: 82.0) [2024-03-20 23:06:25,522][03784] Avg episode reward: [(0, '0.040')] [2024-03-20 23:06:27,069][04017] Updated weights for policy 0, policy_version 1822 (0.0012) [2024-03-20 23:06:30,521][03784] Fps is (10 sec: 65536.7, 60 sec: 54067.2, 300 sec: 51762.3). Total num frames: 59867136. Throughput: 0: 49995.4. Samples: 61057300. Policy #0 lag: (min: 1.0, avg: 36.2, max: 88.0) [2024-03-20 23:06:30,531][03784] Avg episode reward: [(0, '0.038')] [2024-03-20 23:06:32,787][04017] Updated weights for policy 0, policy_version 1832 (0.0021) [2024-03-20 23:06:35,521][03784] Fps is (10 sec: 65536.0, 60 sec: 56251.7, 300 sec: 51762.3). Total num frames: 60194816. Throughput: 0: 50544.5. Samples: 61364300. 
Policy #0 lag: (min: 2.0, avg: 36.3, max: 72.0) [2024-03-20 23:06:35,522][03784] Avg episode reward: [(0, '0.051')] [2024-03-20 23:06:40,154][04017] Updated weights for policy 0, policy_version 1842 (0.0012) [2024-03-20 23:06:40,521][03784] Fps is (10 sec: 49151.9, 60 sec: 52975.0, 300 sec: 51762.3). Total num frames: 60358656. Throughput: 0: 49982.1. Samples: 61645800. Policy #0 lag: (min: 0.0, avg: 37.6, max: 69.0) [2024-03-20 23:06:40,522][03784] Avg episode reward: [(0, '0.060')] [2024-03-20 23:06:45,521][03784] Fps is (10 sec: 19660.9, 60 sec: 45875.2, 300 sec: 50651.6). Total num frames: 60391424. Throughput: 0: 49869.1. Samples: 61807100. Policy #0 lag: (min: 0.0, avg: 37.6, max: 69.0) [2024-03-20 23:06:45,522][03784] Avg episode reward: [(0, '0.011')] [2024-03-20 23:06:48,619][04017] Updated weights for policy 0, policy_version 1852 (0.0017) [2024-03-20 23:06:50,159][03995] Signal inference workers to stop experience collection... (1250 times) [2024-03-20 23:06:50,160][03995] Signal inference workers to resume experience collection... (1250 times) [2024-03-20 23:06:50,262][04017] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-03-20 23:06:50,262][04017] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-03-20 23:06:50,521][03784] Fps is (10 sec: 49152.6, 60 sec: 46967.6, 300 sec: 50873.7). Total num frames: 60850176. Throughput: 0: 49186.9. Samples: 62081100. Policy #0 lag: (min: 4.0, avg: 51.6, max: 117.0) [2024-03-20 23:06:50,521][03784] Avg episode reward: [(0, '0.014')] [2024-03-20 23:06:52,097][04017] Updated weights for policy 0, policy_version 1862 (0.0017) [2024-03-20 23:06:55,521][03784] Fps is (10 sec: 72089.5, 60 sec: 47513.6, 300 sec: 50984.8). Total num frames: 61112320. Throughput: 0: 49264.4. Samples: 62372100. 
Policy #0 lag: (min: 3.0, avg: 43.6, max: 94.0) [2024-03-20 23:06:55,522][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 23:07:00,521][03784] Fps is (10 sec: 42597.2, 60 sec: 49698.1, 300 sec: 51095.8). Total num frames: 61276160. Throughput: 0: 49099.8. Samples: 62538000. Policy #0 lag: (min: 0.0, avg: 46.3, max: 95.0) [2024-03-20 23:07:00,523][03784] Avg episode reward: [(0, '0.060')] [2024-03-20 23:07:02,373][04017] Updated weights for policy 0, policy_version 1872 (0.0009) [2024-03-20 23:07:05,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45875.2, 300 sec: 50762.6). Total num frames: 61472768. Throughput: 0: 49693.4. Samples: 62876700. Policy #0 lag: (min: 0.0, avg: 31.7, max: 70.0) [2024-03-20 23:07:05,522][03784] Avg episode reward: [(0, '0.008')] [2024-03-20 23:07:07,692][04017] Updated weights for policy 0, policy_version 1882 (0.0009) [2024-03-20 23:07:10,521][03784] Fps is (10 sec: 62260.1, 60 sec: 48059.7, 300 sec: 50762.6). Total num frames: 61898752. Throughput: 0: 50020.0. Samples: 63162000. Policy #0 lag: (min: 1.0, avg: 48.0, max: 98.0) [2024-03-20 23:07:10,522][03784] Avg episode reward: [(0, '0.044')] [2024-03-20 23:07:11,254][04017] Updated weights for policy 0, policy_version 1892 (0.0011) [2024-03-20 23:07:15,521][03784] Fps is (10 sec: 72090.5, 60 sec: 51882.7, 300 sec: 51207.0). Total num frames: 62193664. Throughput: 0: 50249.0. Samples: 63318500. Policy #0 lag: (min: 1.0, avg: 56.4, max: 105.0) [2024-03-20 23:07:15,522][03784] Avg episode reward: [(0, '0.014')] [2024-03-20 23:07:17,545][04017] Updated weights for policy 0, policy_version 1902 (0.0013) [2024-03-20 23:07:20,521][03784] Fps is (10 sec: 58982.4, 60 sec: 54613.4, 300 sec: 51873.4). Total num frames: 62488576. Throughput: 0: 50502.2. Samples: 63636900. 
Policy #0 lag: (min: 1.0, avg: 40.4, max: 76.0) [2024-03-20 23:07:20,522][03784] Avg episode reward: [(0, '0.014')] [2024-03-20 23:07:23,896][04017] Updated weights for policy 0, policy_version 1912 (0.0011) [2024-03-20 23:07:25,521][03784] Fps is (10 sec: 58981.9, 60 sec: 54067.2, 300 sec: 51651.2). Total num frames: 62783488. Throughput: 0: 50131.1. Samples: 63901700. Policy #0 lag: (min: 0.0, avg: 42.0, max: 81.0) [2024-03-20 23:07:25,522][03784] Avg episode reward: [(0, '0.048')] [2024-03-20 23:07:30,521][03784] Fps is (10 sec: 45874.7, 60 sec: 51336.4, 300 sec: 51318.0). Total num frames: 62947328. Throughput: 0: 49890.9. Samples: 64052200. Policy #0 lag: (min: 0.0, avg: 34.7, max: 78.0) [2024-03-20 23:07:30,522][03784] Avg episode reward: [(0, '0.068')] [2024-03-20 23:07:30,890][04017] Updated weights for policy 0, policy_version 1922 (0.0023) [2024-03-20 23:07:35,521][03784] Fps is (10 sec: 32768.2, 60 sec: 48605.9, 300 sec: 51206.9). Total num frames: 63111168. Throughput: 0: 50986.6. Samples: 64375500. Policy #0 lag: (min: 1.0, avg: 36.0, max: 78.0) [2024-03-20 23:07:35,522][03784] Avg episode reward: [(0, '0.025')] [2024-03-20 23:07:40,521][03784] Fps is (10 sec: 32768.6, 60 sec: 48605.9, 300 sec: 51540.2). Total num frames: 63275008. Throughput: 0: 51786.7. Samples: 64702500. Policy #0 lag: (min: 0.0, avg: 32.3, max: 69.0) [2024-03-20 23:07:40,530][03784] Avg episode reward: [(0, '0.045')] [2024-03-20 23:07:41,749][04017] Updated weights for policy 0, policy_version 1932 (0.0011) [2024-03-20 23:07:45,521][03784] Fps is (10 sec: 29491.2, 60 sec: 50244.2, 300 sec: 50540.5). Total num frames: 63406080. Throughput: 0: 52126.9. Samples: 64883700. Policy #0 lag: (min: 0.0, avg: 32.3, max: 75.0) [2024-03-20 23:07:45,522][03784] Avg episode reward: [(0, '0.018')] [2024-03-20 23:07:48,031][03995] Signal inference workers to stop experience collection... (1300 times) [2024-03-20 23:07:48,096][03995] Signal inference workers to resume experience collection... 
(1300 times) [2024-03-20 23:07:48,154][04017] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-03-20 23:07:48,220][04017] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-03-20 23:07:49,889][04017] Updated weights for policy 0, policy_version 1942 (0.0011) [2024-03-20 23:07:50,521][03784] Fps is (10 sec: 42598.4, 60 sec: 47513.5, 300 sec: 50207.3). Total num frames: 63700992. Throughput: 0: 52620.1. Samples: 65244600. Policy #0 lag: (min: 0.0, avg: 26.8, max: 91.0) [2024-03-20 23:07:50,522][03784] Avg episode reward: [(0, '0.018')] [2024-03-20 23:07:53,622][04017] Updated weights for policy 0, policy_version 1952 (0.0022) [2024-03-20 23:07:55,521][03784] Fps is (10 sec: 62258.7, 60 sec: 48605.8, 300 sec: 50540.5). Total num frames: 64028672. Throughput: 0: 52902.2. Samples: 65542600. Policy #0 lag: (min: 5.0, avg: 30.7, max: 98.0) [2024-03-20 23:07:55,531][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 23:07:58,138][04017] Updated weights for policy 0, policy_version 1962 (0.0015) [2024-03-20 23:08:00,521][03784] Fps is (10 sec: 78642.4, 60 sec: 53521.2, 300 sec: 51540.2). Total num frames: 64487424. Throughput: 0: 52324.3. Samples: 65673100. Policy #0 lag: (min: 1.0, avg: 37.5, max: 84.0) [2024-03-20 23:08:00,522][03784] Avg episode reward: [(0, '0.013')] [2024-03-20 23:08:00,931][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001970_64552960.pth... [2024-03-20 23:08:01,041][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001593_52199424.pth [2024-03-20 23:08:02,091][04017] Updated weights for policy 0, policy_version 1972 (0.0019) [2024-03-20 23:08:05,521][03784] Fps is (10 sec: 88474.2, 60 sec: 57344.1, 300 sec: 51651.3). Total num frames: 64913408. Throughput: 0: 51186.7. Samples: 65940300. 
Policy #0 lag: (min: 3.0, avg: 51.1, max: 112.0) [2024-03-20 23:08:05,522][03784] Avg episode reward: [(0, '0.017')] [2024-03-20 23:08:05,653][04017] Updated weights for policy 0, policy_version 1982 (0.0028) [2024-03-20 23:08:10,358][04017] Updated weights for policy 0, policy_version 1992 (0.0014) [2024-03-20 23:08:10,521][03784] Fps is (10 sec: 78643.6, 60 sec: 56251.7, 300 sec: 51873.4). Total num frames: 65273856. Throughput: 0: 51491.1. Samples: 66218800. Policy #0 lag: (min: 0.0, avg: 50.4, max: 86.0) [2024-03-20 23:08:10,522][03784] Avg episode reward: [(0, '0.031')] [2024-03-20 23:08:15,521][03784] Fps is (10 sec: 55705.6, 60 sec: 54613.3, 300 sec: 51873.4). Total num frames: 65470464. Throughput: 0: 52151.3. Samples: 66399000. Policy #0 lag: (min: 0.0, avg: 45.7, max: 77.0) [2024-03-20 23:08:15,522][03784] Avg episode reward: [(0, '0.011')] [2024-03-20 23:08:20,521][03784] Fps is (10 sec: 29491.3, 60 sec: 51336.6, 300 sec: 51762.3). Total num frames: 65568768. Throughput: 0: 52866.6. Samples: 66754500. Policy #0 lag: (min: 0.0, avg: 49.1, max: 88.0) [2024-03-20 23:08:20,522][03784] Avg episode reward: [(0, '0.014')] [2024-03-20 23:08:20,616][04017] Updated weights for policy 0, policy_version 2002 (0.0019) [2024-03-20 23:08:22,701][03995] Signal inference workers to stop experience collection... (1350 times) [2024-03-20 23:08:22,702][03995] Signal inference workers to resume experience collection... (1350 times) [2024-03-20 23:08:22,761][04017] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-03-20 23:08:22,761][04017] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-03-20 23:08:25,521][03784] Fps is (10 sec: 36044.4, 60 sec: 50790.4, 300 sec: 51540.2). Total num frames: 65830912. Throughput: 0: 53357.6. Samples: 67103600. 
Policy #0 lag: (min: 0.0, avg: 39.6, max: 78.0) [2024-03-20 23:08:25,522][03784] Avg episode reward: [(0, '0.014')] [2024-03-20 23:08:29,457][04017] Updated weights for policy 0, policy_version 2012 (0.0010) [2024-03-20 23:08:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 50790.5, 300 sec: 51206.9). Total num frames: 65994752. Throughput: 0: 53342.2. Samples: 67284100. Policy #0 lag: (min: 0.0, avg: 38.2, max: 89.0) [2024-03-20 23:08:30,522][03784] Avg episode reward: [(0, '0.043')] [2024-03-20 23:08:35,521][03784] Fps is (10 sec: 29491.2, 60 sec: 50244.2, 300 sec: 51206.9). Total num frames: 66125824. Throughput: 0: 52624.3. Samples: 67612700. Policy #0 lag: (min: 0.0, avg: 24.0, max: 79.0) [2024-03-20 23:08:35,522][03784] Avg episode reward: [(0, '0.061')] [2024-03-20 23:08:37,168][04017] Updated weights for policy 0, policy_version 2022 (0.0022) [2024-03-20 23:08:40,521][03784] Fps is (10 sec: 36044.2, 60 sec: 51336.3, 300 sec: 51429.0). Total num frames: 66355200. Throughput: 0: 53835.4. Samples: 67965200. Policy #0 lag: (min: 1.0, avg: 29.2, max: 86.0) [2024-03-20 23:08:40,522][03784] Avg episode reward: [(0, '0.035')] [2024-03-20 23:08:42,801][04017] Updated weights for policy 0, policy_version 2032 (0.0022) [2024-03-20 23:08:45,521][03784] Fps is (10 sec: 62259.5, 60 sec: 55705.5, 300 sec: 52206.6). Total num frames: 66748416. Throughput: 0: 53766.7. Samples: 68092600. Policy #0 lag: (min: 5.0, avg: 31.4, max: 63.0) [2024-03-20 23:08:45,522][03784] Avg episode reward: [(0, '0.035')] [2024-03-20 23:08:47,434][04017] Updated weights for policy 0, policy_version 2042 (0.0011) [2024-03-20 23:08:50,521][03784] Fps is (10 sec: 68813.5, 60 sec: 55705.5, 300 sec: 51762.3). Total num frames: 67043328. Throughput: 0: 55097.7. Samples: 68419700. 
Policy #0 lag: (min: 1.0, avg: 33.4, max: 66.0) [2024-03-20 23:08:50,522][03784] Avg episode reward: [(0, '0.032')] [2024-03-20 23:08:52,242][04017] Updated weights for policy 0, policy_version 2052 (0.0012) [2024-03-20 23:08:55,411][04017] Updated weights for policy 0, policy_version 2062 (0.0010) [2024-03-20 23:08:55,521][03784] Fps is (10 sec: 81919.5, 60 sec: 58982.4, 300 sec: 52317.7). Total num frames: 67567616. Throughput: 0: 54551.0. Samples: 68673600. Policy #0 lag: (min: 7.0, avg: 44.3, max: 75.0) [2024-03-20 23:08:55,522][03784] Avg episode reward: [(0, '0.034')] [2024-03-20 23:08:56,951][03995] Signal inference workers to stop experience collection... (1400 times) [2024-03-20 23:08:57,007][04017] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-03-20 23:08:57,197][03995] Signal inference workers to resume experience collection... (1400 times) [2024-03-20 23:08:57,198][04017] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-03-20 23:08:59,789][04017] Updated weights for policy 0, policy_version 2072 (0.0014) [2024-03-20 23:09:00,521][03784] Fps is (10 sec: 88472.9, 60 sec: 57343.9, 300 sec: 51873.4). Total num frames: 67928064. Throughput: 0: 53453.1. Samples: 68804400. Policy #0 lag: (min: 0.0, avg: 47.3, max: 103.0) [2024-03-20 23:09:00,522][03784] Avg episode reward: [(0, '0.048')] [2024-03-20 23:09:05,521][03784] Fps is (10 sec: 52429.6, 60 sec: 52975.0, 300 sec: 51762.4). Total num frames: 68091904. Throughput: 0: 53400.1. Samples: 69157500. Policy #0 lag: (min: 0.0, avg: 46.8, max: 89.0) [2024-03-20 23:09:05,530][03784] Avg episode reward: [(0, '0.048')] [2024-03-20 23:09:07,707][04017] Updated weights for policy 0, policy_version 2082 (0.0011) [2024-03-20 23:09:10,521][03784] Fps is (10 sec: 42598.9, 60 sec: 51336.5, 300 sec: 51984.5). Total num frames: 68354048. Throughput: 0: 52620.1. Samples: 69471500. 
Policy #0 lag: (min: 0.0, avg: 47.6, max: 86.0) [2024-03-20 23:09:10,531][03784] Avg episode reward: [(0, '0.024')] [2024-03-20 23:09:15,521][03784] Fps is (10 sec: 39321.3, 60 sec: 50244.2, 300 sec: 51318.0). Total num frames: 68485120. Throughput: 0: 52351.1. Samples: 69639900. Policy #0 lag: (min: 0.0, avg: 33.1, max: 80.0) [2024-03-20 23:09:15,522][03784] Avg episode reward: [(0, '0.031')] [2024-03-20 23:09:17,249][04017] Updated weights for policy 0, policy_version 2092 (0.0019) [2024-03-20 23:09:20,521][03784] Fps is (10 sec: 22937.4, 60 sec: 50244.2, 300 sec: 50984.8). Total num frames: 68583424. Throughput: 0: 52637.8. Samples: 69981400. Policy #0 lag: (min: 0.0, avg: 33.1, max: 80.0) [2024-03-20 23:09:20,531][03784] Avg episode reward: [(0, '0.031')] [2024-03-20 23:09:25,521][03784] Fps is (10 sec: 36045.1, 60 sec: 50244.4, 300 sec: 51318.0). Total num frames: 68845568. Throughput: 0: 51220.3. Samples: 70270100. Policy #0 lag: (min: 1.0, avg: 31.2, max: 79.0) [2024-03-20 23:09:25,521][03784] Avg episode reward: [(0, '0.043')] [2024-03-20 23:09:25,621][04017] Updated weights for policy 0, policy_version 2102 (0.0020) [2024-03-20 23:09:30,521][03784] Fps is (10 sec: 55705.9, 60 sec: 52428.8, 300 sec: 51873.4). Total num frames: 69140480. Throughput: 0: 51288.9. Samples: 70400600. Policy #0 lag: (min: 0.0, avg: 36.8, max: 87.0) [2024-03-20 23:09:30,522][03784] Avg episode reward: [(0, '0.065')] [2024-03-20 23:09:32,057][04017] Updated weights for policy 0, policy_version 2112 (0.0022) [2024-03-20 23:09:35,521][03784] Fps is (10 sec: 45873.9, 60 sec: 52974.8, 300 sec: 52206.6). Total num frames: 69304320. Throughput: 0: 51555.4. Samples: 70739700. Policy #0 lag: (min: 1.0, avg: 25.5, max: 66.0) [2024-03-20 23:09:35,523][03784] Avg episode reward: [(0, '0.037')] [2024-03-20 23:09:38,213][04017] Updated weights for policy 0, policy_version 2122 (0.0010) [2024-03-20 23:09:40,521][03784] Fps is (10 sec: 55705.8, 60 sec: 55705.8, 300 sec: 52428.8). 
Total num frames: 69697536. Throughput: 0: 52235.7. Samples: 71024200. Policy #0 lag: (min: 0.0, avg: 39.5, max: 88.0) [2024-03-20 23:09:40,522][03784] Avg episode reward: [(0, '0.026')] [2024-03-20 23:09:42,763][04017] Updated weights for policy 0, policy_version 2132 (0.0010) [2024-03-20 23:09:45,521][03784] Fps is (10 sec: 68814.1, 60 sec: 54067.2, 300 sec: 51651.2). Total num frames: 69992448. Throughput: 0: 52946.8. Samples: 71187000. Policy #0 lag: (min: 1.0, avg: 41.7, max: 92.0) [2024-03-20 23:09:45,522][03784] Avg episode reward: [(0, '0.064')] [2024-03-20 23:09:47,192][04017] Updated weights for policy 0, policy_version 2142 (0.0013) [2024-03-20 23:09:50,521][03784] Fps is (10 sec: 58981.8, 60 sec: 54067.2, 300 sec: 51429.1). Total num frames: 70287360. Throughput: 0: 51670.9. Samples: 71482700. Policy #0 lag: (min: 5.0, avg: 47.4, max: 95.0) [2024-03-20 23:09:50,522][03784] Avg episode reward: [(0, '0.056')] [2024-03-20 23:09:50,820][03995] Signal inference workers to stop experience collection... (1450 times) [2024-03-20 23:09:50,869][04017] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-03-20 23:09:51,064][03995] Signal inference workers to resume experience collection... (1450 times) [2024-03-20 23:09:51,065][04017] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-03-20 23:09:54,943][04017] Updated weights for policy 0, policy_version 2152 (0.0021) [2024-03-20 23:09:55,521][03784] Fps is (10 sec: 58982.4, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 70582272. Throughput: 0: 51880.0. Samples: 71806100. Policy #0 lag: (min: 0.0, avg: 48.5, max: 108.0) [2024-03-20 23:09:55,522][03784] Avg episode reward: [(0, '0.058')] [2024-03-20 23:09:58,213][04017] Updated weights for policy 0, policy_version 2162 (0.0016) [2024-03-20 23:10:00,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48605.9, 300 sec: 51206.9). Total num frames: 70844416. Throughput: 0: 50928.7. Samples: 71931700. 
Policy #0 lag: (min: 0.0, avg: 48.5, max: 108.0) [2024-03-20 23:10:00,522][03784] Avg episode reward: [(0, '0.032')] [2024-03-20 23:10:01,114][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002164_70909952.pth... [2024-03-20 23:10:01,164][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001781_58359808.pth [2024-03-20 23:10:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 51206.9). Total num frames: 71106560. Throughput: 0: 50309.0. Samples: 72245300. Policy #0 lag: (min: 0.0, avg: 35.8, max: 78.0) [2024-03-20 23:10:05,522][03784] Avg episode reward: [(0, '0.071')] [2024-03-20 23:10:05,523][03995] Saving new best policy, reward=0.071! [2024-03-20 23:10:08,251][04017] Updated weights for policy 0, policy_version 2172 (0.0010) [2024-03-20 23:10:10,521][03784] Fps is (10 sec: 49152.5, 60 sec: 49698.1, 300 sec: 51651.2). Total num frames: 71335936. Throughput: 0: 50546.5. Samples: 72544700. Policy #0 lag: (min: 1.0, avg: 33.4, max: 73.0) [2024-03-20 23:10:10,522][03784] Avg episode reward: [(0, '0.049')] [2024-03-20 23:10:12,567][04017] Updated weights for policy 0, policy_version 2182 (0.0010) [2024-03-20 23:10:15,521][03784] Fps is (10 sec: 45875.4, 60 sec: 51336.6, 300 sec: 51207.0). Total num frames: 71565312. Throughput: 0: 50937.9. Samples: 72692800. Policy #0 lag: (min: 1.0, avg: 37.4, max: 78.0) [2024-03-20 23:10:15,522][03784] Avg episode reward: [(0, '0.036')] [2024-03-20 23:10:20,521][03784] Fps is (10 sec: 42598.5, 60 sec: 52975.0, 300 sec: 51540.2). Total num frames: 71761920. Throughput: 0: 50280.2. Samples: 73002300. Policy #0 lag: (min: 0.0, avg: 33.8, max: 77.0) [2024-03-20 23:10:20,522][03784] Avg episode reward: [(0, '0.066')] [2024-03-20 23:10:22,031][04017] Updated weights for policy 0, policy_version 2192 (0.0018) [2024-03-20 23:10:25,521][03784] Fps is (10 sec: 45874.2, 60 sec: 52974.7, 300 sec: 52206.6). Total num frames: 72024064. Throughput: 0: 50502.0. 
Samples: 73296800. Policy #0 lag: (min: 1.0, avg: 28.7, max: 59.0) [2024-03-20 23:10:25,522][03784] Avg episode reward: [(0, '0.086')] [2024-03-20 23:10:25,873][03995] Saving new best policy, reward=0.086! [2024-03-20 23:10:29,536][04017] Updated weights for policy 0, policy_version 2202 (0.0010) [2024-03-20 23:10:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 50790.4, 300 sec: 52095.6). Total num frames: 72187904. Throughput: 0: 50311.1. Samples: 73451000. Policy #0 lag: (min: 0.0, avg: 32.0, max: 72.0) [2024-03-20 23:10:30,522][03784] Avg episode reward: [(0, '0.070')] [2024-03-20 23:10:35,257][04017] Updated weights for policy 0, policy_version 2212 (0.0010) [2024-03-20 23:10:35,521][03784] Fps is (10 sec: 45876.3, 60 sec: 52975.2, 300 sec: 51873.4). Total num frames: 72482816. Throughput: 0: 50478.0. Samples: 73754200. Policy #0 lag: (min: 3.0, avg: 35.2, max: 77.0) [2024-03-20 23:10:35,522][03784] Avg episode reward: [(0, '0.118')] [2024-03-20 23:10:35,835][03995] Saving new best policy, reward=0.118! [2024-03-20 23:10:40,521][03784] Fps is (10 sec: 52429.2, 60 sec: 50244.3, 300 sec: 51095.9). Total num frames: 72712192. Throughput: 0: 50086.7. Samples: 74060000. Policy #0 lag: (min: 0.0, avg: 44.9, max: 100.0) [2024-03-20 23:10:40,522][03784] Avg episode reward: [(0, '0.039')] [2024-03-20 23:10:42,330][04017] Updated weights for policy 0, policy_version 2222 (0.0015) [2024-03-20 23:10:43,539][03995] Signal inference workers to stop experience collection... (1500 times) [2024-03-20 23:10:43,540][03995] Signal inference workers to resume experience collection... (1500 times) [2024-03-20 23:10:43,611][04017] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-03-20 23:10:43,612][04017] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-03-20 23:10:45,521][03784] Fps is (10 sec: 45874.1, 60 sec: 49151.9, 300 sec: 50540.5). Total num frames: 72941568. Throughput: 0: 50748.8. Samples: 74215400. 
Policy #0 lag: (min: 0.0, avg: 27.0, max: 56.0) [2024-03-20 23:10:45,522][03784] Avg episode reward: [(0, '0.073')] [2024-03-20 23:10:48,922][04017] Updated weights for policy 0, policy_version 2232 (0.0015) [2024-03-20 23:10:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 50762.6). Total num frames: 73236480. Throughput: 0: 51226.7. Samples: 74550500. Policy #0 lag: (min: 2.0, avg: 42.8, max: 98.0) [2024-03-20 23:10:50,522][03784] Avg episode reward: [(0, '0.038')] [2024-03-20 23:10:53,021][04017] Updated weights for policy 0, policy_version 2242 (0.0009) [2024-03-20 23:10:55,521][03784] Fps is (10 sec: 55706.9, 60 sec: 48605.9, 300 sec: 51540.2). Total num frames: 73498624. Throughput: 0: 51677.9. Samples: 74870200. Policy #0 lag: (min: 2.0, avg: 42.8, max: 98.0) [2024-03-20 23:10:55,522][03784] Avg episode reward: [(0, '0.052')] [2024-03-20 23:10:59,818][04017] Updated weights for policy 0, policy_version 2252 (0.0021) [2024-03-20 23:11:00,521][03784] Fps is (10 sec: 62258.5, 60 sec: 50244.3, 300 sec: 51318.0). Total num frames: 73859072. Throughput: 0: 52075.4. Samples: 75036200. Policy #0 lag: (min: 1.0, avg: 37.9, max: 70.0) [2024-03-20 23:11:00,522][03784] Avg episode reward: [(0, '0.052')] [2024-03-20 23:11:04,902][04017] Updated weights for policy 0, policy_version 2262 (0.0023) [2024-03-20 23:11:05,521][03784] Fps is (10 sec: 65535.1, 60 sec: 50790.3, 300 sec: 51318.0). Total num frames: 74153984. Throughput: 0: 51924.4. Samples: 75338900. Policy #0 lag: (min: 0.0, avg: 52.2, max: 104.0) [2024-03-20 23:11:05,522][03784] Avg episode reward: [(0, '0.055')] [2024-03-20 23:11:10,521][03784] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 51873.4). Total num frames: 74383360. Throughput: 0: 51824.7. Samples: 75628900. 
Policy #0 lag: (min: 2.0, avg: 40.8, max: 80.0) [2024-03-20 23:11:10,522][03784] Avg episode reward: [(0, '0.016')] [2024-03-20 23:11:11,995][04017] Updated weights for policy 0, policy_version 2272 (0.0012) [2024-03-20 23:11:15,521][03784] Fps is (10 sec: 58983.0, 60 sec: 52974.9, 300 sec: 52651.0). Total num frames: 74743808. Throughput: 0: 51553.4. Samples: 75770900. Policy #0 lag: (min: 5.0, avg: 44.0, max: 103.0) [2024-03-20 23:11:15,522][03784] Avg episode reward: [(0, '0.139')] [2024-03-20 23:11:15,522][03995] Saving new best policy, reward=0.139! [2024-03-20 23:11:16,138][04017] Updated weights for policy 0, policy_version 2282 (0.0010) [2024-03-20 23:11:20,521][03784] Fps is (10 sec: 55705.5, 60 sec: 52975.0, 300 sec: 52206.6). Total num frames: 74940416. Throughput: 0: 51568.9. Samples: 76074800. Policy #0 lag: (min: 2.0, avg: 42.6, max: 80.0) [2024-03-20 23:11:20,522][03784] Avg episode reward: [(0, '0.017')] [2024-03-20 23:11:22,782][04017] Updated weights for policy 0, policy_version 2292 (0.0015) [2024-03-20 23:11:25,521][03784] Fps is (10 sec: 49152.2, 60 sec: 53521.3, 300 sec: 52095.6). Total num frames: 75235328. Throughput: 0: 51748.9. Samples: 76388700. Policy #0 lag: (min: 1.0, avg: 37.4, max: 74.0) [2024-03-20 23:11:25,522][03784] Avg episode reward: [(0, '0.017')] [2024-03-20 23:11:26,811][03995] Signal inference workers to stop experience collection... (1550 times) [2024-03-20 23:11:26,857][04017] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-03-20 23:11:27,129][03995] Signal inference workers to resume experience collection... (1550 times) [2024-03-20 23:11:27,129][04017] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-03-20 23:11:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 53521.1, 300 sec: 51540.2). Total num frames: 75399168. Throughput: 0: 51273.6. Samples: 76522700. 
Policy #0 lag: (min: 0.0, avg: 43.1, max: 93.0) [2024-03-20 23:11:30,522][03784] Avg episode reward: [(0, '0.053')] [2024-03-20 23:11:32,820][04017] Updated weights for policy 0, policy_version 2302 (0.0011) [2024-03-20 23:11:35,521][03784] Fps is (10 sec: 29491.0, 60 sec: 50790.4, 300 sec: 51429.1). Total num frames: 75530240. Throughput: 0: 51588.8. Samples: 76872000. Policy #0 lag: (min: 0.0, avg: 29.8, max: 70.0) [2024-03-20 23:11:35,522][03784] Avg episode reward: [(0, '0.058')] [2024-03-20 23:11:40,521][03784] Fps is (10 sec: 19661.0, 60 sec: 48059.8, 300 sec: 51540.2). Total num frames: 75595776. Throughput: 0: 52284.5. Samples: 77223000. Policy #0 lag: (min: 0.0, avg: 29.8, max: 70.0) [2024-03-20 23:11:40,521][03784] Avg episode reward: [(0, '0.112')] [2024-03-20 23:11:41,958][04017] Updated weights for policy 0, policy_version 2312 (0.0018) [2024-03-20 23:11:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 50244.4, 300 sec: 51206.9). Total num frames: 75956224. Throughput: 0: 51500.0. Samples: 77353700. Policy #0 lag: (min: 2.0, avg: 39.3, max: 93.0) [2024-03-20 23:11:45,522][03784] Avg episode reward: [(0, '0.081')] [2024-03-20 23:11:48,380][04017] Updated weights for policy 0, policy_version 2322 (0.0014) [2024-03-20 23:11:50,521][03784] Fps is (10 sec: 62258.2, 60 sec: 49698.0, 300 sec: 51206.9). Total num frames: 76218368. Throughput: 0: 51595.6. Samples: 77660700. Policy #0 lag: (min: 0.0, avg: 31.9, max: 78.0) [2024-03-20 23:11:50,523][03784] Avg episode reward: [(0, '0.018')] [2024-03-20 23:11:53,198][04017] Updated weights for policy 0, policy_version 2332 (0.0010) [2024-03-20 23:11:55,521][03784] Fps is (10 sec: 68812.5, 60 sec: 52428.7, 300 sec: 52095.6). Total num frames: 76644352. Throughput: 0: 51713.2. Samples: 77956000. 
Policy #0 lag: (min: 0.0, avg: 31.2, max: 64.0) [2024-03-20 23:11:55,522][03784] Avg episode reward: [(0, '0.083')] [2024-03-20 23:11:56,316][04017] Updated weights for policy 0, policy_version 2342 (0.0014) [2024-03-20 23:11:59,447][04017] Updated weights for policy 0, policy_version 2352 (0.0016) [2024-03-20 23:12:00,521][03784] Fps is (10 sec: 91750.5, 60 sec: 54613.3, 300 sec: 53095.3). Total num frames: 77135872. Throughput: 0: 51177.7. Samples: 78073900. Policy #0 lag: (min: 2.0, avg: 43.9, max: 78.0) [2024-03-20 23:12:00,531][03784] Avg episode reward: [(0, '0.105')] [2024-03-20 23:12:00,547][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002354_77135872.pth... [2024-03-20 23:12:00,660][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000001970_64552960.pth [2024-03-20 23:12:05,521][03784] Fps is (10 sec: 72089.3, 60 sec: 53521.0, 300 sec: 52428.8). Total num frames: 77365248. Throughput: 0: 50873.2. Samples: 78364100. Policy #0 lag: (min: 0.0, avg: 41.9, max: 72.0) [2024-03-20 23:12:05,522][03784] Avg episode reward: [(0, '0.128')] [2024-03-20 23:12:05,747][04017] Updated weights for policy 0, policy_version 2362 (0.0016) [2024-03-20 23:12:05,793][03995] Signal inference workers to stop experience collection... (1600 times) [2024-03-20 23:12:05,918][04017] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-03-20 23:12:06,003][03995] Signal inference workers to resume experience collection... (1600 times) [2024-03-20 23:12:06,003][04017] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-03-20 23:12:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 53521.0, 300 sec: 52206.6). Total num frames: 77594624. Throughput: 0: 51255.5. Samples: 78695200. 
Policy #0 lag: (min: 0.0, avg: 49.3, max: 90.0) [2024-03-20 23:12:10,522][03784] Avg episode reward: [(0, '0.019')] [2024-03-20 23:12:12,535][04017] Updated weights for policy 0, policy_version 2372 (0.0021) [2024-03-20 23:12:15,521][03784] Fps is (10 sec: 45875.7, 60 sec: 51336.5, 300 sec: 51984.5). Total num frames: 77824000. Throughput: 0: 51513.3. Samples: 78840800. Policy #0 lag: (min: 0.0, avg: 44.2, max: 78.0) [2024-03-20 23:12:15,530][03784] Avg episode reward: [(0, '0.039')] [2024-03-20 23:12:19,771][04017] Updated weights for policy 0, policy_version 2382 (0.0016) [2024-03-20 23:12:20,521][03784] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 51984.5). Total num frames: 78118912. Throughput: 0: 51151.1. Samples: 79173800. Policy #0 lag: (min: 3.0, avg: 40.6, max: 82.0) [2024-03-20 23:12:20,522][03784] Avg episode reward: [(0, '0.051')] [2024-03-20 23:12:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 49698.1, 300 sec: 51762.4). Total num frames: 78217216. Throughput: 0: 50997.7. Samples: 79517900. Policy #0 lag: (min: 0.0, avg: 32.3, max: 83.0) [2024-03-20 23:12:25,522][03784] Avg episode reward: [(0, '0.018')] [2024-03-20 23:12:30,521][03784] Fps is (10 sec: 16383.9, 60 sec: 48059.7, 300 sec: 51429.1). Total num frames: 78282752. Throughput: 0: 52188.9. Samples: 79702200. Policy #0 lag: (min: 0.0, avg: 30.3, max: 82.0) [2024-03-20 23:12:30,522][03784] Avg episode reward: [(0, '0.050')] [2024-03-20 23:12:32,912][04017] Updated weights for policy 0, policy_version 2392 (0.0013) [2024-03-20 23:12:35,521][03784] Fps is (10 sec: 39321.4, 60 sec: 51336.5, 300 sec: 51984.5). Total num frames: 78610432. Throughput: 0: 52517.8. Samples: 80024000. Policy #0 lag: (min: 1.0, avg: 24.9, max: 86.0) [2024-03-20 23:12:35,522][03784] Avg episode reward: [(0, '0.142')] [2024-03-20 23:12:35,652][03995] Saving new best policy, reward=0.142! 
[2024-03-20 23:12:37,506][04017] Updated weights for policy 0, policy_version 2402 (0.0009) [2024-03-20 23:12:40,521][03784] Fps is (10 sec: 62259.5, 60 sec: 55159.4, 300 sec: 52539.9). Total num frames: 78905344. Throughput: 0: 52831.2. Samples: 80333400. Policy #0 lag: (min: 0.0, avg: 32.4, max: 87.0) [2024-03-20 23:12:40,522][03784] Avg episode reward: [(0, '0.142')] [2024-03-20 23:12:41,907][04017] Updated weights for policy 0, policy_version 2412 (0.0031) [2024-03-20 23:12:45,521][03784] Fps is (10 sec: 62259.0, 60 sec: 54613.3, 300 sec: 52650.9). Total num frames: 79233024. Throughput: 0: 53284.4. Samples: 80471700. Policy #0 lag: (min: 1.0, avg: 32.8, max: 62.0) [2024-03-20 23:12:45,522][03784] Avg episode reward: [(0, '0.053')] [2024-03-20 23:12:47,389][04017] Updated weights for policy 0, policy_version 2422 (0.0009) [2024-03-20 23:12:50,521][03784] Fps is (10 sec: 72089.9, 60 sec: 56798.0, 300 sec: 52873.1). Total num frames: 79626240. Throughput: 0: 53435.7. Samples: 80768700. Policy #0 lag: (min: 1.0, avg: 44.1, max: 91.0) [2024-03-20 23:12:50,522][03784] Avg episode reward: [(0, '0.021')] [2024-03-20 23:12:51,079][04017] Updated weights for policy 0, policy_version 2432 (0.0011) [2024-03-20 23:12:54,120][03995] Signal inference workers to stop experience collection... (1650 times) [2024-03-20 23:12:54,165][04017] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-03-20 23:12:54,437][03995] Signal inference workers to resume experience collection... (1650 times) [2024-03-20 23:12:54,437][04017] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-03-20 23:12:55,521][03784] Fps is (10 sec: 68813.9, 60 sec: 54613.5, 300 sec: 52317.8). Total num frames: 79921152. Throughput: 0: 52540.1. Samples: 81059500. 
Policy #0 lag: (min: 0.0, avg: 40.7, max: 89.0) [2024-03-20 23:12:55,522][03784] Avg episode reward: [(0, '0.037')] [2024-03-20 23:12:56,740][04017] Updated weights for policy 0, policy_version 2442 (0.0018) [2024-03-20 23:13:00,521][03784] Fps is (10 sec: 65535.3, 60 sec: 52428.8, 300 sec: 52095.6). Total num frames: 80281600. Throughput: 0: 52757.7. Samples: 81214900. Policy #0 lag: (min: 3.0, avg: 41.0, max: 75.0) [2024-03-20 23:13:00,522][03784] Avg episode reward: [(0, '0.169')] [2024-03-20 23:13:00,534][03995] Saving new best policy, reward=0.169! [2024-03-20 23:13:02,874][04017] Updated weights for policy 0, policy_version 2452 (0.0016) [2024-03-20 23:13:05,521][03784] Fps is (10 sec: 58981.9, 60 sec: 52428.9, 300 sec: 51651.3). Total num frames: 80510976. Throughput: 0: 52580.0. Samples: 81539900. Policy #0 lag: (min: 0.0, avg: 47.0, max: 83.0) [2024-03-20 23:13:05,522][03784] Avg episode reward: [(0, '0.169')] [2024-03-20 23:13:09,362][04017] Updated weights for policy 0, policy_version 2462 (0.0016) [2024-03-20 23:13:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 52974.9, 300 sec: 51873.4). Total num frames: 80773120. Throughput: 0: 51593.3. Samples: 81839600. Policy #0 lag: (min: 2.0, avg: 47.9, max: 88.0) [2024-03-20 23:13:10,522][03784] Avg episode reward: [(0, '0.082')] [2024-03-20 23:13:15,521][03784] Fps is (10 sec: 32768.1, 60 sec: 50244.3, 300 sec: 51762.3). Total num frames: 80838656. Throughput: 0: 50840.1. Samples: 81990000. Policy #0 lag: (min: 2.0, avg: 47.9, max: 88.0) [2024-03-20 23:13:15,522][03784] Avg episode reward: [(0, '0.073')] [2024-03-20 23:13:20,380][04017] Updated weights for policy 0, policy_version 2472 (0.0011) [2024-03-20 23:13:20,521][03784] Fps is (10 sec: 22937.6, 60 sec: 48059.7, 300 sec: 51429.1). Total num frames: 81002496. Throughput: 0: 50962.2. Samples: 82317300. 
Policy #0 lag: (min: 0.0, avg: 29.1, max: 80.0) [2024-03-20 23:13:20,522][03784] Avg episode reward: [(0, '0.101')] [2024-03-20 23:13:25,521][03784] Fps is (10 sec: 39321.5, 60 sec: 50244.3, 300 sec: 51651.3). Total num frames: 81231872. Throughput: 0: 51471.1. Samples: 82649600. Policy #0 lag: (min: 0.0, avg: 31.0, max: 82.0) [2024-03-20 23:13:25,522][03784] Avg episode reward: [(0, '0.056')] [2024-03-20 23:13:26,350][04017] Updated weights for policy 0, policy_version 2482 (0.0018) [2024-03-20 23:13:30,521][03784] Fps is (10 sec: 52428.2, 60 sec: 54067.1, 300 sec: 52206.6). Total num frames: 81526784. Throughput: 0: 51695.4. Samples: 82798000. Policy #0 lag: (min: 0.0, avg: 38.8, max: 92.0) [2024-03-20 23:13:30,522][03784] Avg episode reward: [(0, '0.019')] [2024-03-20 23:13:31,870][04017] Updated weights for policy 0, policy_version 2492 (0.0015) [2024-03-20 23:13:35,521][03784] Fps is (10 sec: 68813.0, 60 sec: 55159.5, 300 sec: 52762.1). Total num frames: 81920000. Throughput: 0: 51680.0. Samples: 83094300. Policy #0 lag: (min: 1.0, avg: 35.5, max: 82.0) [2024-03-20 23:13:35,522][03784] Avg episode reward: [(0, '0.098')] [2024-03-20 23:13:36,719][04017] Updated weights for policy 0, policy_version 2502 (0.0012) [2024-03-20 23:13:40,523][03784] Fps is (10 sec: 49144.8, 60 sec: 51881.3, 300 sec: 51762.1). Total num frames: 82018304. Throughput: 0: 53066.8. Samples: 83447600. Policy #0 lag: (min: 0.0, avg: 39.3, max: 95.0) [2024-03-20 23:13:40,524][03784] Avg episode reward: [(0, '0.073')] [2024-03-20 23:13:45,200][04017] Updated weights for policy 0, policy_version 2512 (0.0017) [2024-03-20 23:13:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 51336.6, 300 sec: 51762.4). Total num frames: 82313216. Throughput: 0: 53413.5. Samples: 83618500. Policy #0 lag: (min: 3.0, avg: 32.3, max: 81.0) [2024-03-20 23:13:45,522][03784] Avg episode reward: [(0, '0.106')] [2024-03-20 23:13:49,666][03995] Signal inference workers to stop experience collection... 
(1700 times) [2024-03-20 23:13:49,744][03995] Signal inference workers to resume experience collection... (1700 times) [2024-03-20 23:13:49,787][04017] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-03-20 23:13:49,840][04017] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-03-20 23:13:50,129][04017] Updated weights for policy 0, policy_version 2522 (0.0014) [2024-03-20 23:13:50,521][03784] Fps is (10 sec: 65545.3, 60 sec: 50790.2, 300 sec: 51206.9). Total num frames: 82673664. Throughput: 0: 52719.7. Samples: 83912300. Policy #0 lag: (min: 0.0, avg: 38.0, max: 88.0) [2024-03-20 23:13:50,523][03784] Avg episode reward: [(0, '0.106')] [2024-03-20 23:13:53,693][04017] Updated weights for policy 0, policy_version 2532 (0.0013) [2024-03-20 23:13:55,521][03784] Fps is (10 sec: 81918.8, 60 sec: 53520.9, 300 sec: 51540.2). Total num frames: 83132416. Throughput: 0: 51833.3. Samples: 84172100. Policy #0 lag: (min: 2.0, avg: 51.4, max: 110.0) [2024-03-20 23:13:55,522][03784] Avg episode reward: [(0, '0.025')] [2024-03-20 23:14:00,521][03784] Fps is (10 sec: 58983.5, 60 sec: 49698.2, 300 sec: 51429.1). Total num frames: 83263488. Throughput: 0: 52386.6. Samples: 84347400. Policy #0 lag: (min: 0.0, avg: 46.2, max: 116.0) [2024-03-20 23:14:00,522][03784] Avg episode reward: [(0, '0.150')] [2024-03-20 23:14:00,624][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002542_83296256.pth... [2024-03-20 23:14:00,653][04017] Updated weights for policy 0, policy_version 2542 (0.0019) [2024-03-20 23:14:00,725][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002164_70909952.pth [2024-03-20 23:14:05,521][03784] Fps is (10 sec: 36045.1, 60 sec: 49698.1, 300 sec: 51318.0). Total num frames: 83492864. Throughput: 0: 51517.8. Samples: 84635600. 
Policy #0 lag: (min: 1.0, avg: 43.3, max: 79.0) [2024-03-20 23:14:05,522][03784] Avg episode reward: [(0, '0.022')] [2024-03-20 23:14:07,231][04017] Updated weights for policy 0, policy_version 2552 (0.0015) [2024-03-20 23:14:10,521][03784] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 51873.4). Total num frames: 83787776. Throughput: 0: 50460.0. Samples: 84920300. Policy #0 lag: (min: 1.0, avg: 38.7, max: 81.0) [2024-03-20 23:14:10,522][03784] Avg episode reward: [(0, '0.035')] [2024-03-20 23:14:13,437][04017] Updated weights for policy 0, policy_version 2562 (0.0017) [2024-03-20 23:14:15,521][03784] Fps is (10 sec: 55705.6, 60 sec: 53521.0, 300 sec: 52428.8). Total num frames: 84049920. Throughput: 0: 50615.7. Samples: 85075700. Policy #0 lag: (min: 0.0, avg: 42.3, max: 103.0) [2024-03-20 23:14:15,522][03784] Avg episode reward: [(0, '0.061')] [2024-03-20 23:14:20,521][03784] Fps is (10 sec: 42598.7, 60 sec: 53521.2, 300 sec: 52095.6). Total num frames: 84213760. Throughput: 0: 50231.1. Samples: 85354700. Policy #0 lag: (min: 0.0, avg: 34.1, max: 76.0) [2024-03-20 23:14:20,522][03784] Avg episode reward: [(0, '0.021')] [2024-03-20 23:14:23,384][04017] Updated weights for policy 0, policy_version 2572 (0.0016) [2024-03-20 23:14:25,521][03784] Fps is (10 sec: 26214.5, 60 sec: 51336.5, 300 sec: 51429.1). Total num frames: 84312064. Throughput: 0: 49075.1. Samples: 85655900. Policy #0 lag: (min: 0.0, avg: 30.4, max: 71.0) [2024-03-20 23:14:25,522][03784] Avg episode reward: [(0, '0.035')] [2024-03-20 23:14:30,165][04017] Updated weights for policy 0, policy_version 2582 (0.0010) [2024-03-20 23:14:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 51882.8, 300 sec: 51984.5). Total num frames: 84639744. Throughput: 0: 48975.5. Samples: 85822400. Policy #0 lag: (min: 1.0, avg: 27.6, max: 70.0) [2024-03-20 23:14:30,522][03784] Avg episode reward: [(0, '0.120')] [2024-03-20 23:14:35,521][03784] Fps is (10 sec: 45875.6, 60 sec: 47513.6, 300 sec: 51095.9). 
Total num frames: 84770816. Throughput: 0: 48767.0. Samples: 86106800. Policy #0 lag: (min: 3.0, avg: 40.1, max: 102.0) [2024-03-20 23:14:35,522][03784] Avg episode reward: [(0, '0.030')] [2024-03-20 23:14:38,093][04017] Updated weights for policy 0, policy_version 2592 (0.0010) [2024-03-20 23:14:40,521][03784] Fps is (10 sec: 32768.0, 60 sec: 49153.4, 300 sec: 50762.6). Total num frames: 84967424. Throughput: 0: 49809.1. Samples: 86413500. Policy #0 lag: (min: 0.0, avg: 23.1, max: 74.0) [2024-03-20 23:14:40,522][03784] Avg episode reward: [(0, '0.076')] [2024-03-20 23:14:42,738][03995] Signal inference workers to stop experience collection... (1750 times) [2024-03-20 23:14:42,794][04017] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-03-20 23:14:43,012][03995] Signal inference workers to resume experience collection... (1750 times) [2024-03-20 23:14:43,012][04017] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-03-20 23:14:45,219][04017] Updated weights for policy 0, policy_version 2602 (0.0019) [2024-03-20 23:14:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 50762.7). Total num frames: 85262336. Throughput: 0: 49291.3. Samples: 86565500. Policy #0 lag: (min: 1.0, avg: 32.6, max: 79.0) [2024-03-20 23:14:45,521][03784] Avg episode reward: [(0, '0.160')] [2024-03-20 23:14:49,458][04017] Updated weights for policy 0, policy_version 2612 (0.0017) [2024-03-20 23:14:50,521][03784] Fps is (10 sec: 68812.3, 60 sec: 49698.3, 300 sec: 51095.9). Total num frames: 85655552. Throughput: 0: 49046.7. Samples: 86842700. Policy #0 lag: (min: 3.0, avg: 31.1, max: 62.0) [2024-03-20 23:14:50,522][03784] Avg episode reward: [(0, '0.039')] [2024-03-20 23:14:55,521][03784] Fps is (10 sec: 62258.6, 60 sec: 45875.3, 300 sec: 50984.8). Total num frames: 85884928. Throughput: 0: 49313.4. Samples: 87139400. 
Policy #0 lag: (min: 1.0, avg: 35.6, max: 60.0) [2024-03-20 23:14:55,522][03784] Avg episode reward: [(0, '0.131')] [2024-03-20 23:14:55,597][04017] Updated weights for policy 0, policy_version 2622 (0.0014) [2024-03-20 23:15:00,040][04017] Updated weights for policy 0, policy_version 2632 (0.0016) [2024-03-20 23:15:00,521][03784] Fps is (10 sec: 62259.1, 60 sec: 50244.3, 300 sec: 51429.1). Total num frames: 86278144. Throughput: 0: 49066.7. Samples: 87283700. Policy #0 lag: (min: 3.0, avg: 45.5, max: 115.0) [2024-03-20 23:15:00,522][03784] Avg episode reward: [(0, '0.042')] [2024-03-20 23:15:04,930][04017] Updated weights for policy 0, policy_version 2642 (0.0010) [2024-03-20 23:15:05,521][03784] Fps is (10 sec: 72089.7, 60 sec: 51882.7, 300 sec: 51762.3). Total num frames: 86605824. Throughput: 0: 49051.1. Samples: 87562000. Policy #0 lag: (min: 1.0, avg: 54.5, max: 105.0) [2024-03-20 23:15:05,522][03784] Avg episode reward: [(0, '0.044')] [2024-03-20 23:15:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 51540.2). Total num frames: 86769664. Throughput: 0: 49028.9. Samples: 87862200. Policy #0 lag: (min: 1.0, avg: 40.4, max: 73.0) [2024-03-20 23:15:10,522][03784] Avg episode reward: [(0, '0.155')] [2024-03-20 23:15:15,521][03784] Fps is (10 sec: 29490.6, 60 sec: 47513.5, 300 sec: 51318.0). Total num frames: 86900736. Throughput: 0: 49024.2. Samples: 88028500. Policy #0 lag: (min: 0.0, avg: 38.0, max: 98.0) [2024-03-20 23:15:15,523][03784] Avg episode reward: [(0, '0.166')] [2024-03-20 23:15:15,542][04017] Updated weights for policy 0, policy_version 2652 (0.0018) [2024-03-20 23:15:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 49151.9, 300 sec: 51318.0). Total num frames: 87162880. Throughput: 0: 49506.5. Samples: 88334600. 
Policy #0 lag: (min: 0.0, avg: 33.2, max: 85.0) [2024-03-20 23:15:20,522][03784] Avg episode reward: [(0, '0.066')] [2024-03-20 23:15:21,847][04017] Updated weights for policy 0, policy_version 2662 (0.0018) [2024-03-20 23:15:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 50790.3, 300 sec: 51429.1). Total num frames: 87359488. Throughput: 0: 49457.6. Samples: 88639100. Policy #0 lag: (min: 2.0, avg: 37.5, max: 89.0) [2024-03-20 23:15:25,522][03784] Avg episode reward: [(0, '0.050')] [2024-03-20 23:15:29,383][03995] Signal inference workers to stop experience collection... (1800 times) [2024-03-20 23:15:29,424][04017] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-03-20 23:15:29,606][03995] Signal inference workers to resume experience collection... (1800 times) [2024-03-20 23:15:29,606][04017] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-03-20 23:15:29,608][04017] Updated weights for policy 0, policy_version 2672 (0.0010) [2024-03-20 23:15:30,521][03784] Fps is (10 sec: 42598.8, 60 sec: 49152.0, 300 sec: 51206.9). Total num frames: 87588864. Throughput: 0: 49453.3. Samples: 88790900. Policy #0 lag: (min: 0.0, avg: 32.2, max: 80.0) [2024-03-20 23:15:30,522][03784] Avg episode reward: [(0, '0.040')] [2024-03-20 23:15:35,381][04017] Updated weights for policy 0, policy_version 2682 (0.0010) [2024-03-20 23:15:35,521][03784] Fps is (10 sec: 52429.4, 60 sec: 51882.5, 300 sec: 51429.1). Total num frames: 87883776. Throughput: 0: 49808.9. Samples: 89084100. Policy #0 lag: (min: 0.0, avg: 28.1, max: 75.0) [2024-03-20 23:15:35,522][03784] Avg episode reward: [(0, '0.042')] [2024-03-20 23:15:40,521][03784] Fps is (10 sec: 55704.9, 60 sec: 52974.8, 300 sec: 51540.2). Total num frames: 88145920. Throughput: 0: 50122.1. Samples: 89394900. 
Policy #0 lag: (min: 1.0, avg: 32.7, max: 73.0) [2024-03-20 23:15:40,522][03784] Avg episode reward: [(0, '0.095')] [2024-03-20 23:15:41,113][04017] Updated weights for policy 0, policy_version 2692 (0.0020) [2024-03-20 23:15:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 51336.4, 300 sec: 51206.9). Total num frames: 88342528. Throughput: 0: 50677.7. Samples: 89564200. Policy #0 lag: (min: 0.0, avg: 32.1, max: 67.0) [2024-03-20 23:15:45,522][03784] Avg episode reward: [(0, '0.116')] [2024-03-20 23:15:50,278][04017] Updated weights for policy 0, policy_version 2702 (0.0010) [2024-03-20 23:15:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48059.7, 300 sec: 50984.8). Total num frames: 88539136. Throughput: 0: 51826.5. Samples: 89894200. Policy #0 lag: (min: 0.0, avg: 47.3, max: 105.0) [2024-03-20 23:15:50,522][03784] Avg episode reward: [(0, '0.168')] [2024-03-20 23:15:54,604][04017] Updated weights for policy 0, policy_version 2712 (0.0009) [2024-03-20 23:15:55,521][03784] Fps is (10 sec: 52429.3, 60 sec: 49698.1, 300 sec: 50873.7). Total num frames: 88866816. Throughput: 0: 52133.4. Samples: 90208200. Policy #0 lag: (min: 1.0, avg: 37.4, max: 79.0) [2024-03-20 23:15:55,522][03784] Avg episode reward: [(0, '0.045')] [2024-03-20 23:16:00,521][03784] Fps is (10 sec: 62260.3, 60 sec: 48059.8, 300 sec: 50873.7). Total num frames: 89161728. Throughput: 0: 51804.7. Samples: 90359700. Policy #0 lag: (min: 1.0, avg: 40.4, max: 76.0) [2024-03-20 23:16:00,521][03784] Avg episode reward: [(0, '0.045')] [2024-03-20 23:16:00,564][04017] Updated weights for policy 0, policy_version 2722 (0.0017) [2024-03-20 23:16:00,860][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002723_89227264.pth... [2024-03-20 23:16:00,982][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002354_77135872.pth [2024-03-20 23:16:05,521][03784] Fps is (10 sec: 55705.4, 60 sec: 46967.4, 300 sec: 50984.8). Total num frames: 89423872. 
Throughput: 0: 51791.1. Samples: 90665200. Policy #0 lag: (min: 0.0, avg: 48.5, max: 101.0) [2024-03-20 23:16:05,522][03784] Avg episode reward: [(0, '0.201')] [2024-03-20 23:16:05,522][03995] Saving new best policy, reward=0.201! [2024-03-20 23:16:06,620][04017] Updated weights for policy 0, policy_version 2732 (0.0015) [2024-03-20 23:16:10,521][03784] Fps is (10 sec: 52428.0, 60 sec: 48605.8, 300 sec: 50651.5). Total num frames: 89686016. Throughput: 0: 52211.2. Samples: 90988600. Policy #0 lag: (min: 0.0, avg: 40.6, max: 75.0) [2024-03-20 23:16:10,522][03784] Avg episode reward: [(0, '0.141')] [2024-03-20 23:16:12,259][04017] Updated weights for policy 0, policy_version 2742 (0.0037) [2024-03-20 23:16:14,764][03995] Signal inference workers to stop experience collection... (1850 times) [2024-03-20 23:16:14,821][03995] Signal inference workers to resume experience collection... (1850 times) [2024-03-20 23:16:14,851][04017] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-03-20 23:16:14,907][04017] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-03-20 23:16:15,521][03784] Fps is (10 sec: 68813.2, 60 sec: 53521.2, 300 sec: 51429.1). Total num frames: 90112000. Throughput: 0: 52160.0. Samples: 91138100. Policy #0 lag: (min: 1.0, avg: 40.6, max: 93.0) [2024-03-20 23:16:15,522][03784] Avg episode reward: [(0, '0.095')] [2024-03-20 23:16:16,302][04017] Updated weights for policy 0, policy_version 2752 (0.0015) [2024-03-20 23:16:20,521][03784] Fps is (10 sec: 62259.5, 60 sec: 52428.8, 300 sec: 51095.9). Total num frames: 90308608. Throughput: 0: 52626.7. Samples: 91452300. Policy #0 lag: (min: 1.0, avg: 42.5, max: 83.0) [2024-03-20 23:16:20,522][03784] Avg episode reward: [(0, '0.095')] [2024-03-20 23:16:24,328][04017] Updated weights for policy 0, policy_version 2763 (0.0015) [2024-03-20 23:16:25,521][03784] Fps is (10 sec: 49151.9, 60 sec: 54067.4, 300 sec: 51540.2). Total num frames: 90603520. Throughput: 0: 52715.6. 
Samples: 91767100. Policy #0 lag: (min: 0.0, avg: 38.6, max: 83.0) [2024-03-20 23:16:25,522][03784] Avg episode reward: [(0, '0.138')] [2024-03-20 23:16:30,521][03784] Fps is (10 sec: 49151.9, 60 sec: 53521.0, 300 sec: 51762.3). Total num frames: 90800128. Throughput: 0: 52682.3. Samples: 91934900. Policy #0 lag: (min: 0.0, avg: 43.0, max: 89.0) [2024-03-20 23:16:30,524][03784] Avg episode reward: [(0, '0.100')] [2024-03-20 23:16:31,513][04017] Updated weights for policy 0, policy_version 2773 (0.0010) [2024-03-20 23:16:35,521][03784] Fps is (10 sec: 36044.5, 60 sec: 51336.5, 300 sec: 52095.5). Total num frames: 90963968. Throughput: 0: 53031.1. Samples: 92280600. Policy #0 lag: (min: 0.0, avg: 35.0, max: 82.0) [2024-03-20 23:16:35,522][03784] Avg episode reward: [(0, '0.100')] [2024-03-20 23:16:40,521][03784] Fps is (10 sec: 36045.1, 60 sec: 50244.4, 300 sec: 51540.2). Total num frames: 91160576. Throughput: 0: 53575.6. Samples: 92619100. Policy #0 lag: (min: 0.0, avg: 29.6, max: 75.0) [2024-03-20 23:16:40,522][03784] Avg episode reward: [(0, '0.130')] [2024-03-20 23:16:42,755][04017] Updated weights for policy 0, policy_version 2783 (0.0019) [2024-03-20 23:16:45,521][03784] Fps is (10 sec: 36045.2, 60 sec: 49698.2, 300 sec: 51207.0). Total num frames: 91324416. Throughput: 0: 53824.4. Samples: 92781800. Policy #0 lag: (min: 0.0, avg: 34.0, max: 83.0) [2024-03-20 23:16:45,522][03784] Avg episode reward: [(0, '0.107')] [2024-03-20 23:16:49,546][04017] Updated weights for policy 0, policy_version 2793 (0.0037) [2024-03-20 23:16:50,521][03784] Fps is (10 sec: 45874.4, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 91619328. Throughput: 0: 54237.7. Samples: 93105900. 
Policy #0 lag: (min: 1.0, avg: 24.4, max: 59.0) [2024-03-20 23:16:50,531][03784] Avg episode reward: [(0, '0.107')] [2024-03-20 23:16:52,840][04017] Updated weights for policy 0, policy_version 2803 (0.0011) [2024-03-20 23:16:55,521][03784] Fps is (10 sec: 78642.5, 60 sec: 54067.2, 300 sec: 50762.6). Total num frames: 92110848. Throughput: 0: 52793.3. Samples: 93364300. Policy #0 lag: (min: 4.0, avg: 38.6, max: 81.0) [2024-03-20 23:16:55,522][03784] Avg episode reward: [(0, '0.101')] [2024-03-20 23:16:56,118][04017] Updated weights for policy 0, policy_version 2813 (0.0010) [2024-03-20 23:16:59,585][04017] Updated weights for policy 0, policy_version 2823 (0.0021) [2024-03-20 23:16:59,819][03995] Signal inference workers to stop experience collection... (1900 times) [2024-03-20 23:16:59,871][04017] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-03-20 23:16:59,939][03995] Signal inference workers to resume experience collection... (1900 times) [2024-03-20 23:16:59,940][04017] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-03-20 23:17:00,521][03784] Fps is (10 sec: 95028.5, 60 sec: 56797.8, 300 sec: 51540.2). Total num frames: 92569600. Throughput: 0: 52026.7. Samples: 93479300. Policy #0 lag: (min: 1.0, avg: 49.3, max: 87.0) [2024-03-20 23:17:00,522][03784] Avg episode reward: [(0, '0.101')] [2024-03-20 23:17:05,098][04017] Updated weights for policy 0, policy_version 2833 (0.0046) [2024-03-20 23:17:05,521][03784] Fps is (10 sec: 75366.3, 60 sec: 57344.0, 300 sec: 51762.3). Total num frames: 92864512. Throughput: 0: 51959.9. Samples: 93790500. Policy #0 lag: (min: 1.0, avg: 39.0, max: 66.0) [2024-03-20 23:17:05,522][03784] Avg episode reward: [(0, '0.109')] [2024-03-20 23:17:10,521][03784] Fps is (10 sec: 52427.6, 60 sec: 56797.7, 300 sec: 51762.3). Total num frames: 93093888. Throughput: 0: 51373.1. Samples: 94078900. 
Policy #0 lag: (min: 1.0, avg: 49.5, max: 85.0) [2024-03-20 23:17:10,522][03784] Avg episode reward: [(0, '0.145')] [2024-03-20 23:17:13,221][04017] Updated weights for policy 0, policy_version 2843 (0.0011) [2024-03-20 23:17:15,521][03784] Fps is (10 sec: 29490.9, 60 sec: 50790.2, 300 sec: 50984.8). Total num frames: 93159424. Throughput: 0: 50964.3. Samples: 94228300. Policy #0 lag: (min: 0.0, avg: 54.8, max: 105.0) [2024-03-20 23:17:15,522][03784] Avg episode reward: [(0, '0.081')] [2024-03-20 23:17:20,521][03784] Fps is (10 sec: 22938.2, 60 sec: 50244.3, 300 sec: 51206.9). Total num frames: 93323264. Throughput: 0: 49377.9. Samples: 94502600. Policy #0 lag: (min: 0.0, avg: 37.2, max: 78.0) [2024-03-20 23:17:20,521][03784] Avg episode reward: [(0, '0.048')] [2024-03-20 23:17:24,233][04017] Updated weights for policy 0, policy_version 2853 (0.0013) [2024-03-20 23:17:25,521][03784] Fps is (10 sec: 36045.7, 60 sec: 48605.9, 300 sec: 51651.3). Total num frames: 93519872. Throughput: 0: 48315.6. Samples: 94793300. Policy #0 lag: (min: 0.0, avg: 42.3, max: 85.0) [2024-03-20 23:17:25,522][03784] Avg episode reward: [(0, '0.090')] [2024-03-20 23:17:30,521][03784] Fps is (10 sec: 39321.1, 60 sec: 48605.8, 300 sec: 51206.9). Total num frames: 93716480. Throughput: 0: 48402.1. Samples: 94959900. Policy #0 lag: (min: 1.0, avg: 29.3, max: 73.0) [2024-03-20 23:17:30,522][03784] Avg episode reward: [(0, '0.055')] [2024-03-20 23:17:31,374][04017] Updated weights for policy 0, policy_version 2863 (0.0011) [2024-03-20 23:17:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 50244.4, 300 sec: 51095.9). Total num frames: 93978624. Throughput: 0: 47702.4. Samples: 95252500. Policy #0 lag: (min: 0.0, avg: 23.4, max: 80.0) [2024-03-20 23:17:35,522][03784] Avg episode reward: [(0, '0.157')] [2024-03-20 23:17:39,088][04017] Updated weights for policy 0, policy_version 2873 (0.0014) [2024-03-20 23:17:40,521][03784] Fps is (10 sec: 52429.9, 60 sec: 51336.6, 300 sec: 50873.7). 
Total num frames: 94240768. Throughput: 0: 48473.5. Samples: 95545600. Policy #0 lag: (min: 2.0, avg: 41.8, max: 91.0) [2024-03-20 23:17:40,521][03784] Avg episode reward: [(0, '0.136')] [2024-03-20 23:17:45,320][04017] Updated weights for policy 0, policy_version 2883 (0.0009) [2024-03-20 23:17:45,521][03784] Fps is (10 sec: 49152.0, 60 sec: 52428.8, 300 sec: 50318.3). Total num frames: 94470144. Throughput: 0: 49024.5. Samples: 95685400. Policy #0 lag: (min: 0.0, avg: 27.8, max: 78.0) [2024-03-20 23:17:45,522][03784] Avg episode reward: [(0, '0.152')] [2024-03-20 23:17:49,350][04017] Updated weights for policy 0, policy_version 2893 (0.0020) [2024-03-20 23:17:50,521][03784] Fps is (10 sec: 58981.4, 60 sec: 53521.1, 300 sec: 50540.5). Total num frames: 94830592. Throughput: 0: 48146.7. Samples: 95957100. Policy #0 lag: (min: 1.0, avg: 34.0, max: 70.0) [2024-03-20 23:17:50,522][03784] Avg episode reward: [(0, '0.183')] [2024-03-20 23:17:54,977][03995] Signal inference workers to stop experience collection... (1950 times) [2024-03-20 23:17:55,022][04017] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-03-20 23:17:55,196][03995] Signal inference workers to resume experience collection... (1950 times) [2024-03-20 23:17:55,196][04017] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-03-20 23:17:55,521][03784] Fps is (10 sec: 52428.2, 60 sec: 48059.7, 300 sec: 49874.0). Total num frames: 94994432. Throughput: 0: 48484.6. Samples: 96260700. Policy #0 lag: (min: 0.0, avg: 32.7, max: 65.0) [2024-03-20 23:17:55,522][03784] Avg episode reward: [(0, '0.075')] [2024-03-20 23:17:56,563][04017] Updated weights for policy 0, policy_version 2903 (0.0010) [2024-03-20 23:18:00,521][03784] Fps is (10 sec: 42598.7, 60 sec: 44782.9, 300 sec: 49985.1). Total num frames: 95256576. Throughput: 0: 48229.1. Samples: 96398600. 
Policy #0 lag: (min: 0.0, avg: 38.9, max: 90.0) [2024-03-20 23:18:00,521][03784] Avg episode reward: [(0, '0.177')] [2024-03-20 23:18:00,923][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002909_95322112.pth... [2024-03-20 23:18:01,055][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002542_83296256.pth [2024-03-20 23:18:02,623][04017] Updated weights for policy 0, policy_version 2913 (0.0011) [2024-03-20 23:18:05,521][03784] Fps is (10 sec: 72090.7, 60 sec: 47513.7, 300 sec: 50651.6). Total num frames: 95715328. Throughput: 0: 47986.7. Samples: 96662000. Policy #0 lag: (min: 2.0, avg: 52.7, max: 114.0) [2024-03-20 23:18:05,522][03784] Avg episode reward: [(0, '0.060')] [2024-03-20 23:18:07,931][04017] Updated weights for policy 0, policy_version 2923 (0.0017) [2024-03-20 23:18:10,521][03784] Fps is (10 sec: 58982.4, 60 sec: 45875.4, 300 sec: 50873.7). Total num frames: 95846400. Throughput: 0: 48597.7. Samples: 96980200. Policy #0 lag: (min: 2.0, avg: 52.7, max: 114.0) [2024-03-20 23:18:10,522][03784] Avg episode reward: [(0, '0.060')] [2024-03-20 23:18:15,521][03784] Fps is (10 sec: 32767.7, 60 sec: 48059.9, 300 sec: 50984.8). Total num frames: 96043008. Throughput: 0: 48024.5. Samples: 97121000. Policy #0 lag: (min: 0.0, avg: 28.6, max: 65.0) [2024-03-20 23:18:15,522][03784] Avg episode reward: [(0, '0.192')] [2024-03-20 23:18:16,446][04017] Updated weights for policy 0, policy_version 2933 (0.0010) [2024-03-20 23:18:20,521][03784] Fps is (10 sec: 36044.4, 60 sec: 48059.6, 300 sec: 50762.6). Total num frames: 96206848. Throughput: 0: 48246.5. Samples: 97423600. Policy #0 lag: (min: 0.0, avg: 29.7, max: 75.0) [2024-03-20 23:18:20,522][03784] Avg episode reward: [(0, '0.050')] [2024-03-20 23:18:23,711][04017] Updated weights for policy 0, policy_version 2943 (0.0012) [2024-03-20 23:18:25,521][03784] Fps is (10 sec: 39322.0, 60 sec: 48605.9, 300 sec: 50540.5). Total num frames: 96436224. 
Throughput: 0: 47386.6. Samples: 97678000. Policy #0 lag: (min: 0.0, avg: 29.7, max: 75.0) [2024-03-20 23:18:25,522][03784] Avg episode reward: [(0, '0.223')] [2024-03-20 23:18:25,523][03995] Saving new best policy, reward=0.223! [2024-03-20 23:18:30,521][03784] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 96731136. Throughput: 0: 46962.2. Samples: 97798700. Policy #0 lag: (min: 0.0, avg: 27.3, max: 69.0) [2024-03-20 23:18:30,522][03784] Avg episode reward: [(0, '0.133')] [2024-03-20 23:18:30,667][04017] Updated weights for policy 0, policy_version 2953 (0.0032) [2024-03-20 23:18:35,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 50429.7). Total num frames: 96894976. Throughput: 0: 46493.4. Samples: 98049300. Policy #0 lag: (min: 3.0, avg: 32.9, max: 84.0) [2024-03-20 23:18:35,522][03784] Avg episode reward: [(0, '0.198')] [2024-03-20 23:18:38,605][04017] Updated weights for policy 0, policy_version 2963 (0.0018) [2024-03-20 23:18:40,521][03784] Fps is (10 sec: 52428.5, 60 sec: 50244.1, 300 sec: 50651.5). Total num frames: 97255424. Throughput: 0: 45766.7. Samples: 98320200. Policy #0 lag: (min: 3.0, avg: 46.6, max: 110.0) [2024-03-20 23:18:40,522][03784] Avg episode reward: [(0, '0.102')] [2024-03-20 23:18:45,390][03995] Signal inference workers to stop experience collection... (2000 times) [2024-03-20 23:18:45,451][04017] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-03-20 23:18:45,521][03784] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 49874.1). Total num frames: 97386496. Throughput: 0: 46460.0. Samples: 98489300. Policy #0 lag: (min: 0.0, avg: 43.9, max: 99.0) [2024-03-20 23:18:45,522][03784] Avg episode reward: [(0, '0.187')] [2024-03-20 23:18:45,620][03995] Signal inference workers to resume experience collection... 
(2000 times) [2024-03-20 23:18:45,621][04017] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-03-20 23:18:45,630][04017] Updated weights for policy 0, policy_version 2973 (0.0020) [2024-03-20 23:18:50,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 48985.4). Total num frames: 97583104. Throughput: 0: 47888.8. Samples: 98817000. Policy #0 lag: (min: 1.0, avg: 37.6, max: 74.0) [2024-03-20 23:18:50,522][03784] Avg episode reward: [(0, '0.187')] [2024-03-20 23:18:53,834][04017] Updated weights for policy 0, policy_version 2983 (0.0014) [2024-03-20 23:18:55,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46421.4, 300 sec: 49207.6). Total num frames: 97779712. Throughput: 0: 47744.5. Samples: 99128700. Policy #0 lag: (min: 1.0, avg: 29.2, max: 75.0) [2024-03-20 23:18:55,522][03784] Avg episode reward: [(0, '0.095')] [2024-03-20 23:19:00,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 49096.5). Total num frames: 97976320. Throughput: 0: 47935.6. Samples: 99278100. Policy #0 lag: (min: 0.0, avg: 27.4, max: 60.0) [2024-03-20 23:19:00,522][03784] Avg episode reward: [(0, '0.087')] [2024-03-20 23:19:03,216][04017] Updated weights for policy 0, policy_version 2993 (0.0010) [2024-03-20 23:19:05,521][03784] Fps is (10 sec: 36045.1, 60 sec: 40413.9, 300 sec: 48652.2). Total num frames: 98140160. Throughput: 0: 48538.0. Samples: 99607800. Policy #0 lag: (min: 0.0, avg: 27.7, max: 63.0) [2024-03-20 23:19:05,521][03784] Avg episode reward: [(0, '0.094')] [2024-03-20 23:19:09,113][04017] Updated weights for policy 0, policy_version 3003 (0.0029) [2024-03-20 23:19:10,521][03784] Fps is (10 sec: 55705.2, 60 sec: 44782.9, 300 sec: 49096.5). Total num frames: 98533376. Throughput: 0: 48793.2. Samples: 99873700. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 67.0) [2024-03-20 23:19:10,522][03784] Avg episode reward: [(0, '0.217')] [2024-03-20 23:19:12,102][04017] Updated weights for policy 0, policy_version 3013 (0.0017) [2024-03-20 23:19:15,521][03784] Fps is (10 sec: 88472.1, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 99024896. Throughput: 0: 48611.1. Samples: 99986200. Policy #0 lag: (min: 0.0, avg: 49.8, max: 125.0) [2024-03-20 23:19:15,522][03784] Avg episode reward: [(0, '0.068')] [2024-03-20 23:19:15,805][04017] Updated weights for policy 0, policy_version 3023 (0.0013) [2024-03-20 23:19:20,521][03784] Fps is (10 sec: 72090.2, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 99254272. Throughput: 0: 49924.5. Samples: 100295900. Policy #0 lag: (min: 1.0, avg: 38.0, max: 82.0) [2024-03-20 23:19:20,522][03784] Avg episode reward: [(0, '0.068')] [2024-03-20 23:19:22,745][04017] Updated weights for policy 0, policy_version 3033 (0.0011) [2024-03-20 23:19:25,521][03784] Fps is (10 sec: 42598.7, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 99450880. Throughput: 0: 51442.4. Samples: 100635100. Policy #0 lag: (min: 1.0, avg: 38.0, max: 82.0) [2024-03-20 23:19:25,522][03784] Avg episode reward: [(0, '0.147')] [2024-03-20 23:19:29,393][03995] Signal inference workers to stop experience collection... (2050 times) [2024-03-20 23:19:29,446][04017] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-03-20 23:19:29,659][03995] Signal inference workers to resume experience collection... (2050 times) [2024-03-20 23:19:29,659][04017] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-03-20 23:19:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 49152.0, 300 sec: 50540.5). Total num frames: 99680256. Throughput: 0: 51064.4. Samples: 100787200. 
Policy #0 lag: (min: 0.0, avg: 40.4, max: 81.0) [2024-03-20 23:19:30,525][03784] Avg episode reward: [(0, '0.147')] [2024-03-20 23:19:33,686][04017] Updated weights for policy 0, policy_version 3043 (0.0011) [2024-03-20 23:19:35,521][03784] Fps is (10 sec: 32768.1, 60 sec: 48059.8, 300 sec: 50207.2). Total num frames: 99778560. Throughput: 0: 50649.0. Samples: 101096200. Policy #0 lag: (min: 0.0, avg: 42.7, max: 95.0) [2024-03-20 23:19:35,522][03784] Avg episode reward: [(0, '0.238')] [2024-03-20 23:19:35,522][03995] Saving new best policy, reward=0.238! [2024-03-20 23:19:40,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45329.1, 300 sec: 49874.0). Total num frames: 99975168. Throughput: 0: 50519.9. Samples: 101402100. Policy #0 lag: (min: 1.0, avg: 30.6, max: 71.0) [2024-03-20 23:19:40,522][03784] Avg episode reward: [(0, '0.190')] [2024-03-20 23:19:41,261][04017] Updated weights for policy 0, policy_version 3053 (0.0015) [2024-03-20 23:19:45,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46421.3, 300 sec: 49207.5). Total num frames: 100171776. Throughput: 0: 50600.0. Samples: 101555100. Policy #0 lag: (min: 0.0, avg: 33.7, max: 76.0) [2024-03-20 23:19:45,522][03784] Avg episode reward: [(0, '0.103')] [2024-03-20 23:19:47,639][04017] Updated weights for policy 0, policy_version 3063 (0.0014) [2024-03-20 23:19:50,521][03784] Fps is (10 sec: 65535.9, 60 sec: 50790.4, 300 sec: 49985.1). Total num frames: 100630528. Throughput: 0: 49802.0. Samples: 101848900. Policy #0 lag: (min: 2.0, avg: 38.9, max: 99.0) [2024-03-20 23:19:50,522][03784] Avg episode reward: [(0, '0.171')] [2024-03-20 23:19:51,391][04017] Updated weights for policy 0, policy_version 3073 (0.0013) [2024-03-20 23:19:55,521][03784] Fps is (10 sec: 78643.5, 60 sec: 52974.9, 300 sec: 49762.9). Total num frames: 100958208. Throughput: 0: 49951.2. Samples: 102121500. 
Policy #0 lag: (min: 3.0, avg: 45.4, max: 89.0) [2024-03-20 23:19:55,522][03784] Avg episode reward: [(0, '0.274')] [2024-03-20 23:19:55,523][03995] Saving new best policy, reward=0.274! [2024-03-20 23:20:00,503][04017] Updated weights for policy 0, policy_version 3083 (0.0020) [2024-03-20 23:20:00,521][03784] Fps is (10 sec: 39322.3, 60 sec: 50790.5, 300 sec: 48874.3). Total num frames: 101023744. Throughput: 0: 51395.8. Samples: 102299000. Policy #0 lag: (min: 0.0, avg: 37.7, max: 86.0) [2024-03-20 23:20:00,521][03784] Avg episode reward: [(0, '0.241')] [2024-03-20 23:20:00,736][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003084_101056512.pth... [2024-03-20 23:20:00,852][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002723_89227264.pth [2024-03-20 23:20:05,521][03784] Fps is (10 sec: 36044.7, 60 sec: 52974.8, 300 sec: 49318.6). Total num frames: 101318656. Throughput: 0: 51757.7. Samples: 102625000. Policy #0 lag: (min: 2.0, avg: 38.5, max: 88.0) [2024-03-20 23:20:05,522][03784] Avg episode reward: [(0, '0.355')] [2024-03-20 23:20:05,523][03995] Saving new best policy, reward=0.355! [2024-03-20 23:20:06,096][04017] Updated weights for policy 0, policy_version 3093 (0.0015) [2024-03-20 23:20:10,241][04017] Updated weights for policy 0, policy_version 3103 (0.0022) [2024-03-20 23:20:10,521][03784] Fps is (10 sec: 65534.2, 60 sec: 52428.7, 300 sec: 50096.2). Total num frames: 101679104. Throughput: 0: 49948.7. Samples: 102882800. Policy #0 lag: (min: 1.0, avg: 42.8, max: 72.0) [2024-03-20 23:20:10,522][03784] Avg episode reward: [(0, '0.164')] [2024-03-20 23:20:15,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46421.3, 300 sec: 49651.9). Total num frames: 101810176. Throughput: 0: 50004.4. Samples: 103037400. 
Policy #0 lag: (min: 1.0, avg: 42.8, max: 72.0) [2024-03-20 23:20:15,522][03784] Avg episode reward: [(0, '0.164')] [2024-03-20 23:20:18,547][04017] Updated weights for policy 0, policy_version 3113 (0.0013) [2024-03-20 23:20:18,580][03995] Signal inference workers to stop experience collection... (2100 times) [2024-03-20 23:20:18,650][04017] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-03-20 23:20:18,839][03995] Signal inference workers to resume experience collection... (2100 times) [2024-03-20 23:20:18,839][04017] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-03-20 23:20:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48059.6, 300 sec: 50096.2). Total num frames: 102137856. Throughput: 0: 50024.2. Samples: 103347300. Policy #0 lag: (min: 0.0, avg: 49.3, max: 100.0) [2024-03-20 23:20:20,522][03784] Avg episode reward: [(0, '0.129')] [2024-03-20 23:20:24,083][04017] Updated weights for policy 0, policy_version 3123 (0.0017) [2024-03-20 23:20:25,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48059.8, 300 sec: 49985.1). Total num frames: 102334464. Throughput: 0: 50377.9. Samples: 103669100. Policy #0 lag: (min: 0.0, avg: 37.5, max: 80.0) [2024-03-20 23:20:25,522][03784] Avg episode reward: [(0, '0.165')] [2024-03-20 23:20:30,521][03784] Fps is (10 sec: 45875.9, 60 sec: 48605.9, 300 sec: 49874.0). Total num frames: 102596608. Throughput: 0: 50397.8. Samples: 103823000. Policy #0 lag: (min: 1.0, avg: 38.6, max: 80.0) [2024-03-20 23:20:30,522][03784] Avg episode reward: [(0, '0.165')] [2024-03-20 23:20:32,819][04017] Updated weights for policy 0, policy_version 3133 (0.0010) [2024-03-20 23:20:35,521][03784] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 49874.0). Total num frames: 102858752. Throughput: 0: 50100.1. Samples: 104103400. 
Policy #0 lag: (min: 0.0, avg: 39.3, max: 84.0) [2024-03-20 23:20:35,522][03784] Avg episode reward: [(0, '0.367')] [2024-03-20 23:20:35,523][03995] Saving new best policy, reward=0.367! [2024-03-20 23:20:38,369][04017] Updated weights for policy 0, policy_version 3143 (0.0011) [2024-03-20 23:20:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 50244.3, 300 sec: 49651.9). Total num frames: 102989824. Throughput: 0: 50126.6. Samples: 104377200. Policy #0 lag: (min: 0.0, avg: 39.3, max: 84.0) [2024-03-20 23:20:40,522][03784] Avg episode reward: [(0, '0.218')] [2024-03-20 23:20:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 51882.7, 300 sec: 49985.1). Total num frames: 103284736. Throughput: 0: 48955.4. Samples: 104502000. Policy #0 lag: (min: 0.0, avg: 47.5, max: 95.0) [2024-03-20 23:20:45,522][03784] Avg episode reward: [(0, '0.196')] [2024-03-20 23:20:45,645][04017] Updated weights for policy 0, policy_version 3153 (0.0011) [2024-03-20 23:20:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48059.8, 300 sec: 49651.9). Total num frames: 103514112. Throughput: 0: 48115.6. Samples: 104790200. Policy #0 lag: (min: 0.0, avg: 47.5, max: 95.0) [2024-03-20 23:20:50,522][03784] Avg episode reward: [(0, '0.093')] [2024-03-20 23:20:52,499][04017] Updated weights for policy 0, policy_version 3163 (0.0009) [2024-03-20 23:20:55,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 49318.6). Total num frames: 103710720. Throughput: 0: 49502.4. Samples: 105110400. Policy #0 lag: (min: 0.0, avg: 43.9, max: 93.0) [2024-03-20 23:20:55,522][03784] Avg episode reward: [(0, '0.093')] [2024-03-20 23:21:00,521][03784] Fps is (10 sec: 29491.1, 60 sec: 46421.2, 300 sec: 48763.2). Total num frames: 103809024. Throughput: 0: 49444.5. Samples: 105262400. 
Policy #0 lag: (min: 0.0, avg: 41.3, max: 89.0) [2024-03-20 23:21:00,522][03784] Avg episode reward: [(0, '0.172')] [2024-03-20 23:21:04,665][04017] Updated weights for policy 0, policy_version 3173 (0.0015) [2024-03-20 23:21:05,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45329.0, 300 sec: 48652.2). Total num frames: 104038400. Throughput: 0: 49326.7. Samples: 105567000. Policy #0 lag: (min: 0.0, avg: 30.3, max: 74.0) [2024-03-20 23:21:05,522][03784] Avg episode reward: [(0, '0.110')] [2024-03-20 23:21:08,726][04017] Updated weights for policy 0, policy_version 3183 (0.0010) [2024-03-20 23:21:10,521][03784] Fps is (10 sec: 65536.0, 60 sec: 46421.4, 300 sec: 48652.1). Total num frames: 104464384. Throughput: 0: 48364.4. Samples: 105845500. Policy #0 lag: (min: 0.0, avg: 32.2, max: 84.0) [2024-03-20 23:21:10,522][03784] Avg episode reward: [(0, '0.124')] [2024-03-20 23:21:12,400][04017] Updated weights for policy 0, policy_version 3193 (0.0013) [2024-03-20 23:21:15,521][03784] Fps is (10 sec: 75366.2, 60 sec: 49698.1, 300 sec: 49096.4). Total num frames: 104792064. Throughput: 0: 47768.8. Samples: 105972600. Policy #0 lag: (min: 6.0, avg: 50.0, max: 103.0) [2024-03-20 23:21:15,522][03784] Avg episode reward: [(0, '0.057')] [2024-03-20 23:21:16,123][03995] Signal inference workers to stop experience collection... (2150 times) [2024-03-20 23:21:16,193][04017] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-03-20 23:21:16,347][03995] Signal inference workers to resume experience collection... (2150 times) [2024-03-20 23:21:16,347][04017] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-03-20 23:21:16,966][04017] Updated weights for policy 0, policy_version 3203 (0.0015) [2024-03-20 23:21:20,521][03784] Fps is (10 sec: 72089.4, 60 sec: 50790.5, 300 sec: 49429.7). Total num frames: 105185280. Throughput: 0: 47842.2. Samples: 106256300. 
Policy #0 lag: (min: 0.0, avg: 46.1, max: 97.0) [2024-03-20 23:21:20,522][03784] Avg episode reward: [(0, '0.057')] [2024-03-20 23:21:24,372][04017] Updated weights for policy 0, policy_version 3213 (0.0014) [2024-03-20 23:21:25,521][03784] Fps is (10 sec: 55706.0, 60 sec: 50244.2, 300 sec: 49318.6). Total num frames: 105349120. Throughput: 0: 48264.4. Samples: 106549100. Policy #0 lag: (min: 0.0, avg: 46.8, max: 103.0) [2024-03-20 23:21:25,522][03784] Avg episode reward: [(0, '0.130')] [2024-03-20 23:21:30,313][04017] Updated weights for policy 0, policy_version 3223 (0.0016) [2024-03-20 23:21:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 50244.2, 300 sec: 49651.9). Total num frames: 105611264. Throughput: 0: 47884.4. Samples: 106656800. Policy #0 lag: (min: 1.0, avg: 39.7, max: 69.0) [2024-03-20 23:21:30,522][03784] Avg episode reward: [(0, '0.198')] [2024-03-20 23:21:35,189][04017] Updated weights for policy 0, policy_version 3233 (0.0012) [2024-03-20 23:21:35,521][03784] Fps is (10 sec: 62259.2, 60 sec: 51882.6, 300 sec: 50207.2). Total num frames: 105971712. Throughput: 0: 46508.8. Samples: 106883100. Policy #0 lag: (min: 0.0, avg: 48.6, max: 101.0) [2024-03-20 23:21:35,522][03784] Avg episode reward: [(0, '0.223')] [2024-03-20 23:21:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 105971712. Throughput: 0: 46524.4. Samples: 107204000. Policy #0 lag: (min: 0.0, avg: 48.6, max: 101.0) [2024-03-20 23:21:40,522][03784] Avg episode reward: [(0, '0.293')] [2024-03-20 23:21:45,521][03784] Fps is (10 sec: 3276.8, 60 sec: 45329.1, 300 sec: 48763.3). Total num frames: 106004480. Throughput: 0: 46851.2. Samples: 107370700. Policy #0 lag: (min: 0.0, avg: 48.7, max: 88.0) [2024-03-20 23:21:45,522][03784] Avg episode reward: [(0, '0.241')] [2024-03-20 23:21:50,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44782.9, 300 sec: 47763.5). Total num frames: 106201088. Throughput: 0: 46762.2. Samples: 107671300. 
Policy #0 lag: (min: 0.0, avg: 22.2, max: 66.0) [2024-03-20 23:21:50,522][03784] Avg episode reward: [(0, '0.144')] [2024-03-20 23:21:51,419][04017] Updated weights for policy 0, policy_version 3243 (0.0020) [2024-03-20 23:21:55,521][03784] Fps is (10 sec: 55705.2, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 106561536. Throughput: 0: 47011.1. Samples: 107961000. Policy #0 lag: (min: 2.0, avg: 28.1, max: 80.0) [2024-03-20 23:21:55,522][03784] Avg episode reward: [(0, '0.300')] [2024-03-20 23:21:55,788][04017] Updated weights for policy 0, policy_version 3253 (0.0018) [2024-03-20 23:22:00,521][03784] Fps is (10 sec: 52429.1, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 106725376. Throughput: 0: 47380.1. Samples: 108104700. Policy #0 lag: (min: 0.0, avg: 39.5, max: 95.0) [2024-03-20 23:22:00,522][03784] Avg episode reward: [(0, '0.178')] [2024-03-20 23:22:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003257_106725376.pth... [2024-03-20 23:22:00,695][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000002909_95322112.pth [2024-03-20 23:22:03,527][04017] Updated weights for policy 0, policy_version 3263 (0.0015) [2024-03-20 23:22:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 49152.0, 300 sec: 47097.1). Total num frames: 106987520. Throughput: 0: 47560.0. Samples: 108396500. Policy #0 lag: (min: 2.0, avg: 32.6, max: 56.0) [2024-03-20 23:22:05,522][03784] Avg episode reward: [(0, '0.135')] [2024-03-20 23:22:10,456][04017] Updated weights for policy 0, policy_version 3273 (0.0016) [2024-03-20 23:22:10,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.3, 300 sec: 47763.6). Total num frames: 107249664. Throughput: 0: 47806.7. Samples: 108700400. Policy #0 lag: (min: 0.0, avg: 32.5, max: 86.0) [2024-03-20 23:22:10,522][03784] Avg episode reward: [(0, '0.166')] [2024-03-20 23:22:13,354][03995] Signal inference workers to stop experience collection... 
(2200 times) [2024-03-20 23:22:13,410][04017] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-03-20 23:22:13,483][03995] Signal inference workers to resume experience collection... (2200 times) [2024-03-20 23:22:13,484][04017] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-03-20 23:22:14,499][04017] Updated weights for policy 0, policy_version 3283 (0.0015) [2024-03-20 23:22:15,521][03784] Fps is (10 sec: 65535.9, 60 sec: 47513.6, 300 sec: 48541.1). Total num frames: 107642880. Throughput: 0: 48291.1. Samples: 108829900. Policy #0 lag: (min: 0.0, avg: 26.7, max: 49.0) [2024-03-20 23:22:15,522][03784] Avg episode reward: [(0, '0.194')] [2024-03-20 23:22:18,350][04017] Updated weights for policy 0, policy_version 3293 (0.0014) [2024-03-20 23:22:20,521][03784] Fps is (10 sec: 85197.0, 60 sec: 48605.9, 300 sec: 49429.7). Total num frames: 108101632. Throughput: 0: 49420.0. Samples: 109107000. Policy #0 lag: (min: 5.0, avg: 56.6, max: 114.0) [2024-03-20 23:22:20,522][03784] Avg episode reward: [(0, '0.087')] [2024-03-20 23:22:21,601][04017] Updated weights for policy 0, policy_version 3303 (0.0018) [2024-03-20 23:22:25,521][03784] Fps is (10 sec: 68813.3, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 108331008. Throughput: 0: 48817.9. Samples: 109400800. Policy #0 lag: (min: 5.0, avg: 56.6, max: 114.0) [2024-03-20 23:22:25,522][03784] Avg episode reward: [(0, '0.287')] [2024-03-20 23:22:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.5, 300 sec: 48985.4). Total num frames: 108429312. Throughput: 0: 48497.7. Samples: 109553100. Policy #0 lag: (min: 0.0, avg: 53.8, max: 108.0) [2024-03-20 23:22:30,522][03784] Avg episode reward: [(0, '0.098')] [2024-03-20 23:22:34,441][04017] Updated weights for policy 0, policy_version 3313 (0.0017) [2024-03-20 23:22:35,521][03784] Fps is (10 sec: 29490.9, 60 sec: 44236.8, 300 sec: 48763.2). Total num frames: 108625920. Throughput: 0: 48302.3. Samples: 109844900. 
Policy #0 lag: (min: 0.0, avg: 46.3, max: 107.0) [2024-03-20 23:22:35,522][03784] Avg episode reward: [(0, '0.159')] [2024-03-20 23:22:40,399][04017] Updated weights for policy 0, policy_version 3323 (0.0023) [2024-03-20 23:22:40,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 48874.3). Total num frames: 108888064. Throughput: 0: 47500.0. Samples: 110098500. Policy #0 lag: (min: 0.0, avg: 36.0, max: 72.0) [2024-03-20 23:22:40,522][03784] Avg episode reward: [(0, '0.188')] [2024-03-20 23:22:45,521][03784] Fps is (10 sec: 42598.8, 60 sec: 50790.4, 300 sec: 48207.9). Total num frames: 109051904. Throughput: 0: 47597.8. Samples: 110246600. Policy #0 lag: (min: 0.0, avg: 39.2, max: 102.0) [2024-03-20 23:22:45,522][03784] Avg episode reward: [(0, '0.172')] [2024-03-20 23:22:50,521][03784] Fps is (10 sec: 29491.1, 60 sec: 49698.1, 300 sec: 48096.8). Total num frames: 109182976. Throughput: 0: 47893.3. Samples: 110551700. Policy #0 lag: (min: 0.0, avg: 38.4, max: 86.0) [2024-03-20 23:22:50,522][03784] Avg episode reward: [(0, '0.211')] [2024-03-20 23:22:50,752][04017] Updated weights for policy 0, policy_version 3333 (0.0010) [2024-03-20 23:22:55,521][03784] Fps is (10 sec: 36044.5, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 109412352. Throughput: 0: 47184.4. Samples: 110823700. Policy #0 lag: (min: 2.0, avg: 26.4, max: 65.0) [2024-03-20 23:22:55,522][03784] Avg episode reward: [(0, '0.044')] [2024-03-20 23:22:57,504][04017] Updated weights for policy 0, policy_version 3343 (0.0025) [2024-03-20 23:23:00,521][03784] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 47430.3). Total num frames: 109707264. Throughput: 0: 47453.4. Samples: 110965300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 104.0) [2024-03-20 23:23:00,522][03784] Avg episode reward: [(0, '0.270')] [2024-03-20 23:23:00,584][03995] Signal inference workers to stop experience collection... 
(2250 times) [2024-03-20 23:23:00,595][03995] Signal inference workers to resume experience collection... (2250 times) [2024-03-20 23:23:00,643][04017] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-03-20 23:23:00,688][04017] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-03-20 23:23:02,135][04017] Updated weights for policy 0, policy_version 3353 (0.0017) [2024-03-20 23:23:05,521][03784] Fps is (10 sec: 58982.8, 60 sec: 50244.3, 300 sec: 47985.7). Total num frames: 110002176. Throughput: 0: 48188.9. Samples: 111275500. Policy #0 lag: (min: 0.0, avg: 33.0, max: 77.0) [2024-03-20 23:23:05,522][03784] Avg episode reward: [(0, '0.244')] [2024-03-20 23:23:09,278][04017] Updated weights for policy 0, policy_version 3363 (0.0010) [2024-03-20 23:23:10,521][03784] Fps is (10 sec: 55705.5, 60 sec: 50244.3, 300 sec: 48207.8). Total num frames: 110264320. Throughput: 0: 48340.0. Samples: 111576100. Policy #0 lag: (min: 0.0, avg: 45.9, max: 95.0) [2024-03-20 23:23:10,522][03784] Avg episode reward: [(0, '0.155')] [2024-03-20 23:23:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.4, 300 sec: 48207.9). Total num frames: 110428160. Throughput: 0: 48331.2. Samples: 111728000. Policy #0 lag: (min: 0.0, avg: 39.6, max: 78.0) [2024-03-20 23:23:15,522][03784] Avg episode reward: [(0, '0.155')] [2024-03-20 23:23:18,993][04017] Updated weights for policy 0, policy_version 3373 (0.0011) [2024-03-20 23:23:20,521][03784] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 48207.8). Total num frames: 110657536. Throughput: 0: 48493.3. Samples: 112027100. Policy #0 lag: (min: 0.0, avg: 35.7, max: 99.0) [2024-03-20 23:23:20,522][03784] Avg episode reward: [(0, '0.216')] [2024-03-20 23:23:22,394][04017] Updated weights for policy 0, policy_version 3383 (0.0024) [2024-03-20 23:23:25,521][03784] Fps is (10 sec: 62259.2, 60 sec: 45329.1, 300 sec: 48541.1). Total num frames: 111050752. Throughput: 0: 48351.2. Samples: 112274300. 
Policy #0 lag: (min: 1.0, avg: 42.7, max: 92.0) [2024-03-20 23:23:25,522][03784] Avg episode reward: [(0, '0.388')] [2024-03-20 23:23:25,522][03995] Saving new best policy, reward=0.388! [2024-03-20 23:23:27,924][04017] Updated weights for policy 0, policy_version 3393 (0.0024) [2024-03-20 23:23:30,521][03784] Fps is (10 sec: 72089.5, 60 sec: 49151.9, 300 sec: 49096.4). Total num frames: 111378432. Throughput: 0: 48657.6. Samples: 112436200. Policy #0 lag: (min: 0.0, avg: 41.7, max: 88.0) [2024-03-20 23:23:30,522][03784] Avg episode reward: [(0, '0.310')] [2024-03-20 23:23:32,294][04017] Updated weights for policy 0, policy_version 3403 (0.0028) [2024-03-20 23:23:35,521][03784] Fps is (10 sec: 58982.3, 60 sec: 50244.3, 300 sec: 48763.2). Total num frames: 111640576. Throughput: 0: 48549.0. Samples: 112736400. Policy #0 lag: (min: 1.0, avg: 35.9, max: 68.0) [2024-03-20 23:23:35,522][03784] Avg episode reward: [(0, '0.278')] [2024-03-20 23:23:40,521][03784] Fps is (10 sec: 36045.2, 60 sec: 47513.6, 300 sec: 48652.1). Total num frames: 111738880. Throughput: 0: 49200.0. Samples: 113037700. Policy #0 lag: (min: 0.0, avg: 44.3, max: 103.0) [2024-03-20 23:23:40,522][03784] Avg episode reward: [(0, '0.061')] [2024-03-20 23:23:41,556][04017] Updated weights for policy 0, policy_version 3413 (0.0013) [2024-03-20 23:23:45,521][03784] Fps is (10 sec: 36044.4, 60 sec: 49151.9, 300 sec: 48874.3). Total num frames: 112001024. Throughput: 0: 48577.7. Samples: 113151300. Policy #0 lag: (min: 1.0, avg: 45.1, max: 106.0) [2024-03-20 23:23:45,522][03784] Avg episode reward: [(0, '0.118')] [2024-03-20 23:23:48,173][04017] Updated weights for policy 0, policy_version 3423 (0.0010) [2024-03-20 23:23:50,521][03784] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 48985.4). Total num frames: 112230400. Throughput: 0: 48464.4. Samples: 113456400. 
Policy #0 lag: (min: 0.0, avg: 37.5, max: 96.0) [2024-03-20 23:23:50,531][03784] Avg episode reward: [(0, '0.163')] [2024-03-20 23:23:53,917][03995] Signal inference workers to stop experience collection... (2300 times) [2024-03-20 23:23:53,918][03995] Signal inference workers to resume experience collection... (2300 times) [2024-03-20 23:23:53,971][04017] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-03-20 23:23:53,971][04017] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-03-20 23:23:55,521][03784] Fps is (10 sec: 42598.8, 60 sec: 50244.3, 300 sec: 48985.4). Total num frames: 112427008. Throughput: 0: 48751.1. Samples: 113769900. Policy #0 lag: (min: 0.0, avg: 32.1, max: 66.0) [2024-03-20 23:23:55,531][03784] Avg episode reward: [(0, '0.329')] [2024-03-20 23:23:59,422][04017] Updated weights for policy 0, policy_version 3433 (0.0010) [2024-03-20 23:24:00,521][03784] Fps is (10 sec: 36045.4, 60 sec: 48059.8, 300 sec: 48985.4). Total num frames: 112590848. Throughput: 0: 48997.9. Samples: 113932900. Policy #0 lag: (min: 0.0, avg: 35.9, max: 79.0) [2024-03-20 23:24:00,522][03784] Avg episode reward: [(0, '0.362')] [2024-03-20 23:24:00,808][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003437_112623616.pth... [2024-03-20 23:24:00,864][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003084_101056512.pth [2024-03-20 23:24:05,178][04017] Updated weights for policy 0, policy_version 3443 (0.0014) [2024-03-20 23:24:05,521][03784] Fps is (10 sec: 39321.2, 60 sec: 46967.4, 300 sec: 48430.0). Total num frames: 112820224. Throughput: 0: 49035.6. Samples: 114233700. Policy #0 lag: (min: 0.0, avg: 35.9, max: 78.0) [2024-03-20 23:24:05,522][03784] Avg episode reward: [(0, '0.135')] [2024-03-20 23:24:10,521][03784] Fps is (10 sec: 49151.1, 60 sec: 46967.4, 300 sec: 47652.4). Total num frames: 113082368. Throughput: 0: 49706.5. Samples: 114511100. 
Policy #0 lag: (min: 2.0, avg: 38.4, max: 93.0) [2024-03-20 23:24:10,531][03784] Avg episode reward: [(0, '0.287')] [2024-03-20 23:24:11,453][04017] Updated weights for policy 0, policy_version 3453 (0.0014) [2024-03-20 23:24:15,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 113180672. Throughput: 0: 49142.3. Samples: 114647600. Policy #0 lag: (min: 0.0, avg: 28.1, max: 70.0) [2024-03-20 23:24:15,522][03784] Avg episode reward: [(0, '0.155')] [2024-03-20 23:24:19,593][04017] Updated weights for policy 0, policy_version 3463 (0.0012) [2024-03-20 23:24:20,521][03784] Fps is (10 sec: 42598.7, 60 sec: 47513.7, 300 sec: 47652.4). Total num frames: 113508352. Throughput: 0: 48797.7. Samples: 114932300. Policy #0 lag: (min: 3.0, avg: 36.2, max: 94.0) [2024-03-20 23:24:20,522][03784] Avg episode reward: [(0, '0.299')] [2024-03-20 23:24:24,396][04017] Updated weights for policy 0, policy_version 3473 (0.0011) [2024-03-20 23:24:25,521][03784] Fps is (10 sec: 72090.3, 60 sec: 47513.6, 300 sec: 48207.8). Total num frames: 113901568. Throughput: 0: 48289.0. Samples: 115210700. Policy #0 lag: (min: 0.0, avg: 37.5, max: 93.0) [2024-03-20 23:24:25,528][03784] Avg episode reward: [(0, '0.118')] [2024-03-20 23:24:27,639][04017] Updated weights for policy 0, policy_version 3483 (0.0025) [2024-03-20 23:24:30,521][03784] Fps is (10 sec: 78643.3, 60 sec: 48606.0, 300 sec: 49207.5). Total num frames: 114294784. Throughput: 0: 48655.7. Samples: 115340800. Policy #0 lag: (min: 2.0, avg: 47.7, max: 101.0) [2024-03-20 23:24:30,522][03784] Avg episode reward: [(0, '0.118')] [2024-03-20 23:24:34,744][04017] Updated weights for policy 0, policy_version 3493 (0.0010) [2024-03-20 23:24:35,521][03784] Fps is (10 sec: 55704.9, 60 sec: 46967.4, 300 sec: 49096.5). Total num frames: 114458624. Throughput: 0: 48922.2. Samples: 115657900. 
Policy #0 lag: (min: 1.0, avg: 43.8, max: 70.0) [2024-03-20 23:24:35,522][03784] Avg episode reward: [(0, '0.149')] [2024-03-20 23:24:38,960][03995] Signal inference workers to stop experience collection... (2350 times) [2024-03-20 23:24:39,038][04017] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-03-20 23:24:39,228][03995] Signal inference workers to resume experience collection... (2350 times) [2024-03-20 23:24:39,228][04017] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-03-20 23:24:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 49152.0, 300 sec: 49207.5). Total num frames: 114688000. Throughput: 0: 48764.4. Samples: 115964300. Policy #0 lag: (min: 0.0, avg: 46.9, max: 86.0) [2024-03-20 23:24:40,522][03784] Avg episode reward: [(0, '0.204')] [2024-03-20 23:24:42,166][04017] Updated weights for policy 0, policy_version 3503 (0.0011) [2024-03-20 23:24:45,521][03784] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 48652.2). Total num frames: 114982912. Throughput: 0: 48199.9. Samples: 116101900. Policy #0 lag: (min: 0.0, avg: 45.8, max: 86.0) [2024-03-20 23:24:45,522][03784] Avg episode reward: [(0, '0.200')] [2024-03-20 23:24:47,495][04017] Updated weights for policy 0, policy_version 3513 (0.0012) [2024-03-20 23:24:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 48318.9). Total num frames: 115212288. Throughput: 0: 48033.4. Samples: 116395200. Policy #0 lag: (min: 0.0, avg: 48.9, max: 93.0) [2024-03-20 23:24:50,522][03784] Avg episode reward: [(0, '0.173')] [2024-03-20 23:24:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.9, 300 sec: 48541.1). Total num frames: 115343360. Throughput: 0: 48775.7. Samples: 116706000. 
Policy #0 lag: (min: 0.0, avg: 42.2, max: 82.0) [2024-03-20 23:24:55,522][03784] Avg episode reward: [(0, '0.150')] [2024-03-20 23:24:56,524][04017] Updated weights for policy 0, policy_version 3523 (0.0014) [2024-03-20 23:25:00,521][03784] Fps is (10 sec: 32767.9, 60 sec: 49151.8, 300 sec: 48207.8). Total num frames: 115539968. Throughput: 0: 48893.3. Samples: 116847800. Policy #0 lag: (min: 0.0, avg: 37.5, max: 84.0) [2024-03-20 23:25:00,522][03784] Avg episode reward: [(0, '0.247')] [2024-03-20 23:25:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 46967.6, 300 sec: 47319.2). Total num frames: 115638272. Throughput: 0: 49013.3. Samples: 117137900. Policy #0 lag: (min: 0.0, avg: 34.8, max: 85.0) [2024-03-20 23:25:05,522][03784] Avg episode reward: [(0, '0.328')] [2024-03-20 23:25:08,391][04017] Updated weights for policy 0, policy_version 3533 (0.0011) [2024-03-20 23:25:10,521][03784] Fps is (10 sec: 39322.0, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 115933184. Throughput: 0: 48862.2. Samples: 117409500. Policy #0 lag: (min: 2.0, avg: 25.4, max: 57.0) [2024-03-20 23:25:10,522][03784] Avg episode reward: [(0, '0.166')] [2024-03-20 23:25:12,127][04017] Updated weights for policy 0, policy_version 3543 (0.0016) [2024-03-20 23:25:15,521][03784] Fps is (10 sec: 58982.3, 60 sec: 50790.4, 300 sec: 47763.5). Total num frames: 116228096. Throughput: 0: 48917.8. Samples: 117542100. Policy #0 lag: (min: 0.0, avg: 40.1, max: 88.0) [2024-03-20 23:25:15,522][03784] Avg episode reward: [(0, '0.235')] [2024-03-20 23:25:17,987][04017] Updated weights for policy 0, policy_version 3553 (0.0018) [2024-03-20 23:25:20,521][03784] Fps is (10 sec: 62259.1, 60 sec: 50790.4, 300 sec: 48207.8). Total num frames: 116555776. Throughput: 0: 48515.6. Samples: 117841100. Policy #0 lag: (min: 0.0, avg: 40.3, max: 79.0) [2024-03-20 23:25:20,522][03784] Avg episode reward: [(0, '0.371')] [2024-03-20 23:25:25,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44782.8, 300 sec: 47430.3). 
Total num frames: 116588544. Throughput: 0: 48426.6. Samples: 118143500. Policy #0 lag: (min: 0.0, avg: 32.7, max: 64.0) [2024-03-20 23:25:25,531][03784] Avg episode reward: [(0, '0.380')] [2024-03-20 23:25:28,039][04017] Updated weights for policy 0, policy_version 3563 (0.0014) [2024-03-20 23:25:30,521][03784] Fps is (10 sec: 36044.5, 60 sec: 43690.6, 300 sec: 47652.4). Total num frames: 116916224. Throughput: 0: 48217.7. Samples: 118271700. Policy #0 lag: (min: 0.0, avg: 32.7, max: 64.0) [2024-03-20 23:25:30,531][03784] Avg episode reward: [(0, '0.280')] [2024-03-20 23:25:32,269][04017] Updated weights for policy 0, policy_version 3573 (0.0016) [2024-03-20 23:25:32,539][03995] Signal inference workers to stop experience collection... (2400 times) [2024-03-20 23:25:32,591][04017] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-03-20 23:25:32,607][03995] Signal inference workers to resume experience collection... (2400 times) [2024-03-20 23:25:32,627][04017] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-03-20 23:25:35,521][03784] Fps is (10 sec: 72091.0, 60 sec: 47513.7, 300 sec: 48541.1). Total num frames: 117309440. Throughput: 0: 47926.8. Samples: 118551900. Policy #0 lag: (min: 0.0, avg: 41.9, max: 99.0) [2024-03-20 23:25:35,522][03784] Avg episode reward: [(0, '0.299')] [2024-03-20 23:25:37,227][04017] Updated weights for policy 0, policy_version 3583 (0.0012) [2024-03-20 23:25:40,521][03784] Fps is (10 sec: 68813.5, 60 sec: 48605.9, 300 sec: 48541.1). Total num frames: 117604352. Throughput: 0: 46928.9. Samples: 118817800. Policy #0 lag: (min: 1.0, avg: 51.2, max: 97.0) [2024-03-20 23:25:40,521][03784] Avg episode reward: [(0, '0.303')] [2024-03-20 23:25:41,945][04017] Updated weights for policy 0, policy_version 3593 (0.0012) [2024-03-20 23:25:45,521][03784] Fps is (10 sec: 52428.4, 60 sec: 47513.6, 300 sec: 48541.1). Total num frames: 117833728. Throughput: 0: 46689.0. Samples: 118948800. 
Policy #0 lag: (min: 0.0, avg: 41.9, max: 77.0) [2024-03-20 23:25:45,522][03784] Avg episode reward: [(0, '0.155')] [2024-03-20 23:25:48,783][04017] Updated weights for policy 0, policy_version 3603 (0.0012) [2024-03-20 23:25:50,521][03784] Fps is (10 sec: 55705.5, 60 sec: 49152.0, 300 sec: 48985.4). Total num frames: 118161408. Throughput: 0: 46537.8. Samples: 119232100. Policy #0 lag: (min: 1.0, avg: 39.5, max: 80.0) [2024-03-20 23:25:50,522][03784] Avg episode reward: [(0, '0.299')] [2024-03-20 23:25:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 49318.6). Total num frames: 118358016. Throughput: 0: 46315.6. Samples: 119493700. Policy #0 lag: (min: 1.0, avg: 39.5, max: 80.0) [2024-03-20 23:25:55,522][03784] Avg episode reward: [(0, '0.318')] [2024-03-20 23:25:56,555][04017] Updated weights for policy 0, policy_version 3613 (0.0022) [2024-03-20 23:26:00,521][03784] Fps is (10 sec: 39321.5, 60 sec: 50244.3, 300 sec: 49207.6). Total num frames: 118554624. Throughput: 0: 46700.0. Samples: 119643600. Policy #0 lag: (min: 0.0, avg: 44.7, max: 92.0) [2024-03-20 23:26:00,522][03784] Avg episode reward: [(0, '0.275')] [2024-03-20 23:26:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003618_118554624.pth... [2024-03-20 23:26:00,663][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003257_106725376.pth [2024-03-20 23:26:02,894][04017] Updated weights for policy 0, policy_version 3623 (0.0010) [2024-03-20 23:26:05,521][03784] Fps is (10 sec: 45875.3, 60 sec: 52975.0, 300 sec: 48652.2). Total num frames: 118816768. Throughput: 0: 46302.3. Samples: 119924700. Policy #0 lag: (min: 0.0, avg: 37.9, max: 79.0) [2024-03-20 23:26:05,522][03784] Avg episode reward: [(0, '0.360')] [2024-03-20 23:26:10,521][03784] Fps is (10 sec: 32767.9, 60 sec: 49152.0, 300 sec: 47763.5). Total num frames: 118882304. Throughput: 0: 46773.4. Samples: 120248300. 
Policy #0 lag: (min: 0.0, avg: 45.8, max: 97.0) [2024-03-20 23:26:10,522][03784] Avg episode reward: [(0, '0.350')] [2024-03-20 23:26:15,521][03784] Fps is (10 sec: 19660.7, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 119013376. Throughput: 0: 47342.3. Samples: 120402100. Policy #0 lag: (min: 0.0, avg: 31.1, max: 72.0) [2024-03-20 23:26:15,522][03784] Avg episode reward: [(0, '0.254')] [2024-03-20 23:26:16,636][04017] Updated weights for policy 0, policy_version 3633 (0.0010) [2024-03-20 23:26:20,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 119209984. Throughput: 0: 47437.7. Samples: 120686600. Policy #0 lag: (min: 0.0, avg: 26.2, max: 91.0) [2024-03-20 23:26:20,522][03784] Avg episode reward: [(0, '0.336')] [2024-03-20 23:26:23,334][04017] Updated weights for policy 0, policy_version 3643 (0.0010) [2024-03-20 23:26:25,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 119504896. Throughput: 0: 47608.8. Samples: 120960200. Policy #0 lag: (min: 0.0, avg: 26.0, max: 100.0) [2024-03-20 23:26:25,522][03784] Avg episode reward: [(0, '0.166')] [2024-03-20 23:26:29,315][03995] Signal inference workers to stop experience collection... (2450 times) [2024-03-20 23:26:29,315][03995] Signal inference workers to resume experience collection... (2450 times) [2024-03-20 23:26:29,354][04017] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-03-20 23:26:29,354][04017] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-03-20 23:26:29,649][04017] Updated weights for policy 0, policy_version 3653 (0.0012) [2024-03-20 23:26:30,521][03784] Fps is (10 sec: 49151.8, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 119701504. Throughput: 0: 47902.2. Samples: 121104400. 
Policy #0 lag: (min: 0.0, avg: 34.0, max: 69.0) [2024-03-20 23:26:30,522][03784] Avg episode reward: [(0, '0.195')] [2024-03-20 23:26:35,500][04017] Updated weights for policy 0, policy_version 3663 (0.0015) [2024-03-20 23:26:35,521][03784] Fps is (10 sec: 52429.5, 60 sec: 45329.1, 300 sec: 47652.5). Total num frames: 120029184. Throughput: 0: 47726.8. Samples: 121379800. Policy #0 lag: (min: 1.0, avg: 25.8, max: 54.0) [2024-03-20 23:26:35,521][03784] Avg episode reward: [(0, '0.073')] [2024-03-20 23:26:39,310][04017] Updated weights for policy 0, policy_version 3673 (0.0010) [2024-03-20 23:26:40,521][03784] Fps is (10 sec: 75366.7, 60 sec: 47513.6, 300 sec: 48985.4). Total num frames: 120455168. Throughput: 0: 47397.8. Samples: 121626600. Policy #0 lag: (min: 1.0, avg: 38.6, max: 73.0) [2024-03-20 23:26:40,522][03784] Avg episode reward: [(0, '0.274')] [2024-03-20 23:26:45,500][04017] Updated weights for policy 0, policy_version 3683 (0.0011) [2024-03-20 23:26:45,521][03784] Fps is (10 sec: 65534.3, 60 sec: 47513.5, 300 sec: 49096.5). Total num frames: 120684544. Throughput: 0: 47262.1. Samples: 121770400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 107.0) [2024-03-20 23:26:45,522][03784] Avg episode reward: [(0, '0.382')] [2024-03-20 23:26:49,388][04017] Updated weights for policy 0, policy_version 3693 (0.0015) [2024-03-20 23:26:50,521][03784] Fps is (10 sec: 65535.3, 60 sec: 49151.9, 300 sec: 49318.6). Total num frames: 121110528. Throughput: 0: 46491.0. Samples: 122016800. Policy #0 lag: (min: 3.0, avg: 41.1, max: 67.0) [2024-03-20 23:26:50,522][03784] Avg episode reward: [(0, '0.246')] [2024-03-20 23:26:55,521][03784] Fps is (10 sec: 55706.3, 60 sec: 48059.7, 300 sec: 49207.5). Total num frames: 121241600. Throughput: 0: 45922.2. Samples: 122314800. Policy #0 lag: (min: 0.0, avg: 49.7, max: 107.0) [2024-03-20 23:26:55,522][03784] Avg episode reward: [(0, '0.428')] [2024-03-20 23:26:55,747][03995] Saving new best policy, reward=0.428! 
[2024-03-20 23:26:56,480][04017] Updated weights for policy 0, policy_version 3703 (0.0024) [2024-03-20 23:27:00,521][03784] Fps is (10 sec: 42599.0, 60 sec: 49698.2, 300 sec: 49318.6). Total num frames: 121536512. Throughput: 0: 45531.2. Samples: 122451000. Policy #0 lag: (min: 1.0, avg: 52.1, max: 109.0) [2024-03-20 23:27:00,522][03784] Avg episode reward: [(0, '0.115')] [2024-03-20 23:27:05,521][03784] Fps is (10 sec: 29491.4, 60 sec: 45329.1, 300 sec: 48430.0). Total num frames: 121536512. Throughput: 0: 46106.7. Samples: 122761400. Policy #0 lag: (min: 1.0, avg: 52.1, max: 109.0) [2024-03-20 23:27:05,522][03784] Avg episode reward: [(0, '0.252')] [2024-03-20 23:27:10,521][03784] Fps is (10 sec: 9830.3, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 121634816. Throughput: 0: 46124.4. Samples: 123035800. Policy #0 lag: (min: 0.0, avg: 31.9, max: 75.0) [2024-03-20 23:27:10,522][03784] Avg episode reward: [(0, '0.257')] [2024-03-20 23:27:11,640][04017] Updated weights for policy 0, policy_version 3713 (0.0015) [2024-03-20 23:27:15,521][03784] Fps is (10 sec: 19660.7, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 121733120. Throughput: 0: 45913.3. Samples: 123170500. Policy #0 lag: (min: 0.0, avg: 37.2, max: 82.0) [2024-03-20 23:27:15,522][03784] Avg episode reward: [(0, '0.104')] [2024-03-20 23:27:17,383][03995] Signal inference workers to stop experience collection... (2500 times) [2024-03-20 23:27:17,495][04017] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-03-20 23:27:17,583][03995] Signal inference workers to resume experience collection... (2500 times) [2024-03-20 23:27:17,584][04017] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-03-20 23:27:18,613][04017] Updated weights for policy 0, policy_version 3723 (0.0016) [2024-03-20 23:27:20,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48059.8, 300 sec: 46652.7). Total num frames: 122093568. Throughput: 0: 46199.9. Samples: 123458800. 
Policy #0 lag: (min: 2.0, avg: 30.9, max: 75.0) [2024-03-20 23:27:20,522][03784] Avg episode reward: [(0, '0.308')] [2024-03-20 23:27:25,490][04017] Updated weights for policy 0, policy_version 3733 (0.0013) [2024-03-20 23:27:25,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 122322944. Throughput: 0: 47111.0. Samples: 123746600. Policy #0 lag: (min: 0.0, avg: 36.0, max: 86.0) [2024-03-20 23:27:25,522][03784] Avg episode reward: [(0, '0.334')] [2024-03-20 23:27:30,493][04017] Updated weights for policy 0, policy_version 3743 (0.0024) [2024-03-20 23:27:30,521][03784] Fps is (10 sec: 55704.6, 60 sec: 49151.9, 300 sec: 47541.4). Total num frames: 122650624. Throughput: 0: 47057.8. Samples: 123888000. Policy #0 lag: (min: 2.0, avg: 30.3, max: 65.0) [2024-03-20 23:27:30,522][03784] Avg episode reward: [(0, '0.341')] [2024-03-20 23:27:35,521][03784] Fps is (10 sec: 58982.7, 60 sec: 48059.6, 300 sec: 47541.4). Total num frames: 122912768. Throughput: 0: 47977.8. Samples: 124175800. Policy #0 lag: (min: 1.0, avg: 36.3, max: 77.0) [2024-03-20 23:27:35,522][03784] Avg episode reward: [(0, '0.112')] [2024-03-20 23:27:36,178][04017] Updated weights for policy 0, policy_version 3753 (0.0014) [2024-03-20 23:27:40,310][04017] Updated weights for policy 0, policy_version 3763 (0.0016) [2024-03-20 23:27:40,521][03784] Fps is (10 sec: 65536.4, 60 sec: 47513.5, 300 sec: 48318.9). Total num frames: 123305984. Throughput: 0: 47415.5. Samples: 124448500. Policy #0 lag: (min: 0.0, avg: 38.8, max: 73.0) [2024-03-20 23:27:40,522][03784] Avg episode reward: [(0, '0.458')] [2024-03-20 23:27:40,677][03995] Saving new best policy, reward=0.458! [2024-03-20 23:27:45,368][04017] Updated weights for policy 0, policy_version 3773 (0.0011) [2024-03-20 23:27:45,521][03784] Fps is (10 sec: 72089.5, 60 sec: 49152.1, 300 sec: 48985.4). Total num frames: 123633664. Throughput: 0: 47586.6. Samples: 124592400. 
Policy #0 lag: (min: 1.0, avg: 45.6, max: 86.0) [2024-03-20 23:27:45,522][03784] Avg episode reward: [(0, '0.360')] [2024-03-20 23:27:50,521][03784] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 48541.1). Total num frames: 123731968. Throughput: 0: 47513.3. Samples: 124899500. Policy #0 lag: (min: 1.0, avg: 45.6, max: 86.0) [2024-03-20 23:27:50,522][03784] Avg episode reward: [(0, '0.236')] [2024-03-20 23:27:55,521][03784] Fps is (10 sec: 19660.7, 60 sec: 43144.5, 300 sec: 47874.6). Total num frames: 123830272. Throughput: 0: 48626.6. Samples: 125224000. Policy #0 lag: (min: 0.0, avg: 47.9, max: 91.0) [2024-03-20 23:27:55,522][03784] Avg episode reward: [(0, '0.232')] [2024-03-20 23:27:58,441][04017] Updated weights for policy 0, policy_version 3783 (0.0015) [2024-03-20 23:28:00,521][03784] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 47985.7). Total num frames: 124157952. Throughput: 0: 48975.5. Samples: 125374400. Policy #0 lag: (min: 1.0, avg: 40.3, max: 100.0) [2024-03-20 23:28:00,522][03784] Avg episode reward: [(0, '0.293')] [2024-03-20 23:28:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003789_124157952.pth... [2024-03-20 23:28:00,652][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003437_112623616.pth [2024-03-20 23:28:03,720][04017] Updated weights for policy 0, policy_version 3793 (0.0011) [2024-03-20 23:28:03,770][03995] Signal inference workers to stop experience collection... (2550 times) [2024-03-20 23:28:03,771][03995] Signal inference workers to resume experience collection... (2550 times) [2024-03-20 23:28:03,844][04017] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-03-20 23:28:03,844][04017] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-03-20 23:28:05,521][03784] Fps is (10 sec: 55706.6, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 124387328. Throughput: 0: 48531.1. Samples: 125642700. 
Policy #0 lag: (min: 1.0, avg: 42.7, max: 79.0) [2024-03-20 23:28:05,522][03784] Avg episode reward: [(0, '0.244')] [2024-03-20 23:28:09,048][04017] Updated weights for policy 0, policy_version 3803 (0.0015) [2024-03-20 23:28:10,521][03784] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 48207.8). Total num frames: 124649472. Throughput: 0: 47575.7. Samples: 125887500. Policy #0 lag: (min: 1.0, avg: 37.8, max: 81.0) [2024-03-20 23:28:10,522][03784] Avg episode reward: [(0, '0.303')] [2024-03-20 23:28:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 51336.6, 300 sec: 47985.7). Total num frames: 124813312. Throughput: 0: 47857.9. Samples: 126041600. Policy #0 lag: (min: 0.0, avg: 40.2, max: 87.0) [2024-03-20 23:28:15,522][03784] Avg episode reward: [(0, '0.328')] [2024-03-20 23:28:18,998][04017] Updated weights for policy 0, policy_version 3813 (0.0016) [2024-03-20 23:28:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 125042688. Throughput: 0: 48071.2. Samples: 126339000. Policy #0 lag: (min: 1.0, avg: 30.6, max: 81.0) [2024-03-20 23:28:20,522][03784] Avg episode reward: [(0, '0.311')] [2024-03-20 23:28:22,856][04017] Updated weights for policy 0, policy_version 3823 (0.0009) [2024-03-20 23:28:25,521][03784] Fps is (10 sec: 55705.4, 60 sec: 50790.5, 300 sec: 47430.3). Total num frames: 125370368. Throughput: 0: 48148.9. Samples: 126615200. Policy #0 lag: (min: 0.0, avg: 41.3, max: 81.0) [2024-03-20 23:28:25,522][03784] Avg episode reward: [(0, '0.356')] [2024-03-20 23:28:30,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 125566976. Throughput: 0: 47744.4. Samples: 126740900. Policy #0 lag: (min: 1.0, avg: 41.1, max: 103.0) [2024-03-20 23:28:30,522][03784] Avg episode reward: [(0, '0.368')] [2024-03-20 23:28:31,163][04017] Updated weights for policy 0, policy_version 3833 (0.0013) [2024-03-20 23:28:35,521][03784] Fps is (10 sec: 39321.1, 60 sec: 47513.5, 300 sec: 47541.3). 
Total num frames: 125763584. Throughput: 0: 48124.3. Samples: 127065100. Policy #0 lag: (min: 0.0, avg: 38.4, max: 103.0) [2024-03-20 23:28:35,522][03784] Avg episode reward: [(0, '0.143')] [2024-03-20 23:28:36,980][04017] Updated weights for policy 0, policy_version 3843 (0.0010) [2024-03-20 23:28:40,521][03784] Fps is (10 sec: 42598.9, 60 sec: 44783.0, 300 sec: 47430.3). Total num frames: 125992960. Throughput: 0: 47469.0. Samples: 127360100. Policy #0 lag: (min: 0.0, avg: 45.4, max: 104.0) [2024-03-20 23:28:40,522][03784] Avg episode reward: [(0, '0.224')] [2024-03-20 23:28:45,521][03784] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 47208.1). Total num frames: 126156800. Throughput: 0: 47488.9. Samples: 127511400. Policy #0 lag: (min: 0.0, avg: 34.0, max: 92.0) [2024-03-20 23:28:45,522][03784] Avg episode reward: [(0, '0.233')] [2024-03-20 23:28:46,352][04017] Updated weights for policy 0, policy_version 3853 (0.0013) [2024-03-20 23:28:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45875.2, 300 sec: 47652.5). Total num frames: 126484480. Throughput: 0: 47688.9. Samples: 127788700. Policy #0 lag: (min: 1.0, avg: 46.3, max: 88.0) [2024-03-20 23:28:50,522][03784] Avg episode reward: [(0, '0.331')] [2024-03-20 23:28:55,154][04017] Updated weights for policy 0, policy_version 3863 (0.0015) [2024-03-20 23:28:55,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46421.4, 300 sec: 47541.3). Total num frames: 126615552. Throughput: 0: 48777.7. Samples: 128082500. Policy #0 lag: (min: 0.0, avg: 47.9, max: 89.0) [2024-03-20 23:28:55,522][03784] Avg episode reward: [(0, '0.366')] [2024-03-20 23:28:58,744][03995] Signal inference workers to stop experience collection... (2600 times) [2024-03-20 23:28:58,811][03995] Signal inference workers to resume experience collection... 
(2600 times) [2024-03-20 23:28:58,836][04017] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-03-20 23:28:58,887][04017] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-03-20 23:29:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44783.0, 300 sec: 47541.4). Total num frames: 126844928. Throughput: 0: 48695.5. Samples: 128232900. Policy #0 lag: (min: 1.0, avg: 38.1, max: 77.0) [2024-03-20 23:29:00,522][03784] Avg episode reward: [(0, '0.306')] [2024-03-20 23:29:01,521][04017] Updated weights for policy 0, policy_version 3873 (0.0011) [2024-03-20 23:29:05,033][04017] Updated weights for policy 0, policy_version 3883 (0.0012) [2024-03-20 23:29:05,521][03784] Fps is (10 sec: 65535.7, 60 sec: 48059.6, 300 sec: 48096.8). Total num frames: 127270912. Throughput: 0: 48424.3. Samples: 128518100. Policy #0 lag: (min: 2.0, avg: 45.1, max: 97.0) [2024-03-20 23:29:05,522][03784] Avg episode reward: [(0, '0.306')] [2024-03-20 23:29:08,668][04017] Updated weights for policy 0, policy_version 3893 (0.0011) [2024-03-20 23:29:10,521][03784] Fps is (10 sec: 85197.3, 60 sec: 50790.4, 300 sec: 49207.5). Total num frames: 127696896. Throughput: 0: 47651.2. Samples: 128759500. Policy #0 lag: (min: 1.0, avg: 44.6, max: 80.0) [2024-03-20 23:29:10,522][03784] Avg episode reward: [(0, '0.374')] [2024-03-20 23:29:14,701][04017] Updated weights for policy 0, policy_version 3903 (0.0009) [2024-03-20 23:29:15,521][03784] Fps is (10 sec: 65536.6, 60 sec: 51882.7, 300 sec: 48874.3). Total num frames: 127926272. Throughput: 0: 48204.5. Samples: 128910100. Policy #0 lag: (min: 0.0, avg: 49.4, max: 105.0) [2024-03-20 23:29:15,522][03784] Avg episode reward: [(0, '0.146')] [2024-03-20 23:29:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 50790.4, 300 sec: 48096.8). Total num frames: 128090112. Throughput: 0: 47506.8. Samples: 129202900. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 75.0) [2024-03-20 23:29:20,522][03784] Avg episode reward: [(0, '0.171')] [2024-03-20 23:29:21,810][04017] Updated weights for policy 0, policy_version 3913 (0.0010) [2024-03-20 23:29:25,521][03784] Fps is (10 sec: 32768.1, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 128253952. Throughput: 0: 47288.9. Samples: 129488100. Policy #0 lag: (min: 0.0, avg: 37.7, max: 75.0) [2024-03-20 23:29:25,522][03784] Avg episode reward: [(0, '0.196')] [2024-03-20 23:29:30,521][03784] Fps is (10 sec: 19660.7, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 128286720. Throughput: 0: 47260.1. Samples: 129638100. Policy #0 lag: (min: 0.0, avg: 29.6, max: 89.0) [2024-03-20 23:29:30,522][03784] Avg episode reward: [(0, '0.379')] [2024-03-20 23:29:35,521][03784] Fps is (10 sec: 22937.4, 60 sec: 45329.2, 300 sec: 46763.8). Total num frames: 128483328. Throughput: 0: 47691.0. Samples: 129934800. Policy #0 lag: (min: 0.0, avg: 30.9, max: 71.0) [2024-03-20 23:29:35,522][03784] Avg episode reward: [(0, '0.456')] [2024-03-20 23:29:36,632][04017] Updated weights for policy 0, policy_version 3923 (0.0009) [2024-03-20 23:29:40,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 128745472. Throughput: 0: 47066.7. Samples: 130200500. Policy #0 lag: (min: 1.0, avg: 30.7, max: 94.0) [2024-03-20 23:29:40,522][03784] Avg episode reward: [(0, '0.495')] [2024-03-20 23:29:40,534][03995] Saving new best policy, reward=0.495! [2024-03-20 23:29:43,254][04017] Updated weights for policy 0, policy_version 3933 (0.0011) [2024-03-20 23:29:45,521][03784] Fps is (10 sec: 58982.6, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 129073152. Throughput: 0: 46564.5. Samples: 130328300. 
Policy #0 lag: (min: 2.0, avg: 36.2, max: 94.0) [2024-03-20 23:29:45,522][03784] Avg episode reward: [(0, '0.266')] [2024-03-20 23:29:49,505][04017] Updated weights for policy 0, policy_version 3943 (0.0013) [2024-03-20 23:29:50,105][03995] Signal inference workers to stop experience collection... (2650 times) [2024-03-20 23:29:50,189][04017] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-03-20 23:29:50,228][03995] Signal inference workers to resume experience collection... (2650 times) [2024-03-20 23:29:50,239][04017] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-03-20 23:29:50,521][03784] Fps is (10 sec: 49151.6, 60 sec: 45875.1, 300 sec: 47097.0). Total num frames: 129236992. Throughput: 0: 46722.3. Samples: 130620600. Policy #0 lag: (min: 0.0, avg: 42.7, max: 96.0) [2024-03-20 23:29:50,522][03784] Avg episode reward: [(0, '0.145')] [2024-03-20 23:29:54,552][04017] Updated weights for policy 0, policy_version 3953 (0.0018) [2024-03-20 23:29:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 47652.5). Total num frames: 129597440. Throughput: 0: 47066.7. Samples: 130877500. Policy #0 lag: (min: 1.0, avg: 36.5, max: 83.0) [2024-03-20 23:29:55,522][03784] Avg episode reward: [(0, '0.158')] [2024-03-20 23:29:59,206][04017] Updated weights for policy 0, policy_version 3963 (0.0017) [2024-03-20 23:30:00,521][03784] Fps is (10 sec: 65536.1, 60 sec: 50790.4, 300 sec: 48318.9). Total num frames: 129892352. Throughput: 0: 46848.8. Samples: 131018300. Policy #0 lag: (min: 4.0, avg: 31.6, max: 72.0) [2024-03-20 23:30:00,522][03784] Avg episode reward: [(0, '0.179')] [2024-03-20 23:30:00,589][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003965_129925120.pth... 
[2024-03-20 23:30:00,674][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003618_118554624.pth [2024-03-20 23:30:04,197][04017] Updated weights for policy 0, policy_version 3973 (0.0015) [2024-03-20 23:30:05,521][03784] Fps is (10 sec: 68812.4, 60 sec: 50244.3, 300 sec: 48652.1). Total num frames: 130285568. Throughput: 0: 45977.7. Samples: 131271900. Policy #0 lag: (min: 1.0, avg: 42.1, max: 75.0) [2024-03-20 23:30:05,522][03784] Avg episode reward: [(0, '0.453')] [2024-03-20 23:30:10,521][03784] Fps is (10 sec: 55705.1, 60 sec: 45875.1, 300 sec: 48207.8). Total num frames: 130449408. Throughput: 0: 45773.2. Samples: 131547900. Policy #0 lag: (min: 0.0, avg: 36.2, max: 70.0) [2024-03-20 23:30:10,522][03784] Avg episode reward: [(0, '0.433')] [2024-03-20 23:30:14,317][04017] Updated weights for policy 0, policy_version 3983 (0.0010) [2024-03-20 23:30:15,521][03784] Fps is (10 sec: 26214.1, 60 sec: 43690.6, 300 sec: 47430.3). Total num frames: 130547712. Throughput: 0: 45862.1. Samples: 131701900. Policy #0 lag: (min: 0.0, avg: 51.4, max: 106.0) [2024-03-20 23:30:15,522][03784] Avg episode reward: [(0, '0.461')] [2024-03-20 23:30:20,521][03784] Fps is (10 sec: 29491.6, 60 sec: 44236.8, 300 sec: 47985.7). Total num frames: 130744320. Throughput: 0: 45473.4. Samples: 131981100. Policy #0 lag: (min: 0.0, avg: 41.8, max: 77.0) [2024-03-20 23:30:20,522][03784] Avg episode reward: [(0, '0.131')] [2024-03-20 23:30:21,396][04017] Updated weights for policy 0, policy_version 3993 (0.0010) [2024-03-20 23:30:25,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.2, 300 sec: 47874.6). Total num frames: 131039232. Throughput: 0: 45773.1. Samples: 132260300. 
Policy #0 lag: (min: 0.0, avg: 46.2, max: 92.0) [2024-03-20 23:30:25,522][03784] Avg episode reward: [(0, '0.199')] [2024-03-20 23:30:27,283][04017] Updated weights for policy 0, policy_version 4003 (0.0020) [2024-03-20 23:30:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 47097.0). Total num frames: 131203072. Throughput: 0: 45704.4. Samples: 132385000. Policy #0 lag: (min: 0.0, avg: 32.6, max: 86.0) [2024-03-20 23:30:30,522][03784] Avg episode reward: [(0, '0.427')] [2024-03-20 23:30:35,521][03784] Fps is (10 sec: 36045.2, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 131399680. Throughput: 0: 45595.6. Samples: 132672400. Policy #0 lag: (min: 1.0, avg: 38.4, max: 86.0) [2024-03-20 23:30:35,522][03784] Avg episode reward: [(0, '0.303')] [2024-03-20 23:30:40,134][04017] Updated weights for policy 0, policy_version 4013 (0.0010) [2024-03-20 23:30:40,521][03784] Fps is (10 sec: 32767.7, 60 sec: 46421.2, 300 sec: 46430.6). Total num frames: 131530752. Throughput: 0: 46177.7. Samples: 132955500. Policy #0 lag: (min: 1.0, avg: 24.2, max: 61.0) [2024-03-20 23:30:40,522][03784] Avg episode reward: [(0, '0.233')] [2024-03-20 23:30:41,787][03995] Signal inference workers to stop experience collection... (2700 times) [2024-03-20 23:30:41,860][04017] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-03-20 23:30:42,050][03995] Signal inference workers to resume experience collection... (2700 times) [2024-03-20 23:30:42,050][04017] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-03-20 23:30:44,168][04017] Updated weights for policy 0, policy_version 4023 (0.0010) [2024-03-20 23:30:45,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 131825664. Throughput: 0: 45760.1. Samples: 133077500. 
Policy #0 lag: (min: 1.0, avg: 39.2, max: 82.0) [2024-03-20 23:30:45,522][03784] Avg episode reward: [(0, '0.306')] [2024-03-20 23:30:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.4, 300 sec: 46430.6). Total num frames: 132055040. Throughput: 0: 46844.4. Samples: 133379900. Policy #0 lag: (min: 0.0, avg: 37.3, max: 105.0) [2024-03-20 23:30:50,522][03784] Avg episode reward: [(0, '0.238')] [2024-03-20 23:30:51,490][04017] Updated weights for policy 0, policy_version 4033 (0.0011) [2024-03-20 23:30:55,521][03784] Fps is (10 sec: 55705.0, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 132382720. Throughput: 0: 46244.5. Samples: 133628900. Policy #0 lag: (min: 0.0, avg: 37.3, max: 105.0) [2024-03-20 23:30:55,522][03784] Avg episode reward: [(0, '0.111')] [2024-03-20 23:30:56,455][04017] Updated weights for policy 0, policy_version 4043 (0.0020) [2024-03-20 23:31:00,521][03784] Fps is (10 sec: 62259.7, 60 sec: 46421.4, 300 sec: 46986.0). Total num frames: 132677632. Throughput: 0: 45962.4. Samples: 133770200. Policy #0 lag: (min: 0.0, avg: 41.4, max: 83.0) [2024-03-20 23:31:00,522][03784] Avg episode reward: [(0, '0.111')] [2024-03-20 23:31:03,094][04017] Updated weights for policy 0, policy_version 4053 (0.0011) [2024-03-20 23:31:05,521][03784] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 47541.4). Total num frames: 132907008. Throughput: 0: 45408.9. Samples: 134024500. Policy #0 lag: (min: 3.0, avg: 52.0, max: 105.0) [2024-03-20 23:31:05,522][03784] Avg episode reward: [(0, '0.299')] [2024-03-20 23:31:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43144.6, 300 sec: 47541.4). Total num frames: 133038080. Throughput: 0: 45177.9. Samples: 134293300. Policy #0 lag: (min: 0.0, avg: 34.3, max: 69.0) [2024-03-20 23:31:10,522][03784] Avg episode reward: [(0, '0.187')] [2024-03-20 23:31:12,230][04017] Updated weights for policy 0, policy_version 4063 (0.0011) [2024-03-20 23:31:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44236.9, 300 sec: 47430.3). 
Total num frames: 133201920. Throughput: 0: 45282.3. Samples: 134422700. Policy #0 lag: (min: 0.0, avg: 40.7, max: 78.0) [2024-03-20 23:31:15,522][03784] Avg episode reward: [(0, '0.350')] [2024-03-20 23:31:19,400][04017] Updated weights for policy 0, policy_version 4073 (0.0017) [2024-03-20 23:31:20,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46421.3, 300 sec: 47541.4). Total num frames: 133529600. Throughput: 0: 45068.9. Samples: 134700500. Policy #0 lag: (min: 1.0, avg: 34.6, max: 66.0) [2024-03-20 23:31:20,522][03784] Avg episode reward: [(0, '0.239')] [2024-03-20 23:31:25,521][03784] Fps is (10 sec: 52428.5, 60 sec: 44783.0, 300 sec: 47541.4). Total num frames: 133726208. Throughput: 0: 44975.6. Samples: 134979400. Policy #0 lag: (min: 0.0, avg: 39.9, max: 94.0) [2024-03-20 23:31:25,522][03784] Avg episode reward: [(0, '0.266')] [2024-03-20 23:31:27,677][04017] Updated weights for policy 0, policy_version 4083 (0.0016) [2024-03-20 23:31:30,521][03784] Fps is (10 sec: 39322.0, 60 sec: 45329.1, 300 sec: 47097.1). Total num frames: 133922816. Throughput: 0: 44986.7. Samples: 135101900. Policy #0 lag: (min: 0.0, avg: 34.2, max: 74.0) [2024-03-20 23:31:30,522][03784] Avg episode reward: [(0, '0.268')] [2024-03-20 23:31:32,249][04017] Updated weights for policy 0, policy_version 4093 (0.0010) [2024-03-20 23:31:35,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 134283264. Throughput: 0: 44211.1. Samples: 135369400. Policy #0 lag: (min: 4.0, avg: 43.7, max: 107.0) [2024-03-20 23:31:35,522][03784] Avg episode reward: [(0, '0.498')] [2024-03-20 23:31:35,523][03995] Saving new best policy, reward=0.498! [2024-03-20 23:31:35,977][03995] Signal inference workers to stop experience collection... (2750 times) [2024-03-20 23:31:36,026][04017] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-03-20 23:31:36,277][03995] Signal inference workers to resume experience collection... 
(2750 times) [2024-03-20 23:31:36,278][04017] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-03-20 23:31:38,255][04017] Updated weights for policy 0, policy_version 4103 (0.0014) [2024-03-20 23:31:40,521][03784] Fps is (10 sec: 62258.9, 60 sec: 50244.4, 300 sec: 46986.0). Total num frames: 134545408. Throughput: 0: 45051.2. Samples: 135656200. Policy #0 lag: (min: 1.0, avg: 34.5, max: 63.0) [2024-03-20 23:31:40,522][03784] Avg episode reward: [(0, '0.300')] [2024-03-20 23:31:45,521][03784] Fps is (10 sec: 39321.4, 60 sec: 47513.5, 300 sec: 45986.3). Total num frames: 134676480. Throughput: 0: 45402.1. Samples: 135813300. Policy #0 lag: (min: 0.0, avg: 40.0, max: 86.0) [2024-03-20 23:31:45,522][03784] Avg episode reward: [(0, '0.380')] [2024-03-20 23:31:46,243][04017] Updated weights for policy 0, policy_version 4113 (0.0014) [2024-03-20 23:31:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48059.8, 300 sec: 46430.6). Total num frames: 134938624. Throughput: 0: 45995.5. Samples: 136094300. Policy #0 lag: (min: 2.0, avg: 40.7, max: 83.0) [2024-03-20 23:31:50,522][03784] Avg episode reward: [(0, '0.347')] [2024-03-20 23:31:55,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.7, 300 sec: 45764.1). Total num frames: 135036928. Throughput: 0: 47002.1. Samples: 136408400. Policy #0 lag: (min: 0.0, avg: 36.1, max: 102.0) [2024-03-20 23:31:55,522][03784] Avg episode reward: [(0, '0.181')] [2024-03-20 23:31:56,024][04017] Updated weights for policy 0, policy_version 4123 (0.0015) [2024-03-20 23:32:00,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 46652.7). Total num frames: 135299072. Throughput: 0: 46988.8. Samples: 136537200. Policy #0 lag: (min: 3.0, avg: 40.0, max: 73.0) [2024-03-20 23:32:00,522][03784] Avg episode reward: [(0, '0.357')] [2024-03-20 23:32:00,536][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004129_135299072.pth... 
[2024-03-20 23:32:00,671][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003789_124157952.pth [2024-03-20 23:32:04,278][04017] Updated weights for policy 0, policy_version 4133 (0.0013) [2024-03-20 23:32:05,521][03784] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 46874.9). Total num frames: 135462912. Throughput: 0: 46953.3. Samples: 136813400. Policy #0 lag: (min: 2.0, avg: 30.5, max: 65.0) [2024-03-20 23:32:05,522][03784] Avg episode reward: [(0, '0.160')] [2024-03-20 23:32:10,521][03784] Fps is (10 sec: 32768.2, 60 sec: 43144.6, 300 sec: 47097.1). Total num frames: 135626752. Throughput: 0: 46795.6. Samples: 137085200. Policy #0 lag: (min: 0.0, avg: 41.5, max: 94.0) [2024-03-20 23:32:10,522][03784] Avg episode reward: [(0, '0.393')] [2024-03-20 23:32:12,036][04017] Updated weights for policy 0, policy_version 4143 (0.0015) [2024-03-20 23:32:15,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46421.2, 300 sec: 47097.0). Total num frames: 135987200. Throughput: 0: 47084.2. Samples: 137220700. Policy #0 lag: (min: 0.0, avg: 35.7, max: 99.0) [2024-03-20 23:32:15,522][03784] Avg episode reward: [(0, '0.375')] [2024-03-20 23:32:16,690][04017] Updated weights for policy 0, policy_version 4153 (0.0012) [2024-03-20 23:32:20,521][03784] Fps is (10 sec: 62258.6, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 136249344. Throughput: 0: 46775.5. Samples: 137474300. Policy #0 lag: (min: 0.0, avg: 36.1, max: 70.0) [2024-03-20 23:32:20,522][03784] Avg episode reward: [(0, '0.316')] [2024-03-20 23:32:22,924][04017] Updated weights for policy 0, policy_version 4163 (0.0012) [2024-03-20 23:32:25,521][03784] Fps is (10 sec: 55706.1, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 136544256. Throughput: 0: 46286.6. Samples: 137739100. 
Policy #0 lag: (min: 0.0, avg: 36.1, max: 70.0) [2024-03-20 23:32:25,522][03784] Avg episode reward: [(0, '0.253')] [2024-03-20 23:32:27,580][04017] Updated weights for policy 0, policy_version 4173 (0.0011) [2024-03-20 23:32:30,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 136839168. Throughput: 0: 45708.9. Samples: 137870200. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-20 23:32:30,522][03784] Avg episode reward: [(0, '0.202')] [2024-03-20 23:32:32,141][03995] Signal inference workers to stop experience collection... (2800 times) [2024-03-20 23:32:32,204][04017] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-03-20 23:32:32,372][03995] Signal inference workers to resume experience collection... (2800 times) [2024-03-20 23:32:32,372][04017] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-03-20 23:32:35,521][03784] Fps is (10 sec: 49152.4, 60 sec: 45875.3, 300 sec: 46541.7). Total num frames: 137035776. Throughput: 0: 46195.6. Samples: 138173100. Policy #0 lag: (min: 0.0, avg: 40.1, max: 75.0) [2024-03-20 23:32:35,522][03784] Avg episode reward: [(0, '0.202')] [2024-03-20 23:32:37,560][04017] Updated weights for policy 0, policy_version 4183 (0.0009) [2024-03-20 23:32:40,521][03784] Fps is (10 sec: 52428.5, 60 sec: 46967.4, 300 sec: 46541.7). Total num frames: 137363456. Throughput: 0: 45411.1. Samples: 138451900. Policy #0 lag: (min: 0.0, avg: 40.7, max: 73.0) [2024-03-20 23:32:40,523][03784] Avg episode reward: [(0, '0.429')] [2024-03-20 23:32:40,787][04017] Updated weights for policy 0, policy_version 4193 (0.0031) [2024-03-20 23:32:45,521][03784] Fps is (10 sec: 58982.1, 60 sec: 49152.1, 300 sec: 47097.1). Total num frames: 137625600. Throughput: 0: 45844.5. Samples: 138600200. 
Policy #0 lag: (min: 0.0, avg: 47.5, max: 107.0) [2024-03-20 23:32:45,522][03784] Avg episode reward: [(0, '0.429')] [2024-03-20 23:32:50,521][03784] Fps is (10 sec: 29491.5, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 137658368. Throughput: 0: 46288.9. Samples: 138896400. Policy #0 lag: (min: 0.0, avg: 47.5, max: 107.0) [2024-03-20 23:32:50,522][03784] Avg episode reward: [(0, '0.314')] [2024-03-20 23:32:51,758][04017] Updated weights for policy 0, policy_version 4203 (0.0016) [2024-03-20 23:32:55,521][03784] Fps is (10 sec: 19660.8, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 137822208. Throughput: 0: 46595.5. Samples: 139182000. Policy #0 lag: (min: 0.0, avg: 42.2, max: 101.0) [2024-03-20 23:32:55,522][03784] Avg episode reward: [(0, '0.283')] [2024-03-20 23:33:00,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 137920512. Throughput: 0: 47006.8. Samples: 139336000. Policy #0 lag: (min: 0.0, avg: 28.3, max: 78.0) [2024-03-20 23:33:00,522][03784] Avg episode reward: [(0, '0.283')] [2024-03-20 23:33:03,384][04017] Updated weights for policy 0, policy_version 4213 (0.0014) [2024-03-20 23:33:05,521][03784] Fps is (10 sec: 32767.2, 60 sec: 44782.8, 300 sec: 45764.1). Total num frames: 138149888. Throughput: 0: 47282.0. Samples: 139602000. Policy #0 lag: (min: 3.0, avg: 38.7, max: 89.0) [2024-03-20 23:33:05,523][03784] Avg episode reward: [(0, '0.429')] [2024-03-20 23:33:09,886][04017] Updated weights for policy 0, policy_version 4223 (0.0018) [2024-03-20 23:33:10,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 138379264. Throughput: 0: 47397.9. Samples: 139872000. Policy #0 lag: (min: 0.0, avg: 29.0, max: 66.0) [2024-03-20 23:33:10,522][03784] Avg episode reward: [(0, '0.248')] [2024-03-20 23:33:15,521][03784] Fps is (10 sec: 49153.4, 60 sec: 44236.9, 300 sec: 46097.4). Total num frames: 138641408. Throughput: 0: 47671.2. Samples: 140015400. 
Policy #0 lag: (min: 1.0, avg: 24.8, max: 51.0) [2024-03-20 23:33:15,522][03784] Avg episode reward: [(0, '0.298')] [2024-03-20 23:33:16,034][04017] Updated weights for policy 0, policy_version 4233 (0.0016) [2024-03-20 23:33:20,468][04017] Updated weights for policy 0, policy_version 4243 (0.0012) [2024-03-20 23:33:20,521][03784] Fps is (10 sec: 65535.5, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 139034624. Throughput: 0: 47262.1. Samples: 140299900. Policy #0 lag: (min: 2.0, avg: 46.1, max: 114.0) [2024-03-20 23:33:20,530][03784] Avg episode reward: [(0, '0.298')] [2024-03-20 23:33:23,211][03995] Signal inference workers to stop experience collection... (2850 times) [2024-03-20 23:33:23,276][04017] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-03-20 23:33:23,462][03995] Signal inference workers to resume experience collection... (2850 times) [2024-03-20 23:33:23,462][04017] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-03-20 23:33:24,074][04017] Updated weights for policy 0, policy_version 4253 (0.0012) [2024-03-20 23:33:25,521][03784] Fps is (10 sec: 85197.2, 60 sec: 49152.1, 300 sec: 47208.2). Total num frames: 139493376. Throughput: 0: 46824.6. Samples: 140559000. Policy #0 lag: (min: 4.0, avg: 37.2, max: 69.0) [2024-03-20 23:33:25,521][03784] Avg episode reward: [(0, '0.298')] [2024-03-20 23:33:27,247][04017] Updated weights for policy 0, policy_version 4263 (0.0013) [2024-03-20 23:33:30,521][03784] Fps is (10 sec: 78643.2, 60 sec: 49698.1, 300 sec: 47652.5). Total num frames: 139821056. Throughput: 0: 46580.0. Samples: 140696300. Policy #0 lag: (min: 0.0, avg: 56.0, max: 105.0) [2024-03-20 23:33:30,522][03784] Avg episode reward: [(0, '0.298')] [2024-03-20 23:33:35,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 139952128. Throughput: 0: 46655.7. Samples: 140995900. 
Policy #0 lag: (min: 0.0, avg: 47.7, max: 102.0) [2024-03-20 23:33:35,521][03784] Avg episode reward: [(0, '0.449')] [2024-03-20 23:33:38,209][04017] Updated weights for policy 0, policy_version 4273 (0.0019) [2024-03-20 23:33:40,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 140083200. Throughput: 0: 46726.6. Samples: 141284700. Policy #0 lag: (min: 0.0, avg: 47.7, max: 102.0) [2024-03-20 23:33:40,522][03784] Avg episode reward: [(0, '0.260')] [2024-03-20 23:33:45,521][03784] Fps is (10 sec: 22937.2, 60 sec: 42598.3, 300 sec: 46430.6). Total num frames: 140181504. Throughput: 0: 46713.2. Samples: 141438100. Policy #0 lag: (min: 0.0, avg: 33.1, max: 72.0) [2024-03-20 23:33:45,522][03784] Avg episode reward: [(0, '0.126')] [2024-03-20 23:33:50,521][03784] Fps is (10 sec: 22937.8, 60 sec: 44236.8, 300 sec: 46430.6). Total num frames: 140312576. Throughput: 0: 47013.6. Samples: 141717600. Policy #0 lag: (min: 0.0, avg: 31.5, max: 74.0) [2024-03-20 23:33:50,522][03784] Avg episode reward: [(0, '0.151')] [2024-03-20 23:33:51,330][04017] Updated weights for policy 0, policy_version 4283 (0.0014) [2024-03-20 23:33:55,521][03784] Fps is (10 sec: 39322.2, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 140574720. Throughput: 0: 47124.5. Samples: 141992600. Policy #0 lag: (min: 0.0, avg: 31.5, max: 74.0) [2024-03-20 23:33:55,522][03784] Avg episode reward: [(0, '0.450')] [2024-03-20 23:33:58,094][04017] Updated weights for policy 0, policy_version 4293 (0.0009) [2024-03-20 23:34:00,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 140771328. Throughput: 0: 47151.1. Samples: 142137200. Policy #0 lag: (min: 0.0, avg: 41.6, max: 95.0) [2024-03-20 23:34:00,521][03784] Avg episode reward: [(0, '0.174')] [2024-03-20 23:34:00,887][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004298_140836864.pth... 
[2024-03-20 23:34:00,959][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000003965_129925120.pth [2024-03-20 23:34:04,348][04017] Updated weights for policy 0, policy_version 4303 (0.0011) [2024-03-20 23:34:05,521][03784] Fps is (10 sec: 52428.5, 60 sec: 49152.2, 300 sec: 45430.9). Total num frames: 141099008. Throughput: 0: 47395.6. Samples: 142432700. Policy #0 lag: (min: 1.0, avg: 23.5, max: 51.0) [2024-03-20 23:34:05,522][03784] Avg episode reward: [(0, '0.406')] [2024-03-20 23:34:09,405][04017] Updated weights for policy 0, policy_version 4313 (0.0012) [2024-03-20 23:34:10,521][03784] Fps is (10 sec: 62258.5, 60 sec: 50244.2, 300 sec: 45653.0). Total num frames: 141393920. Throughput: 0: 47348.8. Samples: 142689700. Policy #0 lag: (min: 0.0, avg: 34.7, max: 84.0) [2024-03-20 23:34:10,522][03784] Avg episode reward: [(0, '0.257')] [2024-03-20 23:34:14,740][04017] Updated weights for policy 0, policy_version 4323 (0.0011) [2024-03-20 23:34:15,521][03784] Fps is (10 sec: 58982.2, 60 sec: 50790.3, 300 sec: 46097.3). Total num frames: 141688832. Throughput: 0: 47568.9. Samples: 142836900. Policy #0 lag: (min: 0.0, avg: 36.2, max: 76.0) [2024-03-20 23:34:15,522][03784] Avg episode reward: [(0, '0.402')] [2024-03-20 23:34:20,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 141754368. Throughput: 0: 47202.1. Samples: 143120000. Policy #0 lag: (min: 0.0, avg: 36.2, max: 76.0) [2024-03-20 23:34:20,522][03784] Avg episode reward: [(0, '0.553')] [2024-03-20 23:34:20,623][03995] Signal inference workers to stop experience collection... (2900 times) [2024-03-20 23:34:20,671][04017] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-03-20 23:34:20,897][03995] Saving new best policy, reward=0.553! [2024-03-20 23:34:20,897][03995] Signal inference workers to resume experience collection... 
(2900 times) [2024-03-20 23:34:20,897][04017] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-03-20 23:34:25,049][04017] Updated weights for policy 0, policy_version 4333 (0.0011) [2024-03-20 23:34:25,521][03784] Fps is (10 sec: 32768.1, 60 sec: 42052.2, 300 sec: 46541.7). Total num frames: 142016512. Throughput: 0: 46873.4. Samples: 143394000. Policy #0 lag: (min: 0.0, avg: 36.5, max: 72.0) [2024-03-20 23:34:25,522][03784] Avg episode reward: [(0, '0.249')] [2024-03-20 23:34:30,521][03784] Fps is (10 sec: 45874.9, 60 sec: 39867.7, 300 sec: 46541.7). Total num frames: 142213120. Throughput: 0: 46140.0. Samples: 143514400. Policy #0 lag: (min: 0.0, avg: 42.8, max: 99.0) [2024-03-20 23:34:30,522][03784] Avg episode reward: [(0, '0.465')] [2024-03-20 23:34:31,676][04017] Updated weights for policy 0, policy_version 4343 (0.0021) [2024-03-20 23:34:35,521][03784] Fps is (10 sec: 36045.0, 60 sec: 40413.8, 300 sec: 46208.4). Total num frames: 142376960. Throughput: 0: 46197.8. Samples: 143796500. Policy #0 lag: (min: 1.0, avg: 45.6, max: 89.0) [2024-03-20 23:34:35,522][03784] Avg episode reward: [(0, '0.427')] [2024-03-20 23:34:39,606][04017] Updated weights for policy 0, policy_version 4353 (0.0011) [2024-03-20 23:34:40,521][03784] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 45986.3). Total num frames: 142639104. Throughput: 0: 45604.4. Samples: 144044800. Policy #0 lag: (min: 0.0, avg: 31.0, max: 70.0) [2024-03-20 23:34:40,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-20 23:34:45,521][03784] Fps is (10 sec: 49152.3, 60 sec: 44783.1, 300 sec: 46208.5). Total num frames: 142868480. Throughput: 0: 45508.9. Samples: 144185100. Policy #0 lag: (min: 0.0, avg: 46.2, max: 114.0) [2024-03-20 23:34:45,522][03784] Avg episode reward: [(0, '0.195')] [2024-03-20 23:34:47,293][04017] Updated weights for policy 0, policy_version 4363 (0.0016) [2024-03-20 23:34:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.4, 300 sec: 45875.2). 
Total num frames: 143130624. Throughput: 0: 45051.1. Samples: 144460000. Policy #0 lag: (min: 0.0, avg: 38.5, max: 79.0) [2024-03-20 23:34:50,522][03784] Avg episode reward: [(0, '0.565')] [2024-03-20 23:34:50,615][03995] Saving new best policy, reward=0.565! [2024-03-20 23:34:52,391][04017] Updated weights for policy 0, policy_version 4373 (0.0013) [2024-03-20 23:34:55,521][03784] Fps is (10 sec: 52428.0, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 143392768. Throughput: 0: 45455.5. Samples: 144735200. Policy #0 lag: (min: 2.0, avg: 57.6, max: 106.0) [2024-03-20 23:34:55,522][03784] Avg episode reward: [(0, '0.333')] [2024-03-20 23:34:58,997][04017] Updated weights for policy 0, policy_version 4383 (0.0011) [2024-03-20 23:35:00,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 143687680. Throughput: 0: 45437.8. Samples: 144881600. Policy #0 lag: (min: 1.0, avg: 35.1, max: 70.0) [2024-03-20 23:35:00,522][03784] Avg episode reward: [(0, '0.370')] [2024-03-20 23:35:04,199][04017] Updated weights for policy 0, policy_version 4393 (0.0010) [2024-03-20 23:35:05,521][03784] Fps is (10 sec: 62259.9, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 144015360. Throughput: 0: 45186.7. Samples: 145153400. Policy #0 lag: (min: 0.0, avg: 41.6, max: 86.0) [2024-03-20 23:35:05,522][03784] Avg episode reward: [(0, '0.498')] [2024-03-20 23:35:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 144179200. Throughput: 0: 45448.9. Samples: 145439200. Policy #0 lag: (min: 0.0, avg: 43.5, max: 106.0) [2024-03-20 23:35:10,522][03784] Avg episode reward: [(0, '0.447')] [2024-03-20 23:35:14,493][04017] Updated weights for policy 0, policy_version 4403 (0.0014) [2024-03-20 23:35:15,521][03784] Fps is (10 sec: 29490.9, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 144310272. Throughput: 0: 46124.4. Samples: 145590000. 
Policy #0 lag: (min: 0.0, avg: 40.0, max: 80.0) [2024-03-20 23:35:15,522][03784] Avg episode reward: [(0, '0.198')] [2024-03-20 23:35:18,607][03995] Signal inference workers to stop experience collection... (2950 times) [2024-03-20 23:35:18,608][03995] Signal inference workers to resume experience collection... (2950 times) [2024-03-20 23:35:18,650][04017] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-03-20 23:35:18,650][04017] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-03-20 23:35:20,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 144474112. Throughput: 0: 46202.2. Samples: 145875600. Policy #0 lag: (min: 1.0, avg: 34.1, max: 70.0) [2024-03-20 23:35:20,522][03784] Avg episode reward: [(0, '0.227')] [2024-03-20 23:35:24,935][04017] Updated weights for policy 0, policy_version 4413 (0.0019) [2024-03-20 23:35:25,521][03784] Fps is (10 sec: 32768.3, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 144637952. Throughput: 0: 47133.4. Samples: 146165800. Policy #0 lag: (min: 1.0, avg: 35.4, max: 78.0) [2024-03-20 23:35:25,522][03784] Avg episode reward: [(0, '0.460')] [2024-03-20 23:35:30,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 144834560. Throughput: 0: 46859.9. Samples: 146293800. Policy #0 lag: (min: 0.0, avg: 36.3, max: 75.0) [2024-03-20 23:35:30,522][03784] Avg episode reward: [(0, '0.611')] [2024-03-20 23:35:30,535][03995] Saving new best policy, reward=0.611! [2024-03-20 23:35:33,145][04017] Updated weights for policy 0, policy_version 4423 (0.0011) [2024-03-20 23:35:35,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 145096704. Throughput: 0: 46368.9. Samples: 146546600. 
Policy #0 lag: (min: 2.0, avg: 32.7, max: 68.0) [2024-03-20 23:35:35,522][03784] Avg episode reward: [(0, '0.402')] [2024-03-20 23:35:39,426][04017] Updated weights for policy 0, policy_version 4433 (0.0013) [2024-03-20 23:35:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 145293312. Throughput: 0: 45357.8. Samples: 146776300. Policy #0 lag: (min: 2.0, avg: 30.7, max: 69.0) [2024-03-20 23:35:40,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-20 23:35:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 145555456. Throughput: 0: 45391.2. Samples: 146924200. Policy #0 lag: (min: 0.0, avg: 59.6, max: 109.0) [2024-03-20 23:35:45,522][03784] Avg episode reward: [(0, '0.212')] [2024-03-20 23:35:45,626][04017] Updated weights for policy 0, policy_version 4443 (0.0015) [2024-03-20 23:35:50,321][04017] Updated weights for policy 0, policy_version 4453 (0.0015) [2024-03-20 23:35:50,521][03784] Fps is (10 sec: 62259.0, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 145915904. Throughput: 0: 44946.6. Samples: 147176000. Policy #0 lag: (min: 1.0, avg: 50.2, max: 109.0) [2024-03-20 23:35:50,522][03784] Avg episode reward: [(0, '0.282')] [2024-03-20 23:35:55,521][03784] Fps is (10 sec: 65536.1, 60 sec: 46967.6, 300 sec: 45875.2). Total num frames: 146210816. Throughput: 0: 44644.5. Samples: 147448200. Policy #0 lag: (min: 0.0, avg: 41.5, max: 76.0) [2024-03-20 23:35:55,522][03784] Avg episode reward: [(0, '0.306')] [2024-03-20 23:35:56,562][04017] Updated weights for policy 0, policy_version 4463 (0.0037) [2024-03-20 23:36:00,365][04017] Updated weights for policy 0, policy_version 4473 (0.0012) [2024-03-20 23:36:00,521][03784] Fps is (10 sec: 65536.2, 60 sec: 48059.8, 300 sec: 46319.5). Total num frames: 146571264. Throughput: 0: 44044.5. Samples: 147572000. 
Policy #0 lag: (min: 0.0, avg: 48.8, max: 117.0) [2024-03-20 23:36:00,530][03784] Avg episode reward: [(0, '0.361')] [2024-03-20 23:36:00,544][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004473_146571264.pth... [2024-03-20 23:36:00,656][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004129_135299072.pth [2024-03-20 23:36:05,041][03995] Signal inference workers to stop experience collection... (3000 times) [2024-03-20 23:36:05,041][03995] Signal inference workers to resume experience collection... (3000 times) [2024-03-20 23:36:05,119][04017] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-03-20 23:36:05,119][04017] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-03-20 23:36:05,521][03784] Fps is (10 sec: 55705.2, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 146767872. Throughput: 0: 44357.8. Samples: 147871700. Policy #0 lag: (min: 0.0, avg: 63.1, max: 114.0) [2024-03-20 23:36:05,531][03784] Avg episode reward: [(0, '0.384')] [2024-03-20 23:36:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44783.0, 300 sec: 46319.5). Total num frames: 146866176. Throughput: 0: 44664.4. Samples: 148175700. Policy #0 lag: (min: 0.0, avg: 43.4, max: 97.0) [2024-03-20 23:36:10,530][03784] Avg episode reward: [(0, '0.358')] [2024-03-20 23:36:11,197][04017] Updated weights for policy 0, policy_version 4483 (0.0018) [2024-03-20 23:36:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 147128320. Throughput: 0: 44851.1. Samples: 148312100. Policy #0 lag: (min: 0.0, avg: 43.4, max: 97.0) [2024-03-20 23:36:15,522][03784] Avg episode reward: [(0, '0.155')] [2024-03-20 23:36:18,217][04017] Updated weights for policy 0, policy_version 4493 (0.0010) [2024-03-20 23:36:20,521][03784] Fps is (10 sec: 45874.9, 60 sec: 47513.6, 300 sec: 46097.3). Total num frames: 147324928. Throughput: 0: 45715.5. Samples: 148603800. 
Policy #0 lag: (min: 0.0, avg: 38.2, max: 73.0) [2024-03-20 23:36:20,530][03784] Avg episode reward: [(0, '0.359')] [2024-03-20 23:36:25,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 147390464. Throughput: 0: 47300.0. Samples: 148904800. Policy #0 lag: (min: 0.0, avg: 36.7, max: 84.0) [2024-03-20 23:36:25,522][03784] Avg episode reward: [(0, '0.302')] [2024-03-20 23:36:30,424][04017] Updated weights for policy 0, policy_version 4503 (0.0012) [2024-03-20 23:36:30,521][03784] Fps is (10 sec: 22937.5, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 147554304. Throughput: 0: 47026.5. Samples: 149040400. Policy #0 lag: (min: 0.0, avg: 28.6, max: 66.0) [2024-03-20 23:36:30,522][03784] Avg episode reward: [(0, '0.282')] [2024-03-20 23:36:35,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.3, 300 sec: 45097.7). Total num frames: 147849216. Throughput: 0: 46986.8. Samples: 149290400. Policy #0 lag: (min: 0.0, avg: 28.8, max: 62.0) [2024-03-20 23:36:35,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-20 23:36:36,653][04017] Updated weights for policy 0, policy_version 4513 (0.0010) [2024-03-20 23:36:40,521][03784] Fps is (10 sec: 55704.7, 60 sec: 46967.2, 300 sec: 45541.9). Total num frames: 148111360. Throughput: 0: 46817.4. Samples: 149555000. Policy #0 lag: (min: 0.0, avg: 62.7, max: 122.0) [2024-03-20 23:36:40,522][03784] Avg episode reward: [(0, '0.641')] [2024-03-20 23:36:40,690][03995] Saving new best policy, reward=0.641! [2024-03-20 23:36:43,392][04017] Updated weights for policy 0, policy_version 4523 (0.0011) [2024-03-20 23:36:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 148307968. Throughput: 0: 47035.6. Samples: 149688600. 
Policy #0 lag: (min: 0.0, avg: 62.7, max: 122.0) [2024-03-20 23:36:45,522][03784] Avg episode reward: [(0, '0.592')] [2024-03-20 23:36:50,088][04017] Updated weights for policy 0, policy_version 4533 (0.0016) [2024-03-20 23:36:50,521][03784] Fps is (10 sec: 45876.3, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 148570112. Throughput: 0: 46526.7. Samples: 149965400. Policy #0 lag: (min: 0.0, avg: 36.9, max: 86.0) [2024-03-20 23:36:50,522][03784] Avg episode reward: [(0, '0.592')] [2024-03-20 23:36:55,016][04017] Updated weights for policy 0, policy_version 4543 (0.0029) [2024-03-20 23:36:55,521][03784] Fps is (10 sec: 58980.9, 60 sec: 44782.7, 300 sec: 46097.3). Total num frames: 148897792. Throughput: 0: 45762.0. Samples: 150235000. Policy #0 lag: (min: 3.0, avg: 39.4, max: 93.0) [2024-03-20 23:36:55,522][03784] Avg episode reward: [(0, '0.211')] [2024-03-20 23:37:00,273][04017] Updated weights for policy 0, policy_version 4553 (0.0015) [2024-03-20 23:37:00,521][03784] Fps is (10 sec: 62259.4, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 149192704. Throughput: 0: 45351.1. Samples: 150352900. Policy #0 lag: (min: 3.0, avg: 60.8, max: 110.0) [2024-03-20 23:37:00,522][03784] Avg episode reward: [(0, '0.312')] [2024-03-20 23:37:01,684][03995] Signal inference workers to stop experience collection... (3050 times) [2024-03-20 23:37:01,755][03995] Signal inference workers to resume experience collection... (3050 times) [2024-03-20 23:37:01,768][04017] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-03-20 23:37:01,805][04017] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-03-20 23:37:03,710][04017] Updated weights for policy 0, policy_version 4563 (0.0011) [2024-03-20 23:37:05,521][03784] Fps is (10 sec: 65537.6, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 149553152. Throughput: 0: 44509.0. Samples: 150606700. 
Policy #0 lag: (min: 1.0, avg: 46.1, max: 114.0) [2024-03-20 23:37:05,522][03784] Avg episode reward: [(0, '0.424')] [2024-03-20 23:37:10,521][03784] Fps is (10 sec: 55705.4, 60 sec: 48059.7, 300 sec: 46652.8). Total num frames: 149749760. Throughput: 0: 44128.8. Samples: 150890600. Policy #0 lag: (min: 0.0, avg: 39.8, max: 90.0) [2024-03-20 23:37:10,522][03784] Avg episode reward: [(0, '0.294')] [2024-03-20 23:37:11,334][04017] Updated weights for policy 0, policy_version 4573 (0.0016) [2024-03-20 23:37:15,521][03784] Fps is (10 sec: 42598.4, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 149979136. Throughput: 0: 44200.2. Samples: 151029400. Policy #0 lag: (min: 0.0, avg: 39.4, max: 79.0) [2024-03-20 23:37:15,522][03784] Avg episode reward: [(0, '0.177')] [2024-03-20 23:37:20,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 150110208. Throughput: 0: 45377.7. Samples: 151332400. Policy #0 lag: (min: 0.0, avg: 39.4, max: 79.0) [2024-03-20 23:37:20,522][03784] Avg episode reward: [(0, '0.399')] [2024-03-20 23:37:24,579][04017] Updated weights for policy 0, policy_version 4583 (0.0013) [2024-03-20 23:37:25,521][03784] Fps is (10 sec: 22937.4, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 150208512. Throughput: 0: 46238.0. Samples: 151635700. Policy #0 lag: (min: 0.0, avg: 30.5, max: 73.0) [2024-03-20 23:37:25,522][03784] Avg episode reward: [(0, '0.596')] [2024-03-20 23:37:30,521][03784] Fps is (10 sec: 29491.0, 60 sec: 47513.7, 300 sec: 45319.8). Total num frames: 150405120. Throughput: 0: 46755.5. Samples: 151792600. Policy #0 lag: (min: 0.0, avg: 30.8, max: 73.0) [2024-03-20 23:37:30,522][03784] Avg episode reward: [(0, '0.239')] [2024-03-20 23:37:31,355][04017] Updated weights for policy 0, policy_version 4593 (0.0011) [2024-03-20 23:37:35,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48059.6, 300 sec: 45319.8). Total num frames: 150732800. Throughput: 0: 46635.5. Samples: 152064000. 
Policy #0 lag: (min: 2.0, avg: 46.8, max: 93.0) [2024-03-20 23:37:35,522][03784] Avg episode reward: [(0, '0.317')] [2024-03-20 23:37:38,681][04017] Updated weights for policy 0, policy_version 4603 (0.0015) [2024-03-20 23:37:40,521][03784] Fps is (10 sec: 55705.1, 60 sec: 47513.7, 300 sec: 45208.7). Total num frames: 150962176. Throughput: 0: 47055.6. Samples: 152352500. Policy #0 lag: (min: 0.0, avg: 33.7, max: 76.0) [2024-03-20 23:37:40,522][03784] Avg episode reward: [(0, '0.317')] [2024-03-20 23:37:42,909][04017] Updated weights for policy 0, policy_version 4613 (0.0014) [2024-03-20 23:37:45,521][03784] Fps is (10 sec: 58982.6, 60 sec: 50244.2, 300 sec: 46319.5). Total num frames: 151322624. Throughput: 0: 47086.7. Samples: 152471800. Policy #0 lag: (min: 2.0, avg: 42.1, max: 84.0) [2024-03-20 23:37:45,522][03784] Avg episode reward: [(0, '0.394')] [2024-03-20 23:37:50,521][03784] Fps is (10 sec: 45875.5, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 151420928. Throughput: 0: 47691.0. Samples: 152752800. Policy #0 lag: (min: 0.0, avg: 37.5, max: 91.0) [2024-03-20 23:37:50,522][03784] Avg episode reward: [(0, '0.329')] [2024-03-20 23:37:51,666][04017] Updated weights for policy 0, policy_version 4623 (0.0012) [2024-03-20 23:37:52,451][03995] Signal inference workers to stop experience collection... (3100 times) [2024-03-20 23:37:52,527][04017] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-03-20 23:37:52,527][03995] Signal inference workers to resume experience collection... (3100 times) [2024-03-20 23:37:52,564][04017] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-03-20 23:37:55,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46421.4, 300 sec: 46652.7). Total num frames: 151683072. Throughput: 0: 47680.0. Samples: 153036200. 
Policy #0 lag: (min: 0.0, avg: 37.5, max: 91.0) [2024-03-20 23:37:55,522][03784] Avg episode reward: [(0, '0.527')] [2024-03-20 23:37:57,701][04017] Updated weights for policy 0, policy_version 4633 (0.0017) [2024-03-20 23:38:00,521][03784] Fps is (10 sec: 58981.9, 60 sec: 46967.4, 300 sec: 46986.0). Total num frames: 152010752. Throughput: 0: 47317.6. Samples: 153158700. Policy #0 lag: (min: 1.0, avg: 34.1, max: 71.0) [2024-03-20 23:38:00,522][03784] Avg episode reward: [(0, '0.203')] [2024-03-20 23:38:00,841][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004640_152043520.pth... [2024-03-20 23:38:00,952][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004298_140836864.pth [2024-03-20 23:38:03,881][04017] Updated weights for policy 0, policy_version 4643 (0.0018) [2024-03-20 23:38:05,521][03784] Fps is (10 sec: 52429.7, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 152207360. Throughput: 0: 46455.6. Samples: 153422900. Policy #0 lag: (min: 1.0, avg: 51.6, max: 105.0) [2024-03-20 23:38:05,522][03784] Avg episode reward: [(0, '0.394')] [2024-03-20 23:38:09,318][04017] Updated weights for policy 0, policy_version 4653 (0.0010) [2024-03-20 23:38:10,521][03784] Fps is (10 sec: 49152.6, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 152502272. Throughput: 0: 45473.4. Samples: 153682000. Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-20 23:38:10,522][03784] Avg episode reward: [(0, '0.555')] [2024-03-20 23:38:15,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 152698880. Throughput: 0: 45531.2. Samples: 153841500. Policy #0 lag: (min: 0.0, avg: 40.8, max: 83.0) [2024-03-20 23:38:15,522][03784] Avg episode reward: [(0, '0.555')] [2024-03-20 23:38:17,893][04017] Updated weights for policy 0, policy_version 4663 (0.0010) [2024-03-20 23:38:20,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46421.2, 300 sec: 45430.9). 
Total num frames: 152895488. Throughput: 0: 46326.6. Samples: 154148700. Policy #0 lag: (min: 2.0, avg: 42.5, max: 94.0) [2024-03-20 23:38:20,522][03784] Avg episode reward: [(0, '0.473')] [2024-03-20 23:38:23,248][04017] Updated weights for policy 0, policy_version 4673 (0.0014) [2024-03-20 23:38:25,521][03784] Fps is (10 sec: 55705.2, 60 sec: 50790.4, 300 sec: 45542.0). Total num frames: 153255936. Throughput: 0: 46244.5. Samples: 154433500. Policy #0 lag: (min: 0.0, avg: 41.1, max: 87.0) [2024-03-20 23:38:25,522][03784] Avg episode reward: [(0, '0.171')] [2024-03-20 23:38:30,521][03784] Fps is (10 sec: 52429.2, 60 sec: 50244.3, 300 sec: 45653.0). Total num frames: 153419776. Throughput: 0: 46940.0. Samples: 154584100. Policy #0 lag: (min: 0.0, avg: 39.0, max: 96.0) [2024-03-20 23:38:30,522][03784] Avg episode reward: [(0, '0.171')] [2024-03-20 23:38:32,137][04017] Updated weights for policy 0, policy_version 4683 (0.0011) [2024-03-20 23:38:35,521][03784] Fps is (10 sec: 49152.8, 60 sec: 50244.4, 300 sec: 46319.5). Total num frames: 153747456. Throughput: 0: 47024.6. Samples: 154868900. Policy #0 lag: (min: 2.0, avg: 38.5, max: 85.0) [2024-03-20 23:38:35,522][03784] Avg episode reward: [(0, '0.422')] [2024-03-20 23:38:36,175][04017] Updated weights for policy 0, policy_version 4693 (0.0016) [2024-03-20 23:38:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48606.0, 300 sec: 46430.6). Total num frames: 153878528. Throughput: 0: 46580.1. Samples: 155132300. Policy #0 lag: (min: 0.0, avg: 29.1, max: 67.0) [2024-03-20 23:38:40,522][03784] Avg episode reward: [(0, '0.555')] [2024-03-20 23:38:45,508][04017] Updated weights for policy 0, policy_version 4703 (0.0013) [2024-03-20 23:38:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 154107904. Throughput: 0: 46973.6. Samples: 155272500. 
Policy #0 lag: (min: 0.0, avg: 27.0, max: 64.0) [2024-03-20 23:38:45,522][03784] Avg episode reward: [(0, '0.551')] [2024-03-20 23:38:50,413][03995] Signal inference workers to stop experience collection... (3150 times) [2024-03-20 23:38:50,485][04017] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-03-20 23:38:50,521][03784] Fps is (10 sec: 26214.2, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 154140672. Throughput: 0: 47679.9. Samples: 155568500. Policy #0 lag: (min: 0.0, avg: 27.0, max: 64.0) [2024-03-20 23:38:50,522][03784] Avg episode reward: [(0, '0.347')] [2024-03-20 23:38:50,710][03995] Signal inference workers to resume experience collection... (3150 times) [2024-03-20 23:38:50,711][04017] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-03-20 23:38:55,521][03784] Fps is (10 sec: 19660.5, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 154304512. Throughput: 0: 47895.5. Samples: 155837300. Policy #0 lag: (min: 0.0, avg: 46.5, max: 104.0) [2024-03-20 23:38:55,522][03784] Avg episode reward: [(0, '0.498')] [2024-03-20 23:38:58,890][04017] Updated weights for policy 0, policy_version 4713 (0.0011) [2024-03-20 23:39:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 40960.1, 300 sec: 45319.8). Total num frames: 154468352. Throughput: 0: 47386.6. Samples: 155973900. Policy #0 lag: (min: 0.0, avg: 31.4, max: 91.0) [2024-03-20 23:39:00,522][03784] Avg episode reward: [(0, '0.269')] [2024-03-20 23:39:04,888][04017] Updated weights for policy 0, policy_version 4723 (0.0014) [2024-03-20 23:39:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 45319.8). Total num frames: 154763264. Throughput: 0: 46266.6. Samples: 156230700. 
Policy #0 lag: (min: 0.0, avg: 32.5, max: 81.0) [2024-03-20 23:39:05,522][03784] Avg episode reward: [(0, '0.508')] [2024-03-20 23:39:09,853][04017] Updated weights for policy 0, policy_version 4733 (0.0015) [2024-03-20 23:39:10,521][03784] Fps is (10 sec: 65536.0, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 155123712. Throughput: 0: 45902.2. Samples: 156499100. Policy #0 lag: (min: 3.0, avg: 40.5, max: 114.0) [2024-03-20 23:39:10,522][03784] Avg episode reward: [(0, '0.540')] [2024-03-20 23:39:14,524][04017] Updated weights for policy 0, policy_version 4743 (0.0011) [2024-03-20 23:39:15,521][03784] Fps is (10 sec: 72090.0, 60 sec: 46421.3, 300 sec: 46541.7). Total num frames: 155484160. Throughput: 0: 45448.9. Samples: 156629300. Policy #0 lag: (min: 0.0, avg: 36.5, max: 72.0) [2024-03-20 23:39:15,522][03784] Avg episode reward: [(0, '0.465')] [2024-03-20 23:39:18,234][04017] Updated weights for policy 0, policy_version 4753 (0.0010) [2024-03-20 23:39:20,521][03784] Fps is (10 sec: 62259.1, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 155746304. Throughput: 0: 45168.7. Samples: 156901500. Policy #0 lag: (min: 2.0, avg: 68.7, max: 121.0) [2024-03-20 23:39:20,522][03784] Avg episode reward: [(0, '0.272')] [2024-03-20 23:39:25,521][03784] Fps is (10 sec: 55706.1, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 156041216. Throughput: 0: 45491.1. Samples: 157179400. Policy #0 lag: (min: 1.0, avg: 38.7, max: 84.0) [2024-03-20 23:39:25,522][03784] Avg episode reward: [(0, '0.563')] [2024-03-20 23:39:26,829][04017] Updated weights for policy 0, policy_version 4763 (0.0015) [2024-03-20 23:39:30,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 156303360. Throughput: 0: 45435.4. Samples: 157317100. 
Policy #0 lag: (min: 1.0, avg: 56.9, max: 126.0) [2024-03-20 23:39:30,522][03784] Avg episode reward: [(0, '0.428')] [2024-03-20 23:39:32,798][04017] Updated weights for policy 0, policy_version 4773 (0.0010) [2024-03-20 23:39:35,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45328.9, 300 sec: 46874.9). Total num frames: 156467200. Throughput: 0: 44877.8. Samples: 157588000. Policy #0 lag: (min: 0.0, avg: 54.4, max: 116.0) [2024-03-20 23:39:35,522][03784] Avg episode reward: [(0, '0.483')] [2024-03-20 23:39:40,521][03784] Fps is (10 sec: 26214.6, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 156565504. Throughput: 0: 45762.3. Samples: 157896600. Policy #0 lag: (min: 0.0, avg: 54.4, max: 116.0) [2024-03-20 23:39:40,522][03784] Avg episode reward: [(0, '0.483')] [2024-03-20 23:39:43,413][03995] Signal inference workers to stop experience collection... (3200 times) [2024-03-20 23:39:43,448][04017] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-03-20 23:39:43,700][03995] Signal inference workers to resume experience collection... (3200 times) [2024-03-20 23:39:43,700][04017] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-03-20 23:39:45,521][03784] Fps is (10 sec: 22937.7, 60 sec: 43144.5, 300 sec: 45986.3). Total num frames: 156696576. Throughput: 0: 46126.7. Samples: 158049600. Policy #0 lag: (min: 0.0, avg: 66.1, max: 119.0) [2024-03-20 23:39:45,522][03784] Avg episode reward: [(0, '0.657')] [2024-03-20 23:39:45,523][03995] Saving new best policy, reward=0.657! [2024-03-20 23:39:47,596][04017] Updated weights for policy 0, policy_version 4783 (0.0021) [2024-03-20 23:39:50,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 156827648. Throughput: 0: 46602.4. Samples: 158327800. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 73.0) [2024-03-20 23:39:50,522][03784] Avg episode reward: [(0, '0.526')] [2024-03-20 23:39:53,218][04017] Updated weights for policy 0, policy_version 4793 (0.0013) [2024-03-20 23:39:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 157122560. Throughput: 0: 46544.5. Samples: 158593600. Policy #0 lag: (min: 1.0, avg: 34.3, max: 79.0) [2024-03-20 23:39:55,522][03784] Avg episode reward: [(0, '0.223')] [2024-03-20 23:39:59,845][04017] Updated weights for policy 0, policy_version 4803 (0.0016) [2024-03-20 23:40:00,521][03784] Fps is (10 sec: 58981.9, 60 sec: 49152.0, 300 sec: 45430.9). Total num frames: 157417472. Throughput: 0: 46711.1. Samples: 158731300. Policy #0 lag: (min: 2.0, avg: 45.4, max: 95.0) [2024-03-20 23:40:00,522][03784] Avg episode reward: [(0, '0.514')] [2024-03-20 23:40:00,592][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004805_157450240.pth... [2024-03-20 23:40:00,699][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004473_146571264.pth [2024-03-20 23:40:04,394][04017] Updated weights for policy 0, policy_version 4813 (0.0011) [2024-03-20 23:40:05,521][03784] Fps is (10 sec: 62259.5, 60 sec: 49698.3, 300 sec: 45986.3). Total num frames: 157745152. Throughput: 0: 46011.2. Samples: 158972000. Policy #0 lag: (min: 0.0, avg: 33.3, max: 109.0) [2024-03-20 23:40:05,522][03784] Avg episode reward: [(0, '0.398')] [2024-03-20 23:40:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 157843456. Throughput: 0: 45875.5. Samples: 159243800. Policy #0 lag: (min: 0.0, avg: 34.4, max: 72.0) [2024-03-20 23:40:10,522][03784] Avg episode reward: [(0, '0.441')] [2024-03-20 23:40:14,423][04017] Updated weights for policy 0, policy_version 4823 (0.0011) [2024-03-20 23:40:15,521][03784] Fps is (10 sec: 39321.2, 60 sec: 44236.8, 300 sec: 46319.5). 
Total num frames: 158138368. Throughput: 0: 45528.9. Samples: 159365900. Policy #0 lag: (min: 3.0, avg: 34.0, max: 67.0) [2024-03-20 23:40:15,522][03784] Avg episode reward: [(0, '0.647')] [2024-03-20 23:40:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 40960.0, 300 sec: 45986.3). Total num frames: 158203904. Throughput: 0: 45024.5. Samples: 159614100. Policy #0 lag: (min: 3.0, avg: 34.0, max: 67.0) [2024-03-20 23:40:20,522][03784] Avg episode reward: [(0, '0.647')] [2024-03-20 23:40:22,796][04017] Updated weights for policy 0, policy_version 4833 (0.0013) [2024-03-20 23:40:25,521][03784] Fps is (10 sec: 29491.4, 60 sec: 39867.7, 300 sec: 46097.4). Total num frames: 158433280. Throughput: 0: 44124.4. Samples: 159882200. Policy #0 lag: (min: 0.0, avg: 51.0, max: 109.0) [2024-03-20 23:40:25,522][03784] Avg episode reward: [(0, '0.224')] [2024-03-20 23:40:29,759][04017] Updated weights for policy 0, policy_version 4843 (0.0020) [2024-03-20 23:40:30,521][03784] Fps is (10 sec: 55705.6, 60 sec: 40960.0, 300 sec: 46319.5). Total num frames: 158760960. Throughput: 0: 43982.2. Samples: 160028800. Policy #0 lag: (min: 0.0, avg: 39.2, max: 120.0) [2024-03-20 23:40:30,522][03784] Avg episode reward: [(0, '0.526')] [2024-03-20 23:40:34,458][03995] Signal inference workers to stop experience collection... (3250 times) [2024-03-20 23:40:34,535][03995] Signal inference workers to resume experience collection... (3250 times) [2024-03-20 23:40:34,567][04017] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-03-20 23:40:34,610][04017] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-03-20 23:40:35,521][03784] Fps is (10 sec: 52428.4, 60 sec: 41506.1, 300 sec: 46319.5). Total num frames: 158957568. Throughput: 0: 43624.4. Samples: 160290900. 
Policy #0 lag: (min: 0.0, avg: 58.3, max: 103.0) [2024-03-20 23:40:35,522][03784] Avg episode reward: [(0, '0.313')] [2024-03-20 23:40:36,100][04017] Updated weights for policy 0, policy_version 4853 (0.0012) [2024-03-20 23:40:40,521][03784] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 46208.4). Total num frames: 159186944. Throughput: 0: 44248.8. Samples: 160584800. Policy #0 lag: (min: 0.0, avg: 38.1, max: 97.0) [2024-03-20 23:40:40,522][03784] Avg episode reward: [(0, '0.391')] [2024-03-20 23:40:41,770][04017] Updated weights for policy 0, policy_version 4863 (0.0017) [2024-03-20 23:40:45,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 159481856. Throughput: 0: 43908.9. Samples: 160707200. Policy #0 lag: (min: 1.0, avg: 58.7, max: 96.0) [2024-03-20 23:40:45,522][03784] Avg episode reward: [(0, '0.472')] [2024-03-20 23:40:47,694][04017] Updated weights for policy 0, policy_version 4873 (0.0010) [2024-03-20 23:40:50,521][03784] Fps is (10 sec: 65536.3, 60 sec: 50244.2, 300 sec: 46208.4). Total num frames: 159842304. Throughput: 0: 44902.1. Samples: 160992600. Policy #0 lag: (min: 1.0, avg: 58.7, max: 96.0) [2024-03-20 23:40:50,522][03784] Avg episode reward: [(0, '0.519')] [2024-03-20 23:40:52,663][04017] Updated weights for policy 0, policy_version 4883 (0.0016) [2024-03-20 23:40:55,521][03784] Fps is (10 sec: 58982.5, 60 sec: 49152.0, 300 sec: 45764.1). Total num frames: 160071680. Throughput: 0: 45246.6. Samples: 161279900. Policy #0 lag: (min: 1.0, avg: 32.6, max: 83.0) [2024-03-20 23:40:55,522][03784] Avg episode reward: [(0, '0.276')] [2024-03-20 23:41:00,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 160301056. Throughput: 0: 45617.8. Samples: 161418700. 
Policy #0 lag: (min: 0.0, avg: 41.0, max: 109.0) [2024-03-20 23:41:00,522][03784] Avg episode reward: [(0, '0.276')] [2024-03-20 23:41:00,554][04017] Updated weights for policy 0, policy_version 4893 (0.0015) [2024-03-20 23:41:05,521][03784] Fps is (10 sec: 36045.2, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 160432128. Throughput: 0: 45960.1. Samples: 161682300. Policy #0 lag: (min: 0.0, avg: 40.0, max: 79.0) [2024-03-20 23:41:05,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-20 23:41:10,521][03784] Fps is (10 sec: 19660.8, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 160497664. Throughput: 0: 46362.2. Samples: 161968500. Policy #0 lag: (min: 1.0, avg: 35.5, max: 72.0) [2024-03-20 23:41:10,522][03784] Avg episode reward: [(0, '0.520')] [2024-03-20 23:41:15,184][04017] Updated weights for policy 0, policy_version 4903 (0.0014) [2024-03-20 23:41:15,521][03784] Fps is (10 sec: 22937.5, 60 sec: 42052.3, 300 sec: 45208.7). Total num frames: 160661504. Throughput: 0: 45615.6. Samples: 162081500. Policy #0 lag: (min: 0.0, avg: 29.9, max: 73.0) [2024-03-20 23:41:15,522][03784] Avg episode reward: [(0, '0.533')] [2024-03-20 23:41:20,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 160923648. Throughput: 0: 45120.0. Samples: 162321300. Policy #0 lag: (min: 0.0, avg: 31.9, max: 75.0) [2024-03-20 23:41:20,522][03784] Avg episode reward: [(0, '0.332')] [2024-03-20 23:41:21,729][04017] Updated weights for policy 0, policy_version 4913 (0.0014) [2024-03-20 23:41:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 161120256. Throughput: 0: 44640.1. Samples: 162593600. Policy #0 lag: (min: 0.0, avg: 31.9, max: 75.0) [2024-03-20 23:41:25,522][03784] Avg episode reward: [(0, '0.185')] [2024-03-20 23:41:28,888][04017] Updated weights for policy 0, policy_version 4923 (0.0012) [2024-03-20 23:41:30,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 45653.0). 
Total num frames: 161316864. Throughput: 0: 45088.9. Samples: 162736200. Policy #0 lag: (min: 0.0, avg: 28.5, max: 69.0) [2024-03-20 23:41:30,522][03784] Avg episode reward: [(0, '0.185')] [2024-03-20 23:41:35,521][03784] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 45653.1). Total num frames: 161579008. Throughput: 0: 44817.8. Samples: 163009400. Policy #0 lag: (min: 0.0, avg: 57.3, max: 105.0) [2024-03-20 23:41:35,522][03784] Avg episode reward: [(0, '0.815')] [2024-03-20 23:41:35,806][03995] Saving new best policy, reward=0.815! [2024-03-20 23:41:36,116][03995] Signal inference workers to stop experience collection... (3300 times) [2024-03-20 23:41:36,185][03995] Signal inference workers to resume experience collection... (3300 times) [2024-03-20 23:41:36,221][04017] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-03-20 23:41:36,223][04017] Updated weights for policy 0, policy_version 4933 (0.0010) [2024-03-20 23:41:36,266][04017] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-03-20 23:41:40,422][04017] Updated weights for policy 0, policy_version 4943 (0.0011) [2024-03-20 23:41:40,521][03784] Fps is (10 sec: 65535.3, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 161972224. Throughput: 0: 44462.1. Samples: 163280700. Policy #0 lag: (min: 5.0, avg: 40.8, max: 85.0) [2024-03-20 23:41:40,522][03784] Avg episode reward: [(0, '0.651')] [2024-03-20 23:41:44,543][04017] Updated weights for policy 0, policy_version 4953 (0.0015) [2024-03-20 23:41:45,521][03784] Fps is (10 sec: 78643.4, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 162365440. Throughput: 0: 44362.3. Samples: 163415000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-20 23:41:45,522][03784] Avg episode reward: [(0, '0.453')] [2024-03-20 23:41:47,860][04017] Updated weights for policy 0, policy_version 4963 (0.0011) [2024-03-20 23:41:50,521][03784] Fps is (10 sec: 72090.5, 60 sec: 47513.7, 300 sec: 46763.9). 
Total num frames: 162693120. Throughput: 0: 44375.5. Samples: 163679200. Policy #0 lag: (min: 2.0, avg: 52.5, max: 92.0) [2024-03-20 23:41:50,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-20 23:41:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 162791424. Throughput: 0: 45057.8. Samples: 163996100. Policy #0 lag: (min: 2.0, avg: 52.5, max: 92.0) [2024-03-20 23:41:55,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-20 23:42:00,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 162955264. Throughput: 0: 45891.1. Samples: 164146600. Policy #0 lag: (min: 0.0, avg: 42.5, max: 76.0) [2024-03-20 23:42:00,522][03784] Avg episode reward: [(0, '0.201')] [2024-03-20 23:42:00,522][04017] Updated weights for policy 0, policy_version 4973 (0.0019) [2024-03-20 23:42:00,856][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004974_162988032.pth... [2024-03-20 23:42:00,997][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004640_152043520.pth [2024-03-20 23:42:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 163250176. Throughput: 0: 46433.3. Samples: 164410800. Policy #0 lag: (min: 0.0, avg: 41.3, max: 83.0) [2024-03-20 23:42:05,522][03784] Avg episode reward: [(0, '0.591')] [2024-03-20 23:42:06,706][04017] Updated weights for policy 0, policy_version 4983 (0.0012) [2024-03-20 23:42:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 45542.0). Total num frames: 163414016. Throughput: 0: 46235.6. Samples: 164674200. Policy #0 lag: (min: 1.0, avg: 38.6, max: 79.0) [2024-03-20 23:42:10,522][03784] Avg episode reward: [(0, '0.489')] [2024-03-20 23:42:15,521][03784] Fps is (10 sec: 26214.4, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 163512320. Throughput: 0: 46655.5. Samples: 164835700. 
Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-20 23:42:15,522][03784] Avg episode reward: [(0, '0.489')] [2024-03-20 23:42:17,133][04017] Updated weights for policy 0, policy_version 4993 (0.0017) [2024-03-20 23:42:18,858][03995] Signal inference workers to stop experience collection... (3350 times) [2024-03-20 23:42:18,911][04017] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-03-20 23:42:19,081][03995] Signal inference workers to resume experience collection... (3350 times) [2024-03-20 23:42:19,082][04017] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-03-20 23:42:20,521][03784] Fps is (10 sec: 42598.3, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 163840000. Throughput: 0: 46160.0. Samples: 165086600. Policy #0 lag: (min: 4.0, avg: 51.0, max: 101.0) [2024-03-20 23:42:20,522][03784] Avg episode reward: [(0, '0.529')] [2024-03-20 23:42:25,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 163872768. Throughput: 0: 47060.1. Samples: 165398400. Policy #0 lag: (min: 0.0, avg: 21.5, max: 61.0) [2024-03-20 23:42:25,531][03784] Avg episode reward: [(0, '0.529')] [2024-03-20 23:42:26,265][04017] Updated weights for policy 0, policy_version 5003 (0.0011) [2024-03-20 23:42:30,521][03784] Fps is (10 sec: 26214.2, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 164102144. Throughput: 0: 47199.9. Samples: 165539000. Policy #0 lag: (min: 0.0, avg: 21.5, max: 61.0) [2024-03-20 23:42:30,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-20 23:42:32,086][04017] Updated weights for policy 0, policy_version 5013 (0.0024) [2024-03-20 23:42:35,521][03784] Fps is (10 sec: 62259.7, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 164495360. Throughput: 0: 47455.6. Samples: 165814700. 
Policy #0 lag: (min: 3.0, avg: 22.4, max: 90.0) [2024-03-20 23:42:35,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-20 23:42:36,247][04017] Updated weights for policy 0, policy_version 5023 (0.0023) [2024-03-20 23:42:40,521][03784] Fps is (10 sec: 78643.5, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 164888576. Throughput: 0: 46217.8. Samples: 166075900. Policy #0 lag: (min: 3.0, avg: 44.6, max: 109.0) [2024-03-20 23:42:40,522][03784] Avg episode reward: [(0, '0.674')] [2024-03-20 23:42:42,752][04017] Updated weights for policy 0, policy_version 5033 (0.0015) [2024-03-20 23:42:45,521][03784] Fps is (10 sec: 62259.1, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 165117952. Throughput: 0: 46093.4. Samples: 166220800. Policy #0 lag: (min: 0.0, avg: 43.9, max: 93.0) [2024-03-20 23:42:45,522][03784] Avg episode reward: [(0, '0.240')] [2024-03-20 23:42:47,131][04017] Updated weights for policy 0, policy_version 5043 (0.0028) [2024-03-20 23:42:50,521][03784] Fps is (10 sec: 55705.6, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 165445632. Throughput: 0: 46304.5. Samples: 166494500. Policy #0 lag: (min: 0.0, avg: 43.5, max: 90.0) [2024-03-20 23:42:50,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-20 23:42:53,471][04017] Updated weights for policy 0, policy_version 5053 (0.0015) [2024-03-20 23:42:55,521][03784] Fps is (10 sec: 55705.8, 60 sec: 48059.8, 300 sec: 46319.5). Total num frames: 165675008. Throughput: 0: 46873.3. Samples: 166783500. Policy #0 lag: (min: 0.0, avg: 50.0, max: 99.0) [2024-03-20 23:42:55,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-20 23:43:00,521][03784] Fps is (10 sec: 36045.0, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 165806080. Throughput: 0: 46462.3. Samples: 166926500. 
Policy #0 lag: (min: 0.0, avg: 50.0, max: 99.0) [2024-03-20 23:43:00,522][03784] Avg episode reward: [(0, '0.435')] [2024-03-20 23:43:02,981][04017] Updated weights for policy 0, policy_version 5063 (0.0016) [2024-03-20 23:43:05,521][03784] Fps is (10 sec: 36044.4, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 166035456. Throughput: 0: 47131.0. Samples: 167207500. Policy #0 lag: (min: 0.0, avg: 37.1, max: 82.0) [2024-03-20 23:43:05,522][03784] Avg episode reward: [(0, '0.421')] [2024-03-20 23:43:10,120][04017] Updated weights for policy 0, policy_version 5073 (0.0015) [2024-03-20 23:43:10,521][03784] Fps is (10 sec: 42598.1, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 166232064. Throughput: 0: 46497.8. Samples: 167490800. Policy #0 lag: (min: 0.0, avg: 47.5, max: 96.0) [2024-03-20 23:43:10,522][03784] Avg episode reward: [(0, '0.191')] [2024-03-20 23:43:11,166][03995] Signal inference workers to stop experience collection... (3400 times) [2024-03-20 23:43:11,236][03995] Signal inference workers to resume experience collection... (3400 times) [2024-03-20 23:43:11,242][04017] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-03-20 23:43:11,293][04017] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-03-20 23:43:15,521][03784] Fps is (10 sec: 45875.6, 60 sec: 49698.2, 300 sec: 46097.4). Total num frames: 166494208. Throughput: 0: 46520.1. Samples: 167632400. Policy #0 lag: (min: 2.0, avg: 40.2, max: 85.0) [2024-03-20 23:43:15,522][03784] Avg episode reward: [(0, '0.191')] [2024-03-20 23:43:19,392][04017] Updated weights for policy 0, policy_version 5083 (0.0011) [2024-03-20 23:43:20,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46421.2, 300 sec: 45319.8). Total num frames: 166625280. Throughput: 0: 47344.3. Samples: 167945200. 
Policy #0 lag: (min: 0.0, avg: 29.8, max: 79.0) [2024-03-20 23:43:20,522][03784] Avg episode reward: [(0, '0.413')] [2024-03-20 23:43:25,521][03784] Fps is (10 sec: 29491.1, 60 sec: 48605.9, 300 sec: 45319.8). Total num frames: 166789120. Throughput: 0: 47617.8. Samples: 168218700. Policy #0 lag: (min: 0.0, avg: 33.4, max: 80.0) [2024-03-20 23:43:25,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-20 23:43:27,036][04017] Updated weights for policy 0, policy_version 5093 (0.0043) [2024-03-20 23:43:30,521][03784] Fps is (10 sec: 42598.8, 60 sec: 49152.0, 300 sec: 45097.6). Total num frames: 167051264. Throughput: 0: 47293.3. Samples: 168349000. Policy #0 lag: (min: 0.0, avg: 33.4, max: 80.0) [2024-03-20 23:43:30,522][03784] Avg episode reward: [(0, '0.299')] [2024-03-20 23:43:34,295][04017] Updated weights for policy 0, policy_version 5103 (0.0016) [2024-03-20 23:43:35,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 167280640. Throughput: 0: 47162.2. Samples: 168616800. Policy #0 lag: (min: 0.0, avg: 35.5, max: 80.0) [2024-03-20 23:43:35,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-20 23:43:39,915][04017] Updated weights for policy 0, policy_version 5113 (0.0011) [2024-03-20 23:43:40,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 167575552. Throughput: 0: 46584.4. Samples: 168879800. Policy #0 lag: (min: 0.0, avg: 32.8, max: 78.0) [2024-03-20 23:43:40,522][03784] Avg episode reward: [(0, '0.587')] [2024-03-20 23:43:45,029][04017] Updated weights for policy 0, policy_version 5123 (0.0019) [2024-03-20 23:43:45,521][03784] Fps is (10 sec: 58982.3, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 167870464. Throughput: 0: 46184.3. Samples: 169004800. Policy #0 lag: (min: 2.0, avg: 47.6, max: 99.0) [2024-03-20 23:43:45,522][03784] Avg episode reward: [(0, '0.577')] [2024-03-20 23:43:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 46430.6). 
Total num frames: 168001536. Throughput: 0: 45873.4. Samples: 169271800. Policy #0 lag: (min: 0.0, avg: 39.8, max: 91.0) [2024-03-20 23:43:50,522][03784] Avg episode reward: [(0, '0.481')] [2024-03-20 23:43:53,216][04017] Updated weights for policy 0, policy_version 5133 (0.0012) [2024-03-20 23:43:55,521][03784] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 46874.9). Total num frames: 168296448. Throughput: 0: 45502.2. Samples: 169538400. Policy #0 lag: (min: 0.0, avg: 39.8, max: 91.0) [2024-03-20 23:43:55,522][03784] Avg episode reward: [(0, '0.566')] [2024-03-20 23:43:59,900][04017] Updated weights for policy 0, policy_version 5143 (0.0016) [2024-03-20 23:44:00,521][03784] Fps is (10 sec: 52428.2, 60 sec: 45328.9, 300 sec: 46652.7). Total num frames: 168525824. Throughput: 0: 45399.9. Samples: 169675400. Policy #0 lag: (min: 1.0, avg: 34.3, max: 67.0) [2024-03-20 23:44:00,522][03784] Avg episode reward: [(0, '0.305')] [2024-03-20 23:44:00,674][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005144_168558592.pth... [2024-03-20 23:44:00,819][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004805_157450240.pth [2024-03-20 23:44:05,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 45986.3). Total num frames: 168689664. Throughput: 0: 44331.2. Samples: 169940100. Policy #0 lag: (min: 1.0, avg: 36.4, max: 71.0) [2024-03-20 23:44:05,522][03784] Avg episode reward: [(0, '0.554')] [2024-03-20 23:44:09,625][04017] Updated weights for policy 0, policy_version 5153 (0.0011) [2024-03-20 23:44:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.7, 300 sec: 45430.9). Total num frames: 168886272. Throughput: 0: 44131.0. Samples: 170204600. Policy #0 lag: (min: 0.0, avg: 46.6, max: 100.0) [2024-03-20 23:44:10,522][03784] Avg episode reward: [(0, '0.313')] [2024-03-20 23:44:11,069][03995] Signal inference workers to stop experience collection... 
(3450 times) [2024-03-20 23:44:11,139][03995] Signal inference workers to resume experience collection... (3450 times) [2024-03-20 23:44:11,148][04017] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-03-20 23:44:11,193][04017] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-03-20 23:44:15,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42052.3, 300 sec: 44986.6). Total num frames: 169017344. Throughput: 0: 44480.1. Samples: 170350600. Policy #0 lag: (min: 0.0, avg: 37.0, max: 75.0) [2024-03-20 23:44:15,522][03784] Avg episode reward: [(0, '0.224')] [2024-03-20 23:44:16,891][04017] Updated weights for policy 0, policy_version 5163 (0.0012) [2024-03-20 23:44:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 45097.6). Total num frames: 169345024. Throughput: 0: 45068.8. Samples: 170644900. Policy #0 lag: (min: 0.0, avg: 37.0, max: 75.0) [2024-03-20 23:44:20,522][03784] Avg episode reward: [(0, '0.224')] [2024-03-20 23:44:21,996][04017] Updated weights for policy 0, policy_version 5173 (0.0009) [2024-03-20 23:44:25,521][03784] Fps is (10 sec: 55705.1, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 169574400. Throughput: 0: 45235.5. Samples: 170915400. Policy #0 lag: (min: 0.0, avg: 36.7, max: 111.0) [2024-03-20 23:44:25,522][03784] Avg episode reward: [(0, '0.512')] [2024-03-20 23:44:29,260][04017] Updated weights for policy 0, policy_version 5183 (0.0013) [2024-03-20 23:44:30,521][03784] Fps is (10 sec: 55706.2, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 169902080. Throughput: 0: 45042.3. Samples: 171031700. Policy #0 lag: (min: 1.0, avg: 37.8, max: 78.0) [2024-03-20 23:44:30,522][03784] Avg episode reward: [(0, '0.522')] [2024-03-20 23:44:35,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 170065920. Throughput: 0: 45299.9. Samples: 171310300. 
Policy #0 lag: (min: 0.0, avg: 61.6, max: 117.0) [2024-03-20 23:44:35,522][03784] Avg episode reward: [(0, '0.545')] [2024-03-20 23:44:38,662][04017] Updated weights for policy 0, policy_version 5193 (0.0009) [2024-03-20 23:44:40,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 170229760. Throughput: 0: 44537.8. Samples: 171542600. Policy #0 lag: (min: 0.0, avg: 32.9, max: 75.0) [2024-03-20 23:44:40,522][03784] Avg episode reward: [(0, '0.652')] [2024-03-20 23:44:45,521][03784] Fps is (10 sec: 32768.1, 60 sec: 42052.3, 300 sec: 45986.3). Total num frames: 170393600. Throughput: 0: 44557.8. Samples: 171680500. Policy #0 lag: (min: 0.0, avg: 32.7, max: 71.0) [2024-03-20 23:44:45,522][03784] Avg episode reward: [(0, '0.561')] [2024-03-20 23:44:48,232][04017] Updated weights for policy 0, policy_version 5203 (0.0011) [2024-03-20 23:44:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43144.5, 300 sec: 45653.0). Total num frames: 170590208. Throughput: 0: 44644.4. Samples: 171949100. Policy #0 lag: (min: 0.0, avg: 55.9, max: 113.0) [2024-03-20 23:44:50,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-20 23:44:55,345][04017] Updated weights for policy 0, policy_version 5213 (0.0017) [2024-03-20 23:44:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 170819584. Throughput: 0: 44266.7. Samples: 172196600. Policy #0 lag: (min: 0.0, avg: 29.1, max: 66.0) [2024-03-20 23:44:55,522][03784] Avg episode reward: [(0, '0.272')] [2024-03-20 23:45:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 44875.5). Total num frames: 170983424. Throughput: 0: 43977.6. Samples: 172329600. Policy #0 lag: (min: 0.0, avg: 26.2, max: 67.0) [2024-03-20 23:45:00,522][03784] Avg episode reward: [(0, '0.244')] [2024-03-20 23:45:02,713][04017] Updated weights for policy 0, policy_version 5223 (0.0015) [2024-03-20 23:45:05,521][03784] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 45430.9). 
Total num frames: 171245568. Throughput: 0: 43309.0. Samples: 172593800. Policy #0 lag: (min: 0.0, avg: 26.2, max: 67.0) [2024-03-20 23:45:05,522][03784] Avg episode reward: [(0, '0.446')] [2024-03-20 23:45:09,619][04017] Updated weights for policy 0, policy_version 5233 (0.0019) [2024-03-20 23:45:10,521][03784] Fps is (10 sec: 55706.2, 60 sec: 44236.9, 300 sec: 45430.9). Total num frames: 171540480. Throughput: 0: 43102.3. Samples: 172855000. Policy #0 lag: (min: 1.0, avg: 29.9, max: 59.0) [2024-03-20 23:45:10,522][03784] Avg episode reward: [(0, '0.328')] [2024-03-20 23:45:10,549][03995] Signal inference workers to stop experience collection... (3500 times) [2024-03-20 23:45:10,623][03995] Signal inference workers to resume experience collection... (3500 times) [2024-03-20 23:45:10,630][04017] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-03-20 23:45:10,710][04017] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-03-20 23:45:13,354][04017] Updated weights for policy 0, policy_version 5243 (0.0013) [2024-03-20 23:45:15,521][03784] Fps is (10 sec: 65535.7, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 171900928. Throughput: 0: 43104.4. Samples: 172971400. Policy #0 lag: (min: 2.0, avg: 35.5, max: 80.0) [2024-03-20 23:45:15,530][03784] Avg episode reward: [(0, '0.669')] [2024-03-20 23:45:20,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 172064768. Throughput: 0: 43242.3. Samples: 173256200. Policy #0 lag: (min: 0.0, avg: 65.5, max: 120.0) [2024-03-20 23:45:20,522][03784] Avg episode reward: [(0, '0.742')] [2024-03-20 23:45:22,496][04017] Updated weights for policy 0, policy_version 5253 (0.0010) [2024-03-20 23:45:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 172326912. Throughput: 0: 43857.7. Samples: 173516200. 
Policy #0 lag: (min: 1.0, avg: 38.2, max: 81.0) [2024-03-20 23:45:25,522][03784] Avg episode reward: [(0, '0.314')] [2024-03-20 23:45:28,484][04017] Updated weights for policy 0, policy_version 5263 (0.0013) [2024-03-20 23:45:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 45986.3). Total num frames: 172523520. Throughput: 0: 43986.7. Samples: 173659900. Policy #0 lag: (min: 1.0, avg: 38.2, max: 81.0) [2024-03-20 23:45:30,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-20 23:45:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 45764.1). Total num frames: 172687360. Throughput: 0: 44435.5. Samples: 173948700. Policy #0 lag: (min: 0.0, avg: 46.9, max: 94.0) [2024-03-20 23:45:35,522][03784] Avg episode reward: [(0, '0.189')] [2024-03-20 23:45:37,801][04017] Updated weights for policy 0, policy_version 5273 (0.0010) [2024-03-20 23:45:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 172982272. Throughput: 0: 45237.9. Samples: 174232300. Policy #0 lag: (min: 4.0, avg: 40.5, max: 85.0) [2024-03-20 23:45:40,522][03784] Avg episode reward: [(0, '0.189')] [2024-03-20 23:45:45,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 173080576. Throughput: 0: 45369.0. Samples: 174371200. Policy #0 lag: (min: 4.0, avg: 40.5, max: 85.0) [2024-03-20 23:45:45,522][03784] Avg episode reward: [(0, '0.189')] [2024-03-20 23:45:46,108][04017] Updated weights for policy 0, policy_version 5283 (0.0021) [2024-03-20 23:45:50,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 173244416. Throughput: 0: 45517.7. Samples: 174642100. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-20 23:45:50,522][03784] Avg episode reward: [(0, '0.228')] [2024-03-20 23:45:52,740][04017] Updated weights for policy 0, policy_version 5293 (0.0011) [2024-03-20 23:45:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 44875.5). 
Total num frames: 173539328. Throughput: 0: 45508.9. Samples: 174902900. Policy #0 lag: (min: 0.0, avg: 26.6, max: 75.0) [2024-03-20 23:45:55,522][03784] Avg episode reward: [(0, '0.473')] [2024-03-20 23:45:58,753][04017] Updated weights for policy 0, policy_version 5303 (0.0020) [2024-03-20 23:46:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 173768704. Throughput: 0: 45977.7. Samples: 175040400. Policy #0 lag: (min: 1.0, avg: 30.2, max: 66.0) [2024-03-20 23:46:00,522][03784] Avg episode reward: [(0, '0.473')] [2024-03-20 23:46:00,543][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005303_173768704.pth... [2024-03-20 23:46:00,712][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000004974_162988032.pth [2024-03-20 23:46:05,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 173998080. Throughput: 0: 45355.6. Samples: 175297200. Policy #0 lag: (min: 0.0, avg: 37.6, max: 105.0) [2024-03-20 23:46:05,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-20 23:46:08,525][04017] Updated weights for policy 0, policy_version 5313 (0.0012) [2024-03-20 23:46:08,593][03995] Signal inference workers to stop experience collection... (3550 times) [2024-03-20 23:46:08,711][04017] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-03-20 23:46:08,798][03995] Signal inference workers to resume experience collection... (3550 times) [2024-03-20 23:46:08,799][04017] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-03-20 23:46:10,521][03784] Fps is (10 sec: 49153.0, 60 sec: 45329.2, 300 sec: 46097.4). Total num frames: 174260224. Throughput: 0: 44909.1. Samples: 175537100. Policy #0 lag: (min: 1.0, avg: 56.3, max: 110.0) [2024-03-20 23:46:10,522][03784] Avg episode reward: [(0, '0.556')] [2024-03-20 23:46:15,521][03784] Fps is (10 sec: 36044.7, 60 sec: 40960.0, 300 sec: 45542.0). 
Total num frames: 174358528. Throughput: 0: 45042.2. Samples: 175686800. Policy #0 lag: (min: 1.0, avg: 33.8, max: 67.0) [2024-03-20 23:46:15,522][03784] Avg episode reward: [(0, '0.845')] [2024-03-20 23:46:15,657][03995] Saving new best policy, reward=0.845! [2024-03-20 23:46:16,163][04017] Updated weights for policy 0, policy_version 5323 (0.0011) [2024-03-20 23:46:20,521][03784] Fps is (10 sec: 36044.4, 60 sec: 42598.4, 300 sec: 45764.1). Total num frames: 174620672. Throughput: 0: 44633.4. Samples: 175957200. Policy #0 lag: (min: 0.0, avg: 39.3, max: 86.0) [2024-03-20 23:46:20,522][03784] Avg episode reward: [(0, '0.948')] [2024-03-20 23:46:20,648][03995] Saving new best policy, reward=0.948! [2024-03-20 23:46:23,756][04017] Updated weights for policy 0, policy_version 5333 (0.0012) [2024-03-20 23:46:25,521][03784] Fps is (10 sec: 52427.4, 60 sec: 42598.3, 300 sec: 45986.2). Total num frames: 174882816. Throughput: 0: 44553.1. Samples: 176237200. Policy #0 lag: (min: 1.0, avg: 62.3, max: 115.0) [2024-03-20 23:46:25,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-20 23:46:28,590][04017] Updated weights for policy 0, policy_version 5343 (0.0015) [2024-03-20 23:46:30,521][03784] Fps is (10 sec: 58982.0, 60 sec: 44782.9, 300 sec: 46208.4). Total num frames: 175210496. Throughput: 0: 44437.7. Samples: 176370900. Policy #0 lag: (min: 0.0, avg: 47.7, max: 93.0) [2024-03-20 23:46:30,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-20 23:46:33,042][04017] Updated weights for policy 0, policy_version 5353 (0.0016) [2024-03-20 23:46:35,521][03784] Fps is (10 sec: 62260.6, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 175505408. Throughput: 0: 44722.2. Samples: 176654600. 
Policy #0 lag: (min: 0.0, avg: 47.7, max: 93.0) [2024-03-20 23:46:35,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-20 23:46:40,307][04017] Updated weights for policy 0, policy_version 5363 (0.0020) [2024-03-20 23:46:40,521][03784] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 175734784. Throughput: 0: 45031.1. Samples: 176929300. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-20 23:46:40,522][03784] Avg episode reward: [(0, '0.224')] [2024-03-20 23:46:45,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48059.7, 300 sec: 44986.6). Total num frames: 175964160. Throughput: 0: 45129.0. Samples: 177071200. Policy #0 lag: (min: 1.0, avg: 37.7, max: 73.0) [2024-03-20 23:46:45,522][03784] Avg episode reward: [(0, '0.785')] [2024-03-20 23:46:46,834][04017] Updated weights for policy 0, policy_version 5373 (0.0015) [2024-03-20 23:46:50,521][03784] Fps is (10 sec: 39321.7, 60 sec: 48059.7, 300 sec: 45208.7). Total num frames: 176128000. Throughput: 0: 45597.7. Samples: 177349100. Policy #0 lag: (min: 1.0, avg: 37.7, max: 73.0) [2024-03-20 23:46:50,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-20 23:46:53,955][04017] Updated weights for policy 0, policy_version 5383 (0.0021) [2024-03-20 23:46:55,521][03784] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 45875.2). Total num frames: 176488448. Throughput: 0: 46433.1. Samples: 177626600. Policy #0 lag: (min: 0.0, avg: 37.5, max: 77.0) [2024-03-20 23:46:55,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-20 23:46:59,056][03995] Signal inference workers to stop experience collection... (3600 times) [2024-03-20 23:46:59,118][03995] Signal inference workers to resume experience collection... 
(3600 times) [2024-03-20 23:46:59,139][04017] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-03-20 23:46:59,180][04017] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-03-20 23:47:00,521][03784] Fps is (10 sec: 55705.0, 60 sec: 48605.8, 300 sec: 45542.0). Total num frames: 176685056. Throughput: 0: 46104.3. Samples: 177761500. Policy #0 lag: (min: 2.0, avg: 40.6, max: 88.0) [2024-03-20 23:47:00,522][03784] Avg episode reward: [(0, '0.571')] [2024-03-20 23:47:05,521][03784] Fps is (10 sec: 19660.7, 60 sec: 44782.8, 300 sec: 44986.6). Total num frames: 176685056. Throughput: 0: 46648.8. Samples: 178056400. Policy #0 lag: (min: 2.0, avg: 40.6, max: 88.0) [2024-03-20 23:47:05,522][03784] Avg episode reward: [(0, '0.490')] [2024-03-20 23:47:05,751][04017] Updated weights for policy 0, policy_version 5393 (0.0011) [2024-03-20 23:47:10,521][03784] Fps is (10 sec: 26214.6, 60 sec: 44782.8, 300 sec: 45542.0). Total num frames: 176947200. Throughput: 0: 46300.2. Samples: 178320700. Policy #0 lag: (min: 0.0, avg: 28.7, max: 80.0) [2024-03-20 23:47:10,522][03784] Avg episode reward: [(0, '0.427')] [2024-03-20 23:47:13,352][04017] Updated weights for policy 0, policy_version 5403 (0.0018) [2024-03-20 23:47:15,521][03784] Fps is (10 sec: 52429.2, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 177209344. Throughput: 0: 46575.6. Samples: 178466800. Policy #0 lag: (min: 0.0, avg: 31.4, max: 74.0) [2024-03-20 23:47:15,522][03784] Avg episode reward: [(0, '0.550')] [2024-03-20 23:47:20,521][03784] Fps is (10 sec: 29491.4, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 177242112. Throughput: 0: 46680.0. Samples: 178755200. Policy #0 lag: (min: 0.0, avg: 31.4, max: 74.0) [2024-03-20 23:47:20,522][03784] Avg episode reward: [(0, '0.236')] [2024-03-20 23:47:22,233][04017] Updated weights for policy 0, policy_version 5413 (0.0011) [2024-03-20 23:47:25,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45329.2, 300 sec: 45764.1). 
Total num frames: 177602560. Throughput: 0: 46388.9. Samples: 179016800. Policy #0 lag: (min: 0.0, avg: 26.7, max: 63.0) [2024-03-20 23:47:25,522][03784] Avg episode reward: [(0, '0.557')] [2024-03-20 23:47:27,125][04017] Updated weights for policy 0, policy_version 5423 (0.0012) [2024-03-20 23:47:30,521][03784] Fps is (10 sec: 75365.8, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 177995776. Throughput: 0: 46020.0. Samples: 179142100. Policy #0 lag: (min: 0.0, avg: 30.3, max: 96.0) [2024-03-20 23:47:30,522][03784] Avg episode reward: [(0, '0.709')] [2024-03-20 23:47:30,806][04017] Updated weights for policy 0, policy_version 5433 (0.0025) [2024-03-20 23:47:35,521][03784] Fps is (10 sec: 65536.3, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 178257920. Throughput: 0: 46002.2. Samples: 179419200. Policy #0 lag: (min: 2.0, avg: 50.6, max: 96.0) [2024-03-20 23:47:35,522][03784] Avg episode reward: [(0, '0.261')] [2024-03-20 23:47:36,849][04017] Updated weights for policy 0, policy_version 5443 (0.0011) [2024-03-20 23:47:40,521][03784] Fps is (10 sec: 58982.6, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 178585600. Throughput: 0: 45808.9. Samples: 179688000. Policy #0 lag: (min: 1.0, avg: 50.6, max: 108.0) [2024-03-20 23:47:40,522][03784] Avg episode reward: [(0, '0.461')] [2024-03-20 23:47:42,394][04017] Updated weights for policy 0, policy_version 5453 (0.0011) [2024-03-20 23:47:45,521][03784] Fps is (10 sec: 55705.9, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 178814976. Throughput: 0: 46026.8. Samples: 179832700. Policy #0 lag: (min: 0.0, avg: 37.9, max: 99.0) [2024-03-20 23:47:45,522][03784] Avg episode reward: [(0, '0.804')] [2024-03-20 23:47:50,171][04017] Updated weights for policy 0, policy_version 5463 (0.0012) [2024-03-20 23:47:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 45319.8). Total num frames: 179044352. Throughput: 0: 45762.2. Samples: 180115700. 
Policy #0 lag: (min: 0.0, avg: 50.2, max: 102.0) [2024-03-20 23:47:50,522][03784] Avg episode reward: [(0, '0.287')] [2024-03-20 23:47:52,526][03995] Signal inference workers to stop experience collection... (3650 times) [2024-03-20 23:47:52,527][03995] Signal inference workers to resume experience collection... (3650 times) [2024-03-20 23:47:52,588][04017] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-03-20 23:47:52,588][04017] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-03-20 23:47:55,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 179175424. Throughput: 0: 46262.2. Samples: 180402500. Policy #0 lag: (min: 0.0, avg: 50.2, max: 102.0) [2024-03-20 23:47:55,522][03784] Avg episode reward: [(0, '0.594')] [2024-03-20 23:47:56,985][04017] Updated weights for policy 0, policy_version 5473 (0.0011) [2024-03-20 23:48:00,521][03784] Fps is (10 sec: 32767.8, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 179372032. Throughput: 0: 46124.3. Samples: 180542400. Policy #0 lag: (min: 1.0, avg: 29.1, max: 68.0) [2024-03-20 23:48:00,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-20 23:48:00,888][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005476_179437568.pth... [2024-03-20 23:48:00,993][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005144_168558592.pth [2024-03-20 23:48:05,521][03784] Fps is (10 sec: 42598.0, 60 sec: 48605.8, 300 sec: 45319.8). Total num frames: 179601408. Throughput: 0: 46055.4. Samples: 180827700. Policy #0 lag: (min: 0.0, avg: 42.3, max: 93.0) [2024-03-20 23:48:05,522][03784] Avg episode reward: [(0, '0.368')] [2024-03-20 23:48:06,120][04017] Updated weights for policy 0, policy_version 5483 (0.0017) [2024-03-20 23:48:10,521][03784] Fps is (10 sec: 36045.6, 60 sec: 46421.4, 300 sec: 44875.5). Total num frames: 179732480. Throughput: 0: 46975.7. Samples: 181130700. 
Policy #0 lag: (min: 0.0, avg: 31.1, max: 84.0) [2024-03-20 23:48:10,522][03784] Avg episode reward: [(0, '0.368')] [2024-03-20 23:48:15,059][04017] Updated weights for policy 0, policy_version 5493 (0.0017) [2024-03-20 23:48:15,521][03784] Fps is (10 sec: 39322.4, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 179994624. Throughput: 0: 47491.2. Samples: 181279200. Policy #0 lag: (min: 0.0, avg: 31.1, max: 84.0) [2024-03-20 23:48:15,522][03784] Avg episode reward: [(0, '0.368')] [2024-03-20 23:48:20,521][03784] Fps is (10 sec: 55704.7, 60 sec: 50790.3, 300 sec: 45764.1). Total num frames: 180289536. Throughput: 0: 46506.6. Samples: 181512000. Policy #0 lag: (min: 0.0, avg: 25.2, max: 66.0) [2024-03-20 23:48:20,522][03784] Avg episode reward: [(0, '0.758')] [2024-03-20 23:48:20,550][04017] Updated weights for policy 0, policy_version 5503 (0.0024) [2024-03-20 23:48:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 180420608. Throughput: 0: 46617.8. Samples: 181785800. Policy #0 lag: (min: 0.0, avg: 40.8, max: 105.0) [2024-03-20 23:48:25,522][03784] Avg episode reward: [(0, '0.455')] [2024-03-20 23:48:30,521][03784] Fps is (10 sec: 26214.6, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 180551680. Throughput: 0: 46326.6. Samples: 181917400. Policy #0 lag: (min: 0.0, avg: 27.6, max: 72.0) [2024-03-20 23:48:30,522][03784] Avg episode reward: [(0, '0.516')] [2024-03-20 23:48:32,201][04017] Updated weights for policy 0, policy_version 5513 (0.0017) [2024-03-20 23:48:35,521][03784] Fps is (10 sec: 45874.7, 60 sec: 43690.6, 300 sec: 45097.6). Total num frames: 180879360. Throughput: 0: 45566.7. Samples: 182166200. Policy #0 lag: (min: 0.0, avg: 27.6, max: 72.0) [2024-03-20 23:48:35,522][03784] Avg episode reward: [(0, '0.224')] [2024-03-20 23:48:36,813][04017] Updated weights for policy 0, policy_version 5523 (0.0017) [2024-03-20 23:48:40,521][03784] Fps is (10 sec: 68812.8, 60 sec: 44236.8, 300 sec: 45319.8). 
Total num frames: 181239808. Throughput: 0: 45055.6. Samples: 182430000. Policy #0 lag: (min: 0.0, avg: 42.0, max: 91.0) [2024-03-20 23:48:40,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-20 23:48:41,390][04017] Updated weights for policy 0, policy_version 5533 (0.0013) [2024-03-20 23:48:43,346][03995] Signal inference workers to stop experience collection... (3700 times) [2024-03-20 23:48:43,347][03995] Signal inference workers to resume experience collection... (3700 times) [2024-03-20 23:48:43,400][04017] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-03-20 23:48:43,400][04017] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-03-20 23:48:45,521][03784] Fps is (10 sec: 62259.8, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 181501952. Throughput: 0: 45058.0. Samples: 182570000. Policy #0 lag: (min: 1.0, avg: 51.6, max: 98.0) [2024-03-20 23:48:45,522][03784] Avg episode reward: [(0, '0.270')] [2024-03-20 23:48:49,623][04017] Updated weights for policy 0, policy_version 5543 (0.0015) [2024-03-20 23:48:50,521][03784] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 45208.7). Total num frames: 181633024. Throughput: 0: 44780.2. Samples: 182842800. Policy #0 lag: (min: 0.0, avg: 41.3, max: 76.0) [2024-03-20 23:48:50,522][03784] Avg episode reward: [(0, '0.718')] [2024-03-20 23:48:55,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 181796864. Throughput: 0: 44564.4. Samples: 183136100. Policy #0 lag: (min: 0.0, avg: 41.3, max: 76.0) [2024-03-20 23:48:55,522][03784] Avg episode reward: [(0, '0.626')] [2024-03-20 23:48:58,168][04017] Updated weights for policy 0, policy_version 5553 (0.0011) [2024-03-20 23:49:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 182157312. Throughput: 0: 44431.0. Samples: 183278600. 
Policy #0 lag: (min: 0.0, avg: 41.8, max: 103.0) [2024-03-20 23:49:00,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-20 23:49:02,839][04017] Updated weights for policy 0, policy_version 5563 (0.0015) [2024-03-20 23:49:05,521][03784] Fps is (10 sec: 52428.4, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 182321152. Throughput: 0: 45744.5. Samples: 183570500. Policy #0 lag: (min: 0.0, avg: 37.5, max: 80.0) [2024-03-20 23:49:05,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-20 23:49:09,704][04017] Updated weights for policy 0, policy_version 5573 (0.0016) [2024-03-20 23:49:10,521][03784] Fps is (10 sec: 49152.3, 60 sec: 48605.8, 300 sec: 46208.4). Total num frames: 182648832. Throughput: 0: 45637.7. Samples: 183839500. Policy #0 lag: (min: 0.0, avg: 35.3, max: 85.0) [2024-03-20 23:49:10,522][03784] Avg episode reward: [(0, '0.194')] [2024-03-20 23:49:15,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 182779904. Throughput: 0: 45640.0. Samples: 183971200. Policy #0 lag: (min: 0.0, avg: 35.3, max: 85.0) [2024-03-20 23:49:15,522][03784] Avg episode reward: [(0, '0.580')] [2024-03-20 23:49:20,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.8, 300 sec: 45208.7). Total num frames: 182910976. Throughput: 0: 46062.3. Samples: 184239000. Policy #0 lag: (min: 0.0, avg: 35.4, max: 75.0) [2024-03-20 23:49:20,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-20 23:49:20,704][04017] Updated weights for policy 0, policy_version 5583 (0.0012) [2024-03-20 23:49:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 45097.6). Total num frames: 183205888. Throughput: 0: 45042.2. Samples: 184456900. Policy #0 lag: (min: 1.0, avg: 54.6, max: 109.0) [2024-03-20 23:49:25,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-20 23:49:27,714][04017] Updated weights for policy 0, policy_version 5593 (0.0026) [2024-03-20 23:49:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.5, 300 sec: 45097.7). 
Total num frames: 183369728. Throughput: 0: 44913.3. Samples: 184591100. Policy #0 lag: (min: 0.0, avg: 34.2, max: 78.0) [2024-03-20 23:49:30,522][03784] Avg episode reward: [(0, '0.298')] [2024-03-20 23:49:35,521][03784] Fps is (10 sec: 32767.9, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 183533568. Throughput: 0: 44964.4. Samples: 184866200. Policy #0 lag: (min: 1.0, avg: 32.2, max: 65.0) [2024-03-20 23:49:35,522][03784] Avg episode reward: [(0, '0.845')] [2024-03-20 23:49:36,080][04017] Updated weights for policy 0, policy_version 5603 (0.0012) [2024-03-20 23:49:40,521][03784] Fps is (10 sec: 32767.9, 60 sec: 40960.0, 300 sec: 45097.7). Total num frames: 183697408. Throughput: 0: 44391.0. Samples: 185133700. Policy #0 lag: (min: 1.0, avg: 32.2, max: 65.0) [2024-03-20 23:49:40,522][03784] Avg episode reward: [(0, '0.417')] [2024-03-20 23:49:44,733][04017] Updated weights for policy 0, policy_version 5613 (0.0022) [2024-03-20 23:49:45,521][03784] Fps is (10 sec: 45875.3, 60 sec: 41506.1, 300 sec: 45430.9). Total num frames: 183992320. Throughput: 0: 43995.6. Samples: 185258400. Policy #0 lag: (min: 0.0, avg: 36.1, max: 109.0) [2024-03-20 23:49:45,522][03784] Avg episode reward: [(0, '0.659')] [2024-03-20 23:49:47,608][03995] Signal inference workers to stop experience collection... (3750 times) [2024-03-20 23:49:47,684][03995] Signal inference workers to resume experience collection... (3750 times) [2024-03-20 23:49:47,689][04017] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-03-20 23:49:47,732][04017] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-03-20 23:49:50,196][04017] Updated weights for policy 0, policy_version 5623 (0.0011) [2024-03-20 23:49:50,521][03784] Fps is (10 sec: 55705.7, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 184254464. Throughput: 0: 43404.5. Samples: 185523700. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 65.0) [2024-03-20 23:49:50,522][03784] Avg episode reward: [(0, '0.460')] [2024-03-20 23:49:55,521][03784] Fps is (10 sec: 49152.4, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 184483840. Throughput: 0: 43604.5. Samples: 185801700. Policy #0 lag: (min: 0.0, avg: 37.3, max: 83.0) [2024-03-20 23:49:55,522][03784] Avg episode reward: [(0, '0.460')] [2024-03-20 23:49:56,241][04017] Updated weights for policy 0, policy_version 5633 (0.0017) [2024-03-20 23:50:00,521][03784] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 184778752. Throughput: 0: 43188.9. Samples: 185914700. Policy #0 lag: (min: 0.0, avg: 37.3, max: 83.0) [2024-03-20 23:50:00,522][03784] Avg episode reward: [(0, '0.547')] [2024-03-20 23:50:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005639_184778752.pth... [2024-03-20 23:50:00,693][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005303_173768704.pth [2024-03-20 23:50:04,190][04017] Updated weights for policy 0, policy_version 5643 (0.0011) [2024-03-20 23:50:05,521][03784] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 185008128. Throughput: 0: 43566.6. Samples: 186199500. Policy #0 lag: (min: 0.0, avg: 55.1, max: 107.0) [2024-03-20 23:50:05,522][03784] Avg episode reward: [(0, '0.641')] [2024-03-20 23:50:10,521][03784] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 44986.6). Total num frames: 185171968. Throughput: 0: 45211.1. Samples: 186491400. Policy #0 lag: (min: 0.0, avg: 32.1, max: 64.0) [2024-03-20 23:50:10,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-20 23:50:11,424][04017] Updated weights for policy 0, policy_version 5653 (0.0010) [2024-03-20 23:50:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 45097.6). Total num frames: 185368576. Throughput: 0: 45408.8. Samples: 186634500. 
Policy #0 lag: (min: 0.0, avg: 32.1, max: 64.0) [2024-03-20 23:50:15,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-20 23:50:17,892][04017] Updated weights for policy 0, policy_version 5663 (0.0011) [2024-03-20 23:50:20,521][03784] Fps is (10 sec: 55705.0, 60 sec: 46967.4, 300 sec: 45430.9). Total num frames: 185729024. Throughput: 0: 44960.0. Samples: 186889400. Policy #0 lag: (min: 0.0, avg: 37.1, max: 109.0) [2024-03-20 23:50:20,522][03784] Avg episode reward: [(0, '0.412')] [2024-03-20 23:50:25,018][04017] Updated weights for policy 0, policy_version 5673 (0.0011) [2024-03-20 23:50:25,521][03784] Fps is (10 sec: 52428.1, 60 sec: 44782.8, 300 sec: 45319.8). Total num frames: 185892864. Throughput: 0: 44957.6. Samples: 187156800. Policy #0 lag: (min: 0.0, avg: 42.5, max: 83.0) [2024-03-20 23:50:25,523][03784] Avg episode reward: [(0, '0.221')] [2024-03-20 23:50:30,521][03784] Fps is (10 sec: 42598.7, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 186155008. Throughput: 0: 45368.9. Samples: 187300000. Policy #0 lag: (min: 0.0, avg: 46.8, max: 102.0) [2024-03-20 23:50:30,522][03784] Avg episode reward: [(0, '0.643')] [2024-03-20 23:50:34,828][04017] Updated weights for policy 0, policy_version 5683 (0.0014) [2024-03-20 23:50:35,521][03784] Fps is (10 sec: 36045.6, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 186253312. Throughput: 0: 45231.2. Samples: 187559100. Policy #0 lag: (min: 0.0, avg: 36.4, max: 79.0) [2024-03-20 23:50:35,522][03784] Avg episode reward: [(0, '0.728')] [2024-03-20 23:50:40,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 186417152. Throughput: 0: 44031.0. Samples: 187783100. Policy #0 lag: (min: 0.0, avg: 36.4, max: 79.0) [2024-03-20 23:50:40,522][03784] Avg episode reward: [(0, '0.341')] [2024-03-20 23:50:41,928][04017] Updated weights for policy 0, policy_version 5693 (0.0013) [2024-03-20 23:50:43,863][03995] Signal inference workers to stop experience collection... 
(3800 times) [2024-03-20 23:50:43,863][03995] Signal inference workers to resume experience collection... (3800 times) [2024-03-20 23:50:43,941][04017] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-03-20 23:50:43,941][04017] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-03-20 23:50:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 186712064. Throughput: 0: 43984.5. Samples: 187894000. Policy #0 lag: (min: 0.0, avg: 33.9, max: 105.0) [2024-03-20 23:50:45,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-20 23:50:50,521][03784] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 186810368. Throughput: 0: 43631.2. Samples: 188162900. Policy #0 lag: (min: 1.0, avg: 31.0, max: 66.0) [2024-03-20 23:50:50,522][03784] Avg episode reward: [(0, '0.524')] [2024-03-20 23:50:52,927][04017] Updated weights for policy 0, policy_version 5703 (0.0012) [2024-03-20 23:50:55,521][03784] Fps is (10 sec: 19660.6, 60 sec: 40413.8, 300 sec: 44542.3). Total num frames: 186908672. Throughput: 0: 43835.5. Samples: 188464000. Policy #0 lag: (min: 0.0, avg: 34.3, max: 84.0) [2024-03-20 23:50:55,522][03784] Avg episode reward: [(0, '0.855')] [2024-03-20 23:50:58,876][04017] Updated weights for policy 0, policy_version 5713 (0.0038) [2024-03-20 23:51:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 42052.3, 300 sec: 45097.6). Total num frames: 187301888. Throughput: 0: 43615.6. Samples: 188597200. Policy #0 lag: (min: 2.0, avg: 44.9, max: 102.0) [2024-03-20 23:51:00,522][03784] Avg episode reward: [(0, '0.808')] [2024-03-20 23:51:03,104][04017] Updated weights for policy 0, policy_version 5723 (0.0012) [2024-03-20 23:51:05,521][03784] Fps is (10 sec: 81920.7, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 187727872. Throughput: 0: 44151.2. Samples: 188876200. 
Policy #0 lag: (min: 1.0, avg: 34.1, max: 73.0) [2024-03-20 23:51:05,522][03784] Avg episode reward: [(0, '0.679')] [2024-03-20 23:51:07,599][04017] Updated weights for policy 0, policy_version 5733 (0.0021) [2024-03-20 23:51:10,521][03784] Fps is (10 sec: 68812.4, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 187990016. Throughput: 0: 44280.1. Samples: 189149400. Policy #0 lag: (min: 1.0, avg: 34.1, max: 73.0) [2024-03-20 23:51:10,522][03784] Avg episode reward: [(0, '0.665')] [2024-03-20 23:51:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.4, 300 sec: 45875.2). Total num frames: 188153856. Throughput: 0: 44511.2. Samples: 189303000. Policy #0 lag: (min: 0.0, avg: 45.5, max: 107.0) [2024-03-20 23:51:15,522][03784] Avg episode reward: [(0, '0.938')] [2024-03-20 23:51:15,935][04017] Updated weights for policy 0, policy_version 5743 (0.0015) [2024-03-20 23:51:20,521][03784] Fps is (10 sec: 29491.1, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 188284928. Throughput: 0: 44973.2. Samples: 189582900. Policy #0 lag: (min: 0.0, avg: 38.0, max: 78.0) [2024-03-20 23:51:20,522][03784] Avg episode reward: [(0, '0.494')] [2024-03-20 23:51:23,174][04017] Updated weights for policy 0, policy_version 5753 (0.0012) [2024-03-20 23:51:25,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 188678144. Throughput: 0: 45662.2. Samples: 189837900. Policy #0 lag: (min: 0.0, avg: 42.3, max: 80.0) [2024-03-20 23:51:25,522][03784] Avg episode reward: [(0, '0.494')] [2024-03-20 23:51:30,521][03784] Fps is (10 sec: 52429.0, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 188809216. Throughput: 0: 46440.0. Samples: 189983800. Policy #0 lag: (min: 0.0, avg: 42.3, max: 80.0) [2024-03-20 23:51:30,522][03784] Avg episode reward: [(0, '0.845')] [2024-03-20 23:51:30,545][04017] Updated weights for policy 0, policy_version 5763 (0.0021) [2024-03-20 23:51:32,145][03995] Signal inference workers to stop experience collection... 
(3850 times) [2024-03-20 23:51:32,215][03995] Signal inference workers to resume experience collection... (3850 times) [2024-03-20 23:51:32,239][04017] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-03-20 23:51:32,299][04017] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-03-20 23:51:34,635][04017] Updated weights for policy 0, policy_version 5773 (0.0026) [2024-03-20 23:51:35,521][03784] Fps is (10 sec: 55705.6, 60 sec: 49698.0, 300 sec: 45764.1). Total num frames: 189235200. Throughput: 0: 46346.6. Samples: 190248500. Policy #0 lag: (min: 0.0, avg: 33.2, max: 71.0) [2024-03-20 23:51:35,531][03784] Avg episode reward: [(0, '0.917')] [2024-03-20 23:51:40,521][03784] Fps is (10 sec: 52428.5, 60 sec: 48605.8, 300 sec: 45319.8). Total num frames: 189333504. Throughput: 0: 46008.9. Samples: 190534400. Policy #0 lag: (min: 0.0, avg: 33.2, max: 71.0) [2024-03-20 23:51:40,522][03784] Avg episode reward: [(0, '0.287')] [2024-03-20 23:51:45,521][03784] Fps is (10 sec: 13107.4, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 189366272. Throughput: 0: 46428.9. Samples: 190686500. Policy #0 lag: (min: 0.0, avg: 45.9, max: 92.0) [2024-03-20 23:51:45,522][03784] Avg episode reward: [(0, '0.647')] [2024-03-20 23:51:49,715][04017] Updated weights for policy 0, policy_version 5783 (0.0015) [2024-03-20 23:51:50,521][03784] Fps is (10 sec: 22938.0, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 189562880. Throughput: 0: 46688.9. Samples: 190977200. Policy #0 lag: (min: 0.0, avg: 34.8, max: 79.0) [2024-03-20 23:51:50,522][03784] Avg episode reward: [(0, '0.809')] [2024-03-20 23:51:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48059.8, 300 sec: 44431.2). Total num frames: 189792256. Throughput: 0: 46995.7. Samples: 191264200. 
Policy #0 lag: (min: 0.0, avg: 34.8, max: 79.0) [2024-03-20 23:51:55,522][03784] Avg episode reward: [(0, '0.809')] [2024-03-20 23:51:56,324][04017] Updated weights for policy 0, policy_version 5793 (0.0010) [2024-03-20 23:52:00,521][03784] Fps is (10 sec: 52428.4, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 190087168. Throughput: 0: 46380.0. Samples: 191390100. Policy #0 lag: (min: 0.0, avg: 29.5, max: 78.0) [2024-03-20 23:52:00,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-20 23:52:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005801_190087168.pth... [2024-03-20 23:52:00,645][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005476_179437568.pth [2024-03-20 23:52:02,096][04017] Updated weights for policy 0, policy_version 5803 (0.0015) [2024-03-20 23:52:05,521][03784] Fps is (10 sec: 65536.6, 60 sec: 45329.2, 300 sec: 45764.2). Total num frames: 190447616. Throughput: 0: 46435.8. Samples: 191672500. Policy #0 lag: (min: 1.0, avg: 35.6, max: 79.0) [2024-03-20 23:52:05,522][03784] Avg episode reward: [(0, '0.475')] [2024-03-20 23:52:06,444][04017] Updated weights for policy 0, policy_version 5813 (0.0011) [2024-03-20 23:52:10,521][03784] Fps is (10 sec: 62259.0, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 190709760. Throughput: 0: 46244.5. Samples: 191918900. Policy #0 lag: (min: 0.0, avg: 46.7, max: 97.0) [2024-03-20 23:52:10,522][03784] Avg episode reward: [(0, '0.507')] [2024-03-20 23:52:12,469][04017] Updated weights for policy 0, policy_version 5823 (0.0016) [2024-03-20 23:52:15,521][03784] Fps is (10 sec: 55704.8, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 191004672. Throughput: 0: 46117.8. Samples: 192059100. Policy #0 lag: (min: 2.0, avg: 51.1, max: 93.0) [2024-03-20 23:52:15,522][03784] Avg episode reward: [(0, '0.674')] [2024-03-20 23:52:20,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.5, 300 sec: 45764.1). 
Total num frames: 191102976. Throughput: 0: 46935.7. Samples: 192360600. Policy #0 lag: (min: 2.0, avg: 51.1, max: 93.0) [2024-03-20 23:52:20,522][03784] Avg episode reward: [(0, '0.485')] [2024-03-20 23:52:21,193][04017] Updated weights for policy 0, policy_version 5833 (0.0018) [2024-03-20 23:52:22,977][03995] Signal inference workers to stop experience collection... (3900 times) [2024-03-20 23:52:23,045][03995] Signal inference workers to resume experience collection... (3900 times) [2024-03-20 23:52:23,064][04017] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-03-20 23:52:23,120][04017] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-03-20 23:52:25,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 191365120. Throughput: 0: 47077.9. Samples: 192652900. Policy #0 lag: (min: 0.0, avg: 37.1, max: 75.0) [2024-03-20 23:52:25,522][03784] Avg episode reward: [(0, '0.485')] [2024-03-20 23:52:27,557][04017] Updated weights for policy 0, policy_version 5843 (0.0016) [2024-03-20 23:52:30,521][03784] Fps is (10 sec: 55705.2, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 191660032. Throughput: 0: 46595.5. Samples: 192783300. Policy #0 lag: (min: 1.0, avg: 47.9, max: 96.0) [2024-03-20 23:52:30,522][03784] Avg episode reward: [(0, '0.746')] [2024-03-20 23:52:32,628][04017] Updated weights for policy 0, policy_version 5853 (0.0010) [2024-03-20 23:52:35,521][03784] Fps is (10 sec: 55705.4, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 191922176. Throughput: 0: 46499.9. Samples: 193069700. Policy #0 lag: (min: 1.0, avg: 43.5, max: 87.0) [2024-03-20 23:52:35,522][03784] Avg episode reward: [(0, '0.746')] [2024-03-20 23:52:38,891][04017] Updated weights for policy 0, policy_version 5863 (0.0020) [2024-03-20 23:52:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 47513.7, 300 sec: 45319.8). Total num frames: 192184320. Throughput: 0: 46266.6. Samples: 193346200. 
Policy #0 lag: (min: 1.0, avg: 40.4, max: 98.0) [2024-03-20 23:52:40,522][03784] Avg episode reward: [(0, '0.597')] [2024-03-20 23:52:45,521][03784] Fps is (10 sec: 39322.1, 60 sec: 49152.1, 300 sec: 44986.6). Total num frames: 192315392. Throughput: 0: 46655.7. Samples: 193489600. Policy #0 lag: (min: 0.0, avg: 33.0, max: 88.0) [2024-03-20 23:52:45,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-20 23:52:47,810][04017] Updated weights for policy 0, policy_version 5873 (0.0010) [2024-03-20 23:52:50,521][03784] Fps is (10 sec: 32768.1, 60 sec: 49152.0, 300 sec: 45208.7). Total num frames: 192512000. Throughput: 0: 46871.0. Samples: 193781700. Policy #0 lag: (min: 0.0, avg: 33.0, max: 88.0) [2024-03-20 23:52:50,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-20 23:52:55,521][03784] Fps is (10 sec: 39321.3, 60 sec: 48605.9, 300 sec: 45208.8). Total num frames: 192708608. Throughput: 0: 47622.3. Samples: 194061900. Policy #0 lag: (min: 1.0, avg: 38.7, max: 98.0) [2024-03-20 23:52:55,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-20 23:52:57,886][04017] Updated weights for policy 0, policy_version 5883 (0.0010) [2024-03-20 23:53:00,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 192839680. Throughput: 0: 47615.6. Samples: 194201800. Policy #0 lag: (min: 0.0, avg: 34.1, max: 91.0) [2024-03-20 23:53:00,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-20 23:53:05,010][04017] Updated weights for policy 0, policy_version 5893 (0.0010) [2024-03-20 23:53:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44236.7, 300 sec: 45319.8). Total num frames: 193101824. Throughput: 0: 46733.3. Samples: 194463600. Policy #0 lag: (min: 0.0, avg: 36.7, max: 92.0) [2024-03-20 23:53:05,522][03784] Avg episode reward: [(0, '0.658')] [2024-03-20 23:53:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44236.9, 300 sec: 45319.8). Total num frames: 193363968. Throughput: 0: 45797.8. Samples: 194713800. 
Policy #0 lag: (min: 1.0, avg: 25.7, max: 48.0) [2024-03-20 23:53:10,522][03784] Avg episode reward: [(0, '0.386')] [2024-03-20 23:53:12,575][04017] Updated weights for policy 0, policy_version 5903 (0.0016) [2024-03-20 23:53:15,521][03784] Fps is (10 sec: 55705.9, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 193658880. Throughput: 0: 45809.0. Samples: 194844700. Policy #0 lag: (min: 0.0, avg: 39.3, max: 104.0) [2024-03-20 23:53:15,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-20 23:53:19,607][04017] Updated weights for policy 0, policy_version 5913 (0.0012) [2024-03-20 23:53:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 193822720. Throughput: 0: 45142.2. Samples: 195101100. Policy #0 lag: (min: 0.0, avg: 39.3, max: 104.0) [2024-03-20 23:53:20,522][03784] Avg episode reward: [(0, '0.664')] [2024-03-20 23:53:21,971][03995] Signal inference workers to stop experience collection... (3950 times) [2024-03-20 23:53:21,971][03995] Signal inference workers to resume experience collection... (3950 times) [2024-03-20 23:53:22,031][04017] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-03-20 23:53:22,032][04017] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-03-20 23:53:24,835][04017] Updated weights for policy 0, policy_version 5923 (0.0013) [2024-03-20 23:53:25,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 194150400. Throughput: 0: 44215.5. Samples: 195335900. Policy #0 lag: (min: 0.0, avg: 31.5, max: 64.0) [2024-03-20 23:53:25,522][03784] Avg episode reward: [(0, '0.776')] [2024-03-20 23:53:30,475][04017] Updated weights for policy 0, policy_version 5933 (0.0016) [2024-03-20 23:53:30,521][03784] Fps is (10 sec: 58982.2, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 194412544. Throughput: 0: 44146.5. Samples: 195476200. 
Policy #0 lag: (min: 1.0, avg: 54.7, max: 117.0) [2024-03-20 23:53:30,522][03784] Avg episode reward: [(0, '0.407')] [2024-03-20 23:53:35,521][03784] Fps is (10 sec: 36044.5, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 194510848. Throughput: 0: 44379.9. Samples: 195778800. Policy #0 lag: (min: 0.0, avg: 47.2, max: 95.0) [2024-03-20 23:53:35,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-20 23:53:40,521][03784] Fps is (10 sec: 29491.4, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 194707456. Throughput: 0: 44397.8. Samples: 196059800. Policy #0 lag: (min: 0.0, avg: 47.2, max: 95.0) [2024-03-20 23:53:40,522][03784] Avg episode reward: [(0, '0.584')] [2024-03-20 23:53:40,756][04017] Updated weights for policy 0, policy_version 5943 (0.0015) [2024-03-20 23:53:45,521][03784] Fps is (10 sec: 49152.5, 60 sec: 44782.8, 300 sec: 45319.8). Total num frames: 195002368. Throughput: 0: 44306.6. Samples: 196195600. Policy #0 lag: (min: 0.0, avg: 31.6, max: 64.0) [2024-03-20 23:53:45,522][03784] Avg episode reward: [(0, '0.486')] [2024-03-20 23:53:46,285][04017] Updated weights for policy 0, policy_version 5953 (0.0009) [2024-03-20 23:53:50,521][03784] Fps is (10 sec: 58981.2, 60 sec: 46421.2, 300 sec: 45764.1). Total num frames: 195297280. Throughput: 0: 44748.7. Samples: 196477300. Policy #0 lag: (min: 2.0, avg: 55.6, max: 109.0) [2024-03-20 23:53:50,522][03784] Avg episode reward: [(0, '0.486')] [2024-03-20 23:53:52,385][04017] Updated weights for policy 0, policy_version 5963 (0.0012) [2024-03-20 23:53:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 195526656. Throughput: 0: 45368.8. Samples: 196755400. Policy #0 lag: (min: 0.0, avg: 35.8, max: 79.0) [2024-03-20 23:53:55,522][03784] Avg episode reward: [(0, '0.484')] [2024-03-20 23:54:00,521][03784] Fps is (10 sec: 26214.8, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 195559424. Throughput: 0: 45264.4. Samples: 196881600. 
Policy #0 lag: (min: 0.0, avg: 35.8, max: 79.0) [2024-03-20 23:54:00,522][03784] Avg episode reward: [(0, '0.623')] [2024-03-20 23:54:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005968_195559424.pth... [2024-03-20 23:54:00,723][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005639_184778752.pth [2024-03-20 23:54:02,804][04017] Updated weights for policy 0, policy_version 5973 (0.0016) [2024-03-20 23:54:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 195821568. Throughput: 0: 44804.4. Samples: 197117300. Policy #0 lag: (min: 0.0, avg: 38.2, max: 87.0) [2024-03-20 23:54:05,522][03784] Avg episode reward: [(0, '0.510')] [2024-03-20 23:54:09,391][04017] Updated weights for policy 0, policy_version 5983 (0.0013) [2024-03-20 23:54:10,521][03784] Fps is (10 sec: 58981.3, 60 sec: 46421.2, 300 sec: 45319.8). Total num frames: 196149248. Throughput: 0: 44044.3. Samples: 197317900. Policy #0 lag: (min: 0.0, avg: 30.0, max: 77.0) [2024-03-20 23:54:10,522][03784] Avg episode reward: [(0, '0.634')] [2024-03-20 23:54:10,883][03995] Signal inference workers to stop experience collection... (4000 times) [2024-03-20 23:54:10,922][04017] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-03-20 23:54:11,195][03995] Signal inference workers to resume experience collection... (4000 times) [2024-03-20 23:54:11,195][04017] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-03-20 23:54:15,521][03784] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 196280320. Throughput: 0: 43442.3. Samples: 197431100. 
Policy #0 lag: (min: 0.0, avg: 47.7, max: 96.0) [2024-03-20 23:54:15,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-20 23:54:17,515][04017] Updated weights for policy 0, policy_version 5993 (0.0013) [2024-03-20 23:54:20,521][03784] Fps is (10 sec: 39322.2, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 196542464. Throughput: 0: 42822.3. Samples: 197705800. Policy #0 lag: (min: 0.0, avg: 47.7, max: 96.0) [2024-03-20 23:54:20,522][03784] Avg episode reward: [(0, '0.864')] [2024-03-20 23:54:25,521][03784] Fps is (10 sec: 36044.2, 60 sec: 41506.1, 300 sec: 44986.6). Total num frames: 196640768. Throughput: 0: 43375.4. Samples: 198011700. Policy #0 lag: (min: 0.0, avg: 33.8, max: 79.0) [2024-03-20 23:54:25,522][03784] Avg episode reward: [(0, '0.645')] [2024-03-20 23:54:29,096][04017] Updated weights for policy 0, policy_version 6003 (0.0018) [2024-03-20 23:54:30,521][03784] Fps is (10 sec: 19661.0, 60 sec: 38775.5, 300 sec: 44764.4). Total num frames: 196739072. Throughput: 0: 43557.8. Samples: 198155700. Policy #0 lag: (min: 0.0, avg: 29.0, max: 69.0) [2024-03-20 23:54:30,522][03784] Avg episode reward: [(0, '0.274')] [2024-03-20 23:54:35,456][04017] Updated weights for policy 0, policy_version 6013 (0.0020) [2024-03-20 23:54:35,521][03784] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 45208.7). Total num frames: 197033984. Throughput: 0: 42740.1. Samples: 198400600. Policy #0 lag: (min: 1.0, avg: 45.5, max: 106.0) [2024-03-20 23:54:35,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-20 23:54:40,363][04017] Updated weights for policy 0, policy_version 6023 (0.0028) [2024-03-20 23:54:40,521][03784] Fps is (10 sec: 62259.2, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 197361664. Throughput: 0: 41966.7. Samples: 198643900. Policy #0 lag: (min: 1.0, avg: 45.5, max: 106.0) [2024-03-20 23:54:40,522][03784] Avg episode reward: [(0, '0.341')] [2024-03-20 23:54:45,521][03784] Fps is (10 sec: 58983.2, 60 sec: 43690.7, 300 sec: 45319.8). 
Total num frames: 197623808. Throughput: 0: 42364.5. Samples: 198788000. Policy #0 lag: (min: 0.0, avg: 43.1, max: 96.0) [2024-03-20 23:54:45,522][03784] Avg episode reward: [(0, '0.341')] [2024-03-20 23:54:47,680][04017] Updated weights for policy 0, policy_version 6033 (0.0010) [2024-03-20 23:54:50,521][03784] Fps is (10 sec: 45875.4, 60 sec: 42052.5, 300 sec: 45208.7). Total num frames: 197820416. Throughput: 0: 43649.0. Samples: 199081500. Policy #0 lag: (min: 0.0, avg: 31.7, max: 64.0) [2024-03-20 23:54:50,522][03784] Avg episode reward: [(0, '0.341')] [2024-03-20 23:54:55,521][03784] Fps is (10 sec: 29490.8, 60 sec: 39867.7, 300 sec: 44542.2). Total num frames: 197918720. Throughput: 0: 45433.4. Samples: 199362400. Policy #0 lag: (min: 0.0, avg: 35.9, max: 81.0) [2024-03-20 23:54:55,522][03784] Avg episode reward: [(0, '0.394')] [2024-03-20 23:54:56,440][04017] Updated weights for policy 0, policy_version 6043 (0.0009) [2024-03-20 23:55:00,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 198279168. Throughput: 0: 45604.5. Samples: 199483300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 81.0) [2024-03-20 23:55:00,521][03784] Avg episode reward: [(0, '0.842')] [2024-03-20 23:55:02,078][04017] Updated weights for policy 0, policy_version 6053 (0.0011) [2024-03-20 23:55:04,143][03995] Signal inference workers to stop experience collection... (4050 times) [2024-03-20 23:55:04,143][03995] Signal inference workers to resume experience collection... (4050 times) [2024-03-20 23:55:04,222][04017] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-03-20 23:55:04,227][04017] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-03-20 23:55:05,422][04017] Updated weights for policy 0, policy_version 6063 (0.0014) [2024-03-20 23:55:05,521][03784] Fps is (10 sec: 75367.2, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 198672384. Throughput: 0: 45300.0. Samples: 199744300. 
Policy #0 lag: (min: 0.0, avg: 35.7, max: 80.0) [2024-03-20 23:55:05,522][03784] Avg episode reward: [(0, '0.550')] [2024-03-20 23:55:10,521][03784] Fps is (10 sec: 62258.3, 60 sec: 45875.3, 300 sec: 45875.2). Total num frames: 198901760. Throughput: 0: 44386.7. Samples: 200009100. Policy #0 lag: (min: 1.0, avg: 41.6, max: 87.0) [2024-03-20 23:55:10,522][03784] Avg episode reward: [(0, '0.717')] [2024-03-20 23:55:13,833][04017] Updated weights for policy 0, policy_version 6073 (0.0015) [2024-03-20 23:55:15,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45875.0, 300 sec: 45097.7). Total num frames: 199032832. Throughput: 0: 44528.7. Samples: 200159500. Policy #0 lag: (min: 0.0, avg: 40.3, max: 78.0) [2024-03-20 23:55:15,522][03784] Avg episode reward: [(0, '0.754')] [2024-03-20 23:55:20,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 199163904. Throughput: 0: 44693.4. Samples: 200411800. Policy #0 lag: (min: 0.0, avg: 40.3, max: 78.0) [2024-03-20 23:55:20,522][03784] Avg episode reward: [(0, '0.532')] [2024-03-20 23:55:24,552][04017] Updated weights for policy 0, policy_version 6083 (0.0015) [2024-03-20 23:55:25,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 199393280. Throughput: 0: 45182.1. Samples: 200677100. Policy #0 lag: (min: 0.0, avg: 30.5, max: 80.0) [2024-03-20 23:55:25,522][03784] Avg episode reward: [(0, '0.925')] [2024-03-20 23:55:29,128][04017] Updated weights for policy 0, policy_version 6093 (0.0016) [2024-03-20 23:55:30,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49151.9, 300 sec: 45542.0). Total num frames: 199688192. Throughput: 0: 44699.9. Samples: 200799500. Policy #0 lag: (min: 0.0, avg: 31.4, max: 77.0) [2024-03-20 23:55:30,530][03784] Avg episode reward: [(0, '0.925')] [2024-03-20 23:55:35,521][03784] Fps is (10 sec: 39322.1, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 199786496. Throughput: 0: 43502.2. Samples: 201039100. 
Policy #0 lag: (min: 0.0, avg: 41.2, max: 92.0) [2024-03-20 23:55:35,522][03784] Avg episode reward: [(0, '0.783')] [2024-03-20 23:55:40,521][03784] Fps is (10 sec: 19660.6, 60 sec: 42052.1, 300 sec: 44653.3). Total num frames: 199884800. Throughput: 0: 43693.3. Samples: 201328600. Policy #0 lag: (min: 0.0, avg: 41.2, max: 92.0) [2024-03-20 23:55:40,522][03784] Avg episode reward: [(0, '0.478')] [2024-03-20 23:55:41,782][04017] Updated weights for policy 0, policy_version 6103 (0.0011) [2024-03-20 23:55:45,521][03784] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 200179712. Throughput: 0: 44104.3. Samples: 201468000. Policy #0 lag: (min: 0.0, avg: 24.8, max: 55.0) [2024-03-20 23:55:45,522][03784] Avg episode reward: [(0, '0.420')] [2024-03-20 23:55:47,710][04017] Updated weights for policy 0, policy_version 6113 (0.0020) [2024-03-20 23:55:50,521][03784] Fps is (10 sec: 52429.3, 60 sec: 43144.4, 300 sec: 45764.1). Total num frames: 200409088. Throughput: 0: 44897.8. Samples: 201764700. Policy #0 lag: (min: 1.0, avg: 31.2, max: 71.0) [2024-03-20 23:55:50,522][03784] Avg episode reward: [(0, '0.673')] [2024-03-20 23:55:55,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 44986.6). Total num frames: 200572928. Throughput: 0: 45586.7. Samples: 202060500. Policy #0 lag: (min: 1.0, avg: 31.4, max: 77.0) [2024-03-20 23:55:55,522][03784] Avg episode reward: [(0, '0.320')] [2024-03-20 23:55:56,160][04017] Updated weights for policy 0, policy_version 6123 (0.0011) [2024-03-20 23:56:00,521][03784] Fps is (10 sec: 49152.1, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 200900608. Throughput: 0: 45213.4. Samples: 202194100. Policy #0 lag: (min: 0.0, avg: 32.0, max: 64.0) [2024-03-20 23:56:00,522][03784] Avg episode reward: [(0, '0.340')] [2024-03-20 23:56:00,597][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006132_200933376.pth... 
[2024-03-20 23:56:00,715][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005801_190087168.pth [2024-03-20 23:56:01,323][04017] Updated weights for policy 0, policy_version 6133 (0.0011) [2024-03-20 23:56:05,521][03784] Fps is (10 sec: 58982.4, 60 sec: 41506.2, 300 sec: 44653.4). Total num frames: 201162752. Throughput: 0: 45566.7. Samples: 202462300. Policy #0 lag: (min: 0.0, avg: 32.0, max: 64.0) [2024-03-20 23:56:05,522][03784] Avg episode reward: [(0, '0.763')] [2024-03-20 23:56:08,614][04017] Updated weights for policy 0, policy_version 6143 (0.0021) [2024-03-20 23:56:09,052][03995] Signal inference workers to stop experience collection... (4100 times) [2024-03-20 23:56:09,123][04017] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-03-20 23:56:09,328][03995] Signal inference workers to resume experience collection... (4100 times) [2024-03-20 23:56:09,328][04017] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-03-20 23:56:10,521][03784] Fps is (10 sec: 55705.7, 60 sec: 42598.4, 300 sec: 45097.6). Total num frames: 201457664. Throughput: 0: 46124.5. Samples: 202752700. Policy #0 lag: (min: 0.0, avg: 35.2, max: 114.0) [2024-03-20 23:56:10,522][03784] Avg episode reward: [(0, '0.763')] [2024-03-20 23:56:11,844][04017] Updated weights for policy 0, policy_version 6153 (0.0013) [2024-03-20 23:56:15,521][03784] Fps is (10 sec: 68812.7, 60 sec: 46967.6, 300 sec: 45986.3). Total num frames: 201850880. Throughput: 0: 45666.7. Samples: 202854500. Policy #0 lag: (min: 0.0, avg: 53.9, max: 104.0) [2024-03-20 23:56:15,522][03784] Avg episode reward: [(0, '0.598')] [2024-03-20 23:56:19,064][04017] Updated weights for policy 0, policy_version 6163 (0.0018) [2024-03-20 23:56:20,521][03784] Fps is (10 sec: 52428.4, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 201981952. Throughput: 0: 46553.2. Samples: 203134000. 
Policy #0 lag: (min: 0.0, avg: 53.9, max: 104.0) [2024-03-20 23:56:20,522][03784] Avg episode reward: [(0, '0.650')] [2024-03-20 23:56:25,521][03784] Fps is (10 sec: 39321.3, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 202244096. Throughput: 0: 46280.1. Samples: 203411200. Policy #0 lag: (min: 0.0, avg: 38.9, max: 86.0) [2024-03-20 23:56:25,522][03784] Avg episode reward: [(0, '0.756')] [2024-03-20 23:56:25,683][04017] Updated weights for policy 0, policy_version 6173 (0.0019) [2024-03-20 23:56:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 202375168. Throughput: 0: 46633.3. Samples: 203566500. Policy #0 lag: (min: 0.0, avg: 37.6, max: 73.0) [2024-03-20 23:56:30,522][03784] Avg episode reward: [(0, '0.641')] [2024-03-20 23:56:35,521][03784] Fps is (10 sec: 32768.4, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 202571776. Throughput: 0: 46224.5. Samples: 203844800. Policy #0 lag: (min: 3.0, avg: 38.4, max: 80.0) [2024-03-20 23:56:35,522][03784] Avg episode reward: [(0, '0.778')] [2024-03-20 23:56:36,926][04017] Updated weights for policy 0, policy_version 6183 (0.0020) [2024-03-20 23:56:40,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49152.2, 300 sec: 45653.0). Total num frames: 202833920. Throughput: 0: 45568.9. Samples: 204111100. Policy #0 lag: (min: 3.0, avg: 38.4, max: 80.0) [2024-03-20 23:56:40,522][03784] Avg episode reward: [(0, '0.817')] [2024-03-20 23:56:43,581][04017] Updated weights for policy 0, policy_version 6193 (0.0011) [2024-03-20 23:56:45,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 202932224. Throughput: 0: 45344.5. Samples: 204234600. Policy #0 lag: (min: 0.0, avg: 51.2, max: 103.0) [2024-03-20 23:56:45,522][03784] Avg episode reward: [(0, '0.670')] [2024-03-20 23:56:50,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 203161600. Throughput: 0: 45699.9. Samples: 204518800. 
Policy #0 lag: (min: 1.0, avg: 38.6, max: 79.0) [2024-03-20 23:56:50,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-20 23:56:50,535][03995] Saving new best policy, reward=1.039! [2024-03-20 23:56:52,869][04017] Updated weights for policy 0, policy_version 6203 (0.0015) [2024-03-20 23:56:55,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 203456512. Throughput: 0: 45673.4. Samples: 204808000. Policy #0 lag: (min: 3.0, avg: 37.0, max: 79.0) [2024-03-20 23:56:55,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-20 23:56:58,463][04017] Updated weights for policy 0, policy_version 6213 (0.0016) [2024-03-20 23:57:00,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 203653120. Throughput: 0: 46884.4. Samples: 204964300. Policy #0 lag: (min: 3.0, avg: 37.0, max: 79.0) [2024-03-20 23:57:00,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-20 23:57:03,683][03995] Signal inference workers to stop experience collection... (4150 times) [2024-03-20 23:57:03,684][03995] Signal inference workers to resume experience collection... (4150 times) [2024-03-20 23:57:03,735][04017] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-03-20 23:57:03,735][04017] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-03-20 23:57:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 203849728. Throughput: 0: 47109.0. Samples: 205253900. Policy #0 lag: (min: 0.0, avg: 39.8, max: 95.0) [2024-03-20 23:57:05,522][03784] Avg episode reward: [(0, '0.568')] [2024-03-20 23:57:06,266][04017] Updated weights for policy 0, policy_version 6223 (0.0015) [2024-03-20 23:57:10,045][04017] Updated weights for policy 0, policy_version 6233 (0.0010) [2024-03-20 23:57:10,521][03784] Fps is (10 sec: 62258.7, 60 sec: 46967.4, 300 sec: 44986.6). Total num frames: 204275712. Throughput: 0: 46995.5. Samples: 205526000. 
Policy #0 lag: (min: 2.0, avg: 34.1, max: 67.0) [2024-03-20 23:57:10,522][03784] Avg episode reward: [(0, '0.568')] [2024-03-20 23:57:14,771][04017] Updated weights for policy 0, policy_version 6243 (0.0021) [2024-03-20 23:57:15,521][03784] Fps is (10 sec: 72089.8, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 204570624. Throughput: 0: 46422.3. Samples: 205655500. Policy #0 lag: (min: 0.0, avg: 44.3, max: 75.0) [2024-03-20 23:57:15,530][03784] Avg episode reward: [(0, '0.373')] [2024-03-20 23:57:20,521][03784] Fps is (10 sec: 52429.2, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 204800000. Throughput: 0: 46348.8. Samples: 205930500. Policy #0 lag: (min: 0.0, avg: 44.3, max: 75.0) [2024-03-20 23:57:20,522][03784] Avg episode reward: [(0, '0.910')] [2024-03-20 23:57:23,038][04017] Updated weights for policy 0, policy_version 6253 (0.0016) [2024-03-20 23:57:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 204931072. Throughput: 0: 47153.3. Samples: 206233000. Policy #0 lag: (min: 0.0, avg: 37.3, max: 65.0) [2024-03-20 23:57:25,522][03784] Avg episode reward: [(0, '0.333')] [2024-03-20 23:57:30,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 205127680. Throughput: 0: 47557.8. Samples: 206374700. Policy #0 lag: (min: 1.0, avg: 38.8, max: 79.0) [2024-03-20 23:57:30,522][03784] Avg episode reward: [(0, '0.333')] [2024-03-20 23:57:31,737][04017] Updated weights for policy 0, policy_version 6263 (0.0010) [2024-03-20 23:57:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 205291520. Throughput: 0: 46920.1. Samples: 206630200. Policy #0 lag: (min: 0.0, avg: 39.0, max: 85.0) [2024-03-20 23:57:35,522][03784] Avg episode reward: [(0, '0.589')] [2024-03-20 23:57:40,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 205488128. Throughput: 0: 46302.2. Samples: 206891600. 
Policy #0 lag: (min: 0.0, avg: 39.0, max: 85.0) [2024-03-20 23:57:40,522][03784] Avg episode reward: [(0, '0.604')] [2024-03-20 23:57:40,884][04017] Updated weights for policy 0, policy_version 6273 (0.0012) [2024-03-20 23:57:45,521][03784] Fps is (10 sec: 55704.9, 60 sec: 48605.8, 300 sec: 45208.7). Total num frames: 205848576. Throughput: 0: 45377.7. Samples: 207006300. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-20 23:57:45,522][03784] Avg episode reward: [(0, '0.831')] [2024-03-20 23:57:46,169][04017] Updated weights for policy 0, policy_version 6283 (0.0024) [2024-03-20 23:57:50,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48605.9, 300 sec: 45319.8). Total num frames: 206077952. Throughput: 0: 45237.7. Samples: 207289600. Policy #0 lag: (min: 1.0, avg: 26.8, max: 58.0) [2024-03-20 23:57:50,522][03784] Avg episode reward: [(0, '0.831')] [2024-03-20 23:57:54,763][04017] Updated weights for policy 0, policy_version 6293 (0.0009) [2024-03-20 23:57:55,521][03784] Fps is (10 sec: 42598.9, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 206274560. Throughput: 0: 45106.8. Samples: 207555800. Policy #0 lag: (min: 0.0, avg: 34.7, max: 82.0) [2024-03-20 23:57:55,522][03784] Avg episode reward: [(0, '0.303')] [2024-03-20 23:58:00,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 206372864. Throughput: 0: 45279.9. Samples: 207693100. Policy #0 lag: (min: 0.0, avg: 34.7, max: 82.0) [2024-03-20 23:58:00,522][03784] Avg episode reward: [(0, '0.303')] [2024-03-20 23:58:00,539][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006298_206372864.pth... [2024-03-20 23:58:00,657][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000005968_195559424.pth [2024-03-20 23:58:05,521][03784] Fps is (10 sec: 19660.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 206471168. Throughput: 0: 45391.1. Samples: 207973100. 
Policy #0 lag: (min: 0.0, avg: 30.2, max: 76.0) [2024-03-20 23:58:05,522][03784] Avg episode reward: [(0, '0.608')] [2024-03-20 23:58:05,614][03995] Signal inference workers to stop experience collection... (4200 times) [2024-03-20 23:58:05,716][04017] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-03-20 23:58:05,831][03995] Signal inference workers to resume experience collection... (4200 times) [2024-03-20 23:58:05,832][04017] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-03-20 23:58:05,840][04017] Updated weights for policy 0, policy_version 6303 (0.0017) [2024-03-20 23:58:10,521][03784] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 44653.3). Total num frames: 206831616. Throughput: 0: 44784.4. Samples: 208248300. Policy #0 lag: (min: 5.0, avg: 42.3, max: 92.0) [2024-03-20 23:58:10,522][03784] Avg episode reward: [(0, '0.365')] [2024-03-20 23:58:11,030][04017] Updated weights for policy 0, policy_version 6313 (0.0010) [2024-03-20 23:58:15,521][03784] Fps is (10 sec: 65537.1, 60 sec: 42598.5, 300 sec: 45097.7). Total num frames: 207126528. Throughput: 0: 44600.1. Samples: 208381700. Policy #0 lag: (min: 0.0, avg: 47.9, max: 116.0) [2024-03-20 23:58:15,521][03784] Avg episode reward: [(0, '0.255')] [2024-03-20 23:58:15,856][04017] Updated weights for policy 0, policy_version 6323 (0.0011) [2024-03-20 23:58:20,521][03784] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 44431.2). Total num frames: 207257600. Throughput: 0: 44764.4. Samples: 208644600. Policy #0 lag: (min: 0.0, avg: 47.9, max: 116.0) [2024-03-20 23:58:20,522][03784] Avg episode reward: [(0, '0.617')] [2024-03-20 23:58:24,623][04017] Updated weights for policy 0, policy_version 6333 (0.0012) [2024-03-20 23:58:25,521][03784] Fps is (10 sec: 45874.5, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 207585280. Throughput: 0: 44902.2. Samples: 208912200. 
Policy #0 lag: (min: 0.0, avg: 43.6, max: 96.0) [2024-03-20 23:58:25,522][03784] Avg episode reward: [(0, '1.005')] [2024-03-20 23:58:30,235][04017] Updated weights for policy 0, policy_version 6343 (0.0017) [2024-03-20 23:58:30,521][03784] Fps is (10 sec: 58982.5, 60 sec: 45329.1, 300 sec: 45208.8). Total num frames: 207847424. Throughput: 0: 45693.5. Samples: 209062500. Policy #0 lag: (min: 0.0, avg: 36.0, max: 78.0) [2024-03-20 23:58:30,522][03784] Avg episode reward: [(0, '0.824')] [2024-03-20 23:58:35,521][03784] Fps is (10 sec: 55705.4, 60 sec: 47513.5, 300 sec: 45542.0). Total num frames: 208142336. Throughput: 0: 45688.9. Samples: 209345600. Policy #0 lag: (min: 1.0, avg: 55.3, max: 107.0) [2024-03-20 23:58:35,522][03784] Avg episode reward: [(0, '0.278')] [2024-03-20 23:58:35,712][04017] Updated weights for policy 0, policy_version 6353 (0.0014) [2024-03-20 23:58:40,521][03784] Fps is (10 sec: 62258.3, 60 sec: 49698.0, 300 sec: 45653.0). Total num frames: 208470016. Throughput: 0: 45979.8. Samples: 209624900. Policy #0 lag: (min: 0.0, avg: 56.7, max: 112.0) [2024-03-20 23:58:40,522][03784] Avg episode reward: [(0, '0.278')] [2024-03-20 23:58:40,963][04017] Updated weights for policy 0, policy_version 6363 (0.0012) [2024-03-20 23:58:45,521][03784] Fps is (10 sec: 49152.5, 60 sec: 46421.4, 300 sec: 45208.8). Total num frames: 208633856. Throughput: 0: 46002.3. Samples: 209763200. Policy #0 lag: (min: 0.0, avg: 56.7, max: 112.0) [2024-03-20 23:58:45,522][03784] Avg episode reward: [(0, '0.915')] [2024-03-20 23:58:50,521][03784] Fps is (10 sec: 26214.7, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 208732160. Throughput: 0: 45228.9. Samples: 210008400. Policy #0 lag: (min: 0.0, avg: 34.4, max: 85.0) [2024-03-20 23:58:50,522][03784] Avg episode reward: [(0, '0.705')] [2024-03-20 23:58:51,592][04017] Updated weights for policy 0, policy_version 6373 (0.0012) [2024-03-20 23:58:55,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.8, 300 sec: 45319.8). 
Total num frames: 208928768. Throughput: 0: 44173.4. Samples: 210236100. Policy #0 lag: (min: 0.0, avg: 42.9, max: 77.0) [2024-03-20 23:58:55,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-20 23:58:57,709][03995] Signal inference workers to stop experience collection... (4250 times) [2024-03-20 23:58:57,798][04017] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-03-20 23:58:57,841][03995] Signal inference workers to resume experience collection... (4250 times) [2024-03-20 23:58:57,843][04017] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-03-20 23:58:58,507][04017] Updated weights for policy 0, policy_version 6383 (0.0015) [2024-03-20 23:59:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 209223680. Throughput: 0: 43395.4. Samples: 210334500. Policy #0 lag: (min: 0.0, avg: 42.9, max: 77.0) [2024-03-20 23:59:00,522][03784] Avg episode reward: [(0, '0.518')] [2024-03-20 23:59:05,521][03784] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 45097.7). Total num frames: 209453056. Throughput: 0: 42948.8. Samples: 210577300. Policy #0 lag: (min: 0.0, avg: 35.3, max: 85.0) [2024-03-20 23:59:05,522][03784] Avg episode reward: [(0, '0.505')] [2024-03-20 23:59:10,035][04017] Updated weights for policy 0, policy_version 6393 (0.0011) [2024-03-20 23:59:10,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 209485824. Throughput: 0: 43611.1. Samples: 210874700. Policy #0 lag: (min: 0.0, avg: 27.7, max: 85.0) [2024-03-20 23:59:10,522][03784] Avg episode reward: [(0, '0.603')] [2024-03-20 23:59:15,521][03784] Fps is (10 sec: 26214.6, 60 sec: 43144.4, 300 sec: 44653.4). Total num frames: 209715200. Throughput: 0: 43420.0. Samples: 211016400. 
Policy #0 lag: (min: 0.0, avg: 27.7, max: 85.0) [2024-03-20 23:59:15,522][03784] Avg episode reward: [(0, '0.603')] [2024-03-20 23:59:17,511][04017] Updated weights for policy 0, policy_version 6403 (0.0013) [2024-03-20 23:59:20,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 209977344. Throughput: 0: 43262.3. Samples: 211292400. Policy #0 lag: (min: 0.0, avg: 28.8, max: 69.0) [2024-03-20 23:59:20,522][03784] Avg episode reward: [(0, '0.773')] [2024-03-20 23:59:24,248][04017] Updated weights for policy 0, policy_version 6413 (0.0015) [2024-03-20 23:59:25,521][03784] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 45653.0). Total num frames: 210206720. Throughput: 0: 43473.4. Samples: 211581200. Policy #0 lag: (min: 0.0, avg: 29.8, max: 59.0) [2024-03-20 23:59:25,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-20 23:59:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 210403328. Throughput: 0: 43693.3. Samples: 211729400. Policy #0 lag: (min: 1.0, avg: 31.3, max: 59.0) [2024-03-20 23:59:30,531][03784] Avg episode reward: [(0, '0.515')] [2024-03-20 23:59:31,405][04017] Updated weights for policy 0, policy_version 6423 (0.0015) [2024-03-20 23:59:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 210731008. Throughput: 0: 44133.3. Samples: 211994400. Policy #0 lag: (min: 1.0, avg: 31.3, max: 59.0) [2024-03-20 23:59:35,522][03784] Avg episode reward: [(0, '0.515')] [2024-03-20 23:59:36,257][04017] Updated weights for policy 0, policy_version 6433 (0.0015) [2024-03-20 23:59:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 40960.1, 300 sec: 45097.7). Total num frames: 210927616. Throughput: 0: 45015.6. Samples: 212261800. Policy #0 lag: (min: 0.0, avg: 41.0, max: 119.0) [2024-03-20 23:59:40,522][03784] Avg episode reward: [(0, '0.515')] [2024-03-20 23:59:45,521][03784] Fps is (10 sec: 36044.5, 60 sec: 40959.9, 300 sec: 44986.5). 
Total num frames: 211091456. Throughput: 0: 45693.2. Samples: 212390700. Policy #0 lag: (min: 0.0, avg: 48.5, max: 115.0) [2024-03-20 23:59:45,522][03784] Avg episode reward: [(0, '0.690')] [2024-03-20 23:59:45,661][04017] Updated weights for policy 0, policy_version 6443 (0.0015) [2024-03-20 23:59:49,961][04017] Updated weights for policy 0, policy_version 6453 (0.0023) [2024-03-20 23:59:50,521][03784] Fps is (10 sec: 52428.5, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 211451904. Throughput: 0: 46642.2. Samples: 212676200. Policy #0 lag: (min: 0.0, avg: 48.5, max: 115.0) [2024-03-20 23:59:50,522][03784] Avg episode reward: [(0, '0.690')] [2024-03-20 23:59:55,521][03784] Fps is (10 sec: 58983.3, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 211681280. Throughput: 0: 45917.9. Samples: 212941000. Policy #0 lag: (min: 0.0, avg: 41.3, max: 88.0) [2024-03-20 23:59:55,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-20 23:59:56,472][03995] Signal inference workers to stop experience collection... (4300 times) [2024-03-20 23:59:56,525][04017] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-03-20 23:59:56,770][03995] Signal inference workers to resume experience collection... (4300 times) [2024-03-20 23:59:56,771][04017] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-03-20 23:59:57,063][04017] Updated weights for policy 0, policy_version 6463 (0.0013) [2024-03-21 00:00:00,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 211943424. Throughput: 0: 45804.4. Samples: 213077600. Policy #0 lag: (min: 2.0, avg: 44.2, max: 77.0) [2024-03-21 00:00:00,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-21 00:00:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006468_211943424.pth... 
[2024-03-21 00:00:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006132_200933376.pth [2024-03-21 00:00:03,167][04017] Updated weights for policy 0, policy_version 6473 (0.0012) [2024-03-21 00:00:05,521][03784] Fps is (10 sec: 49151.7, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 212172800. Throughput: 0: 45980.0. Samples: 213361500. Policy #0 lag: (min: 4.0, avg: 43.7, max: 79.0) [2024-03-21 00:00:05,522][03784] Avg episode reward: [(0, '0.615')] [2024-03-21 00:00:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 212303872. Throughput: 0: 45300.0. Samples: 213619700. Policy #0 lag: (min: 0.0, avg: 33.5, max: 68.0) [2024-03-21 00:00:10,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 00:00:12,659][04017] Updated weights for policy 0, policy_version 6483 (0.0011) [2024-03-21 00:00:15,521][03784] Fps is (10 sec: 39321.9, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 212566016. Throughput: 0: 44653.4. Samples: 213738800. Policy #0 lag: (min: 0.0, avg: 33.5, max: 68.0) [2024-03-21 00:00:15,522][03784] Avg episode reward: [(0, '0.333')] [2024-03-21 00:00:19,012][04017] Updated weights for policy 0, policy_version 6493 (0.0011) [2024-03-21 00:00:20,521][03784] Fps is (10 sec: 52428.7, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 212828160. Throughput: 0: 44246.7. Samples: 213985500. Policy #0 lag: (min: 1.0, avg: 46.9, max: 88.0) [2024-03-21 00:00:20,530][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 00:00:25,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 212926464. Throughput: 0: 43682.3. Samples: 214227500. Policy #0 lag: (min: 0.0, avg: 30.1, max: 90.0) [2024-03-21 00:00:25,522][03784] Avg episode reward: [(0, '0.611')] [2024-03-21 00:00:30,521][03784] Fps is (10 sec: 16384.0, 60 sec: 43144.6, 300 sec: 44764.4). Total num frames: 212992000. Throughput: 0: 44062.4. Samples: 214373500. 
Policy #0 lag: (min: 0.0, avg: 30.1, max: 90.0) [2024-03-21 00:00:30,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 00:00:30,535][03995] Saving new best policy, reward=1.118! [2024-03-21 00:00:32,912][04017] Updated weights for policy 0, policy_version 6503 (0.0015) [2024-03-21 00:00:35,521][03784] Fps is (10 sec: 29490.9, 60 sec: 41506.2, 300 sec: 45208.8). Total num frames: 213221376. Throughput: 0: 44488.9. Samples: 214678200. Policy #0 lag: (min: 0.0, avg: 26.1, max: 69.0) [2024-03-21 00:00:35,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 00:00:39,449][04017] Updated weights for policy 0, policy_version 6513 (0.0011) [2024-03-21 00:00:40,521][03784] Fps is (10 sec: 49151.5, 60 sec: 42598.3, 300 sec: 45097.6). Total num frames: 213483520. Throughput: 0: 44384.4. Samples: 214938300. Policy #0 lag: (min: 2.0, avg: 30.3, max: 83.0) [2024-03-21 00:00:40,522][03784] Avg episode reward: [(0, '0.504')] [2024-03-21 00:00:43,680][04017] Updated weights for policy 0, policy_version 6523 (0.0021) [2024-03-21 00:00:45,521][03784] Fps is (10 sec: 58982.6, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 213811200. Throughput: 0: 44400.0. Samples: 215075600. Policy #0 lag: (min: 0.0, avg: 40.3, max: 81.0) [2024-03-21 00:00:45,522][03784] Avg episode reward: [(0, '0.793')] [2024-03-21 00:00:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 42598.4, 300 sec: 45542.0). Total num frames: 214007808. Throughput: 0: 44626.7. Samples: 215369700. Policy #0 lag: (min: 0.0, avg: 40.3, max: 81.0) [2024-03-21 00:00:50,522][03784] Avg episode reward: [(0, '0.537')] [2024-03-21 00:00:51,868][04017] Updated weights for policy 0, policy_version 6533 (0.0014) [2024-03-21 00:00:55,521][03784] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 214302720. Throughput: 0: 45051.2. Samples: 215647000. 
Policy #0 lag: (min: 0.0, avg: 49.8, max: 111.0) [2024-03-21 00:00:55,522][03784] Avg episode reward: [(0, '0.405')] [2024-03-21 00:00:56,285][04017] Updated weights for policy 0, policy_version 6543 (0.0015) [2024-03-21 00:00:56,712][03995] Signal inference workers to stop experience collection... (4350 times) [2024-03-21 00:00:56,712][03995] Signal inference workers to resume experience collection... (4350 times) [2024-03-21 00:00:56,770][04017] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-03-21 00:00:56,771][04017] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-03-21 00:01:00,521][03784] Fps is (10 sec: 62259.3, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 214630400. Throughput: 0: 45266.6. Samples: 215775800. Policy #0 lag: (min: 2.0, avg: 62.0, max: 124.0) [2024-03-21 00:01:00,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 00:01:02,177][04017] Updated weights for policy 0, policy_version 6553 (0.0012) [2024-03-21 00:01:05,521][03784] Fps is (10 sec: 58981.7, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 214892544. Throughput: 0: 45999.9. Samples: 216055500. Policy #0 lag: (min: 0.0, avg: 52.3, max: 97.0) [2024-03-21 00:01:05,522][03784] Avg episode reward: [(0, '0.838')] [2024-03-21 00:01:09,641][04017] Updated weights for policy 0, policy_version 6563 (0.0015) [2024-03-21 00:01:10,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 215089152. Throughput: 0: 46764.3. Samples: 216331900. Policy #0 lag: (min: 0.0, avg: 52.3, max: 97.0) [2024-03-21 00:01:10,522][03784] Avg episode reward: [(0, '0.554')] [2024-03-21 00:01:15,521][03784] Fps is (10 sec: 32768.5, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 215220224. Throughput: 0: 46302.3. Samples: 216457100. 
Policy #0 lag: (min: 0.0, avg: 44.2, max: 76.0) [2024-03-21 00:01:15,521][03784] Avg episode reward: [(0, '0.422')] [2024-03-21 00:01:18,960][04017] Updated weights for policy 0, policy_version 6573 (0.0009) [2024-03-21 00:01:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 215482368. Throughput: 0: 45568.8. Samples: 216728800. Policy #0 lag: (min: 1.0, avg: 39.7, max: 94.0) [2024-03-21 00:01:20,522][03784] Avg episode reward: [(0, '0.291')] [2024-03-21 00:01:24,802][04017] Updated weights for policy 0, policy_version 6583 (0.0009) [2024-03-21 00:01:25,521][03784] Fps is (10 sec: 55704.6, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 215777280. Throughput: 0: 45924.4. Samples: 217004900. Policy #0 lag: (min: 1.0, avg: 39.7, max: 94.0) [2024-03-21 00:01:25,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 00:01:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48059.7, 300 sec: 45097.6). Total num frames: 215875584. Throughput: 0: 46291.1. Samples: 217158700. Policy #0 lag: (min: 0.0, avg: 35.4, max: 70.0) [2024-03-21 00:01:30,522][03784] Avg episode reward: [(0, '0.975')] [2024-03-21 00:01:33,000][04017] Updated weights for policy 0, policy_version 6593 (0.0011) [2024-03-21 00:01:35,521][03784] Fps is (10 sec: 42599.0, 60 sec: 49698.2, 300 sec: 45319.8). Total num frames: 216203264. Throughput: 0: 45706.8. Samples: 217426500. Policy #0 lag: (min: 0.0, avg: 35.4, max: 70.0) [2024-03-21 00:01:35,522][03784] Avg episode reward: [(0, '0.452')] [2024-03-21 00:01:40,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 216268800. Throughput: 0: 45946.6. Samples: 217714600. Policy #0 lag: (min: 0.0, avg: 36.1, max: 94.0) [2024-03-21 00:01:40,522][03784] Avg episode reward: [(0, '0.497')] [2024-03-21 00:01:42,504][04017] Updated weights for policy 0, policy_version 6603 (0.0010) [2024-03-21 00:01:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 47513.6, 300 sec: 45764.1). 
Total num frames: 216662016. Throughput: 0: 46124.5. Samples: 217851400. Policy #0 lag: (min: 0.0, avg: 36.1, max: 94.0) [2024-03-21 00:01:45,530][03784] Avg episode reward: [(0, '0.515')] [2024-03-21 00:01:45,738][04017] Updated weights for policy 0, policy_version 6613 (0.0011) [2024-03-21 00:01:50,521][03784] Fps is (10 sec: 58982.5, 60 sec: 47513.7, 300 sec: 45430.9). Total num frames: 216858624. Throughput: 0: 46251.2. Samples: 218136800. Policy #0 lag: (min: 0.0, avg: 38.4, max: 96.0) [2024-03-21 00:01:50,522][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 00:01:55,432][04017] Updated weights for policy 0, policy_version 6623 (0.0014) [2024-03-21 00:01:55,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 217022464. Throughput: 0: 47084.6. Samples: 218450700. Policy #0 lag: (min: 1.0, avg: 36.0, max: 68.0) [2024-03-21 00:01:55,521][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 00:01:59,763][03995] Signal inference workers to stop experience collection... (4400 times) [2024-03-21 00:01:59,764][03995] Signal inference workers to resume experience collection... (4400 times) [2024-03-21 00:01:59,826][04017] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-03-21 00:01:59,826][04017] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-03-21 00:02:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 217219072. Throughput: 0: 47562.1. Samples: 218597400. Policy #0 lag: (min: 1.0, avg: 40.9, max: 89.0) [2024-03-21 00:02:00,522][03784] Avg episode reward: [(0, '0.844')] [2024-03-21 00:02:00,756][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006630_217251840.pth... 
[2024-03-21 00:02:00,877][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006298_206372864.pth [2024-03-21 00:02:02,045][04017] Updated weights for policy 0, policy_version 6633 (0.0012) [2024-03-21 00:02:05,521][03784] Fps is (10 sec: 49151.3, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 217513984. Throughput: 0: 47604.4. Samples: 218871000. Policy #0 lag: (min: 0.0, avg: 35.8, max: 72.0) [2024-03-21 00:02:05,522][03784] Avg episode reward: [(0, '0.499')] [2024-03-21 00:02:08,792][04017] Updated weights for policy 0, policy_version 6643 (0.0011) [2024-03-21 00:02:10,521][03784] Fps is (10 sec: 62259.7, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 217841664. Throughput: 0: 47600.1. Samples: 219146900. Policy #0 lag: (min: 0.0, avg: 35.8, max: 72.0) [2024-03-21 00:02:10,522][03784] Avg episode reward: [(0, '0.499')] [2024-03-21 00:02:12,171][04017] Updated weights for policy 0, policy_version 6653 (0.0028) [2024-03-21 00:02:15,521][03784] Fps is (10 sec: 68813.4, 60 sec: 49698.1, 300 sec: 45430.9). Total num frames: 218202112. Throughput: 0: 46346.7. Samples: 219244300. Policy #0 lag: (min: 3.0, avg: 42.8, max: 90.0) [2024-03-21 00:02:15,522][03784] Avg episode reward: [(0, '0.835')] [2024-03-21 00:02:19,305][04017] Updated weights for policy 0, policy_version 6663 (0.0015) [2024-03-21 00:02:20,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48605.9, 300 sec: 45653.0). Total num frames: 218398720. Throughput: 0: 46357.8. Samples: 219512600. Policy #0 lag: (min: 0.0, avg: 50.2, max: 85.0) [2024-03-21 00:02:20,523][03784] Avg episode reward: [(0, '0.711')] [2024-03-21 00:02:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 218497024. Throughput: 0: 46775.6. Samples: 219819500. 
Policy #0 lag: (min: 1.0, avg: 47.2, max: 96.0) [2024-03-21 00:02:25,522][03784] Avg episode reward: [(0, '0.695')] [2024-03-21 00:02:28,489][04017] Updated weights for policy 0, policy_version 6673 (0.0015) [2024-03-21 00:02:30,521][03784] Fps is (10 sec: 39321.1, 60 sec: 48605.8, 300 sec: 45764.1). Total num frames: 218791936. Throughput: 0: 46968.8. Samples: 219965000. Policy #0 lag: (min: 1.0, avg: 47.2, max: 96.0) [2024-03-21 00:02:30,522][03784] Avg episode reward: [(0, '0.695')] [2024-03-21 00:02:32,510][04017] Updated weights for policy 0, policy_version 6683 (0.0011) [2024-03-21 00:02:35,521][03784] Fps is (10 sec: 68812.5, 60 sec: 49698.1, 300 sec: 46430.6). Total num frames: 219185152. Throughput: 0: 46615.5. Samples: 220234500. Policy #0 lag: (min: 2.0, avg: 36.1, max: 62.0) [2024-03-21 00:02:35,522][03784] Avg episode reward: [(0, '0.420')] [2024-03-21 00:02:40,521][03784] Fps is (10 sec: 45876.0, 60 sec: 49698.2, 300 sec: 45430.9). Total num frames: 219250688. Throughput: 0: 45904.4. Samples: 220516400. Policy #0 lag: (min: 0.0, avg: 50.5, max: 99.0) [2024-03-21 00:02:40,522][03784] Avg episode reward: [(0, '0.366')] [2024-03-21 00:02:41,481][04017] Updated weights for policy 0, policy_version 6693 (0.0015) [2024-03-21 00:02:45,521][03784] Fps is (10 sec: 22937.6, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 219414528. Throughput: 0: 45740.1. Samples: 220655700. Policy #0 lag: (min: 0.0, avg: 50.5, max: 99.0) [2024-03-21 00:02:45,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 00:02:45,522][03995] Saving new best policy, reward=1.139! [2024-03-21 00:02:47,032][03995] Signal inference workers to stop experience collection... (4450 times) [2024-03-21 00:02:47,087][04017] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-03-21 00:02:47,329][03995] Signal inference workers to resume experience collection... 
(4450 times) [2024-03-21 00:02:47,330][04017] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-03-21 00:02:50,521][03784] Fps is (10 sec: 32767.8, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 219578368. Throughput: 0: 46055.6. Samples: 220943500. Policy #0 lag: (min: 0.0, avg: 31.6, max: 72.0) [2024-03-21 00:02:50,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 00:02:51,959][04017] Updated weights for policy 0, policy_version 6703 (0.0015) [2024-03-21 00:02:55,521][03784] Fps is (10 sec: 26213.7, 60 sec: 44236.6, 300 sec: 45097.6). Total num frames: 219676672. Throughput: 0: 46755.3. Samples: 221250900. Policy #0 lag: (min: 0.0, avg: 40.9, max: 102.0) [2024-03-21 00:02:55,522][03784] Avg episode reward: [(0, '0.715')] [2024-03-21 00:02:59,555][04017] Updated weights for policy 0, policy_version 6713 (0.0011) [2024-03-21 00:03:00,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.4, 300 sec: 45875.2). Total num frames: 220004352. Throughput: 0: 47311.1. Samples: 221373300. Policy #0 lag: (min: 0.0, avg: 40.9, max: 102.0) [2024-03-21 00:03:00,522][03784] Avg episode reward: [(0, '0.379')] [2024-03-21 00:03:05,264][04017] Updated weights for policy 0, policy_version 6723 (0.0017) [2024-03-21 00:03:05,521][03784] Fps is (10 sec: 65537.0, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 220332032. Throughput: 0: 47006.5. Samples: 221627900. Policy #0 lag: (min: 1.0, avg: 31.3, max: 71.0) [2024-03-21 00:03:05,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 00:03:10,197][04017] Updated weights for policy 0, policy_version 6733 (0.0016) [2024-03-21 00:03:10,521][03784] Fps is (10 sec: 65536.2, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 220659712. Throughput: 0: 46055.5. Samples: 221892000. Policy #0 lag: (min: 0.0, avg: 28.8, max: 102.0) [2024-03-21 00:03:10,522][03784] Avg episode reward: [(0, '0.664')] [2024-03-21 00:03:15,521][03784] Fps is (10 sec: 58983.2, 60 sec: 45329.1, 300 sec: 46319.5). 
Total num frames: 220921856. Throughput: 0: 46037.9. Samples: 222036700. Policy #0 lag: (min: 0.0, avg: 28.8, max: 102.0) [2024-03-21 00:03:15,522][03784] Avg episode reward: [(0, '0.525')] [2024-03-21 00:03:16,554][04017] Updated weights for policy 0, policy_version 6743 (0.0022) [2024-03-21 00:03:20,521][03784] Fps is (10 sec: 55705.4, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 221216768. Throughput: 0: 46582.2. Samples: 222330700. Policy #0 lag: (min: 0.0, avg: 46.5, max: 108.0) [2024-03-21 00:03:20,522][03784] Avg episode reward: [(0, '0.525')] [2024-03-21 00:03:20,933][04017] Updated weights for policy 0, policy_version 6753 (0.0015) [2024-03-21 00:03:25,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 221315072. Throughput: 0: 46391.0. Samples: 222604000. Policy #0 lag: (min: 0.0, avg: 45.6, max: 123.0) [2024-03-21 00:03:25,522][03784] Avg episode reward: [(0, '0.834')] [2024-03-21 00:03:30,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 221511680. Throughput: 0: 46237.8. Samples: 222736400. Policy #0 lag: (min: 0.0, avg: 46.9, max: 116.0) [2024-03-21 00:03:30,522][03784] Avg episode reward: [(0, '0.514')] [2024-03-21 00:03:32,271][04017] Updated weights for policy 0, policy_version 6763 (0.0016) [2024-03-21 00:03:35,521][03784] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 221741056. Throughput: 0: 44995.5. Samples: 222968300. Policy #0 lag: (min: 0.0, avg: 46.9, max: 116.0) [2024-03-21 00:03:35,522][03784] Avg episode reward: [(0, '0.451')] [2024-03-21 00:03:40,521][03784] Fps is (10 sec: 32767.7, 60 sec: 43144.4, 300 sec: 44764.4). Total num frames: 221839360. Throughput: 0: 44249.0. Samples: 223242100. 
Policy #0 lag: (min: 0.0, avg: 44.5, max: 79.0) [2024-03-21 00:03:40,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 00:03:41,868][04017] Updated weights for policy 0, policy_version 6773 (0.0011) [2024-03-21 00:03:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 222101504. Throughput: 0: 44206.6. Samples: 223362600. Policy #0 lag: (min: 0.0, avg: 44.5, max: 79.0) [2024-03-21 00:03:45,522][03784] Avg episode reward: [(0, '0.316')] [2024-03-21 00:03:48,881][04017] Updated weights for policy 0, policy_version 6783 (0.0012) [2024-03-21 00:03:50,100][03995] Signal inference workers to stop experience collection... (4500 times) [2024-03-21 00:03:50,101][03995] Signal inference workers to resume experience collection... (4500 times) [2024-03-21 00:03:50,164][04017] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-03-21 00:03:50,165][04017] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-03-21 00:03:50,521][03784] Fps is (10 sec: 52429.1, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 222363648. Throughput: 0: 43697.8. Samples: 223594300. Policy #0 lag: (min: 0.0, avg: 39.1, max: 102.0) [2024-03-21 00:03:50,522][03784] Avg episode reward: [(0, '0.823')] [2024-03-21 00:03:55,282][04017] Updated weights for policy 0, policy_version 6793 (0.0011) [2024-03-21 00:03:55,521][03784] Fps is (10 sec: 49152.6, 60 sec: 48606.1, 300 sec: 45319.8). Total num frames: 222593024. Throughput: 0: 43953.3. Samples: 223869900. Policy #0 lag: (min: 1.0, avg: 30.4, max: 67.0) [2024-03-21 00:03:55,522][03784] Avg episode reward: [(0, '0.933')] [2024-03-21 00:04:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 222822400. Throughput: 0: 43764.4. Samples: 224006100. 
Policy #0 lag: (min: 1.0, avg: 41.3, max: 83.0) [2024-03-21 00:04:00,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-21 00:04:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006800_222822400.pth... [2024-03-21 00:04:00,693][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006468_211943424.pth [2024-03-21 00:04:02,136][04017] Updated weights for policy 0, policy_version 6803 (0.0011) [2024-03-21 00:04:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.4, 300 sec: 46208.4). Total num frames: 223117312. Throughput: 0: 43335.6. Samples: 224280800. Policy #0 lag: (min: 1.0, avg: 35.5, max: 71.0) [2024-03-21 00:04:05,522][03784] Avg episode reward: [(0, '0.530')] [2024-03-21 00:04:09,725][04017] Updated weights for policy 0, policy_version 6813 (0.0021) [2024-03-21 00:04:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 223281152. Throughput: 0: 42926.7. Samples: 224535700. Policy #0 lag: (min: 1.0, avg: 35.5, max: 71.0) [2024-03-21 00:04:10,522][03784] Avg episode reward: [(0, '0.757')] [2024-03-21 00:04:15,521][03784] Fps is (10 sec: 22937.4, 60 sec: 40413.8, 300 sec: 45319.8). Total num frames: 223346688. Throughput: 0: 43399.9. Samples: 224689400. Policy #0 lag: (min: 0.0, avg: 47.7, max: 114.0) [2024-03-21 00:04:15,522][03784] Avg episode reward: [(0, '0.369')] [2024-03-21 00:04:20,398][04017] Updated weights for policy 0, policy_version 6823 (0.0018) [2024-03-21 00:04:20,521][03784] Fps is (10 sec: 29490.6, 60 sec: 39321.5, 300 sec: 45319.8). Total num frames: 223576064. Throughput: 0: 44215.5. Samples: 224958000. Policy #0 lag: (min: 0.0, avg: 47.7, max: 114.0) [2024-03-21 00:04:20,522][03784] Avg episode reward: [(0, '0.679')] [2024-03-21 00:04:25,521][03784] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 45319.8). Total num frames: 223772672. Throughput: 0: 43926.8. Samples: 225218800. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 68.0) [2024-03-21 00:04:25,522][03784] Avg episode reward: [(0, '0.641')] [2024-03-21 00:04:27,495][04017] Updated weights for policy 0, policy_version 6833 (0.0025) [2024-03-21 00:04:30,521][03784] Fps is (10 sec: 49153.0, 60 sec: 42598.4, 300 sec: 45208.7). Total num frames: 224067584. Throughput: 0: 44160.2. Samples: 225349800. Policy #0 lag: (min: 0.0, avg: 40.4, max: 78.0) [2024-03-21 00:04:30,527][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 00:04:31,915][04017] Updated weights for policy 0, policy_version 6843 (0.0016) [2024-03-21 00:04:35,521][03784] Fps is (10 sec: 58982.4, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 224362496. Throughput: 0: 44811.1. Samples: 225610800. Policy #0 lag: (min: 0.0, avg: 40.4, max: 78.0) [2024-03-21 00:04:35,522][03784] Avg episode reward: [(0, '0.520')] [2024-03-21 00:04:39,728][04017] Updated weights for policy 0, policy_version 6853 (0.0011) [2024-03-21 00:04:40,521][03784] Fps is (10 sec: 52427.9, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 224591872. Throughput: 0: 45091.0. Samples: 225899000. Policy #0 lag: (min: 1.0, avg: 29.7, max: 59.0) [2024-03-21 00:04:40,522][03784] Avg episode reward: [(0, '0.519')] [2024-03-21 00:04:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 224788480. Throughput: 0: 44791.1. Samples: 226021700. Policy #0 lag: (min: 0.0, avg: 41.5, max: 77.0) [2024-03-21 00:04:45,522][03784] Avg episode reward: [(0, '0.359')] [2024-03-21 00:04:46,884][03995] Signal inference workers to stop experience collection... (4550 times) [2024-03-21 00:04:46,947][04017] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-03-21 00:04:47,204][03995] Signal inference workers to resume experience collection... 
(4550 times) [2024-03-21 00:04:47,204][04017] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-03-21 00:04:47,799][04017] Updated weights for policy 0, policy_version 6863 (0.0028) [2024-03-21 00:04:50,521][03784] Fps is (10 sec: 52429.3, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 225116160. Throughput: 0: 44706.6. Samples: 226292600. Policy #0 lag: (min: 0.0, avg: 45.2, max: 106.0) [2024-03-21 00:04:50,522][03784] Avg episode reward: [(0, '0.359')] [2024-03-21 00:04:52,464][04017] Updated weights for policy 0, policy_version 6873 (0.0014) [2024-03-21 00:04:55,521][03784] Fps is (10 sec: 65536.0, 60 sec: 47513.5, 300 sec: 45764.1). Total num frames: 225443840. Throughput: 0: 44622.1. Samples: 226543700. Policy #0 lag: (min: 0.0, avg: 54.5, max: 116.0) [2024-03-21 00:04:55,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 00:04:59,476][04017] Updated weights for policy 0, policy_version 6883 (0.0011) [2024-03-21 00:05:00,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 225574912. Throughput: 0: 44115.6. Samples: 226674600. Policy #0 lag: (min: 0.0, avg: 54.5, max: 116.0) [2024-03-21 00:05:00,522][03784] Avg episode reward: [(0, '0.422')] [2024-03-21 00:05:05,521][03784] Fps is (10 sec: 16384.2, 60 sec: 41506.2, 300 sec: 45097.7). Total num frames: 225607680. Throughput: 0: 44324.7. Samples: 226952600. Policy #0 lag: (min: 0.0, avg: 41.3, max: 77.0) [2024-03-21 00:05:05,522][03784] Avg episode reward: [(0, '0.424')] [2024-03-21 00:05:10,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42598.3, 300 sec: 44986.6). Total num frames: 225837056. Throughput: 0: 44251.1. Samples: 227210100. Policy #0 lag: (min: 0.0, avg: 41.3, max: 77.0) [2024-03-21 00:05:10,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 00:05:10,732][04017] Updated weights for policy 0, policy_version 6893 (0.0013) [2024-03-21 00:05:15,521][03784] Fps is (10 sec: 42598.0, 60 sec: 44783.0, 300 sec: 44764.4). 
Total num frames: 226033664. Throughput: 0: 44635.5. Samples: 227358400. Policy #0 lag: (min: 0.0, avg: 33.8, max: 68.0) [2024-03-21 00:05:15,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 00:05:16,968][04017] Updated weights for policy 0, policy_version 6903 (0.0010) [2024-03-21 00:05:20,521][03784] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 226328576. Throughput: 0: 45088.8. Samples: 227639800. Policy #0 lag: (min: 1.0, avg: 42.0, max: 90.0) [2024-03-21 00:05:20,522][03784] Avg episode reward: [(0, '0.841')] [2024-03-21 00:05:25,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 226492416. Throughput: 0: 44889.0. Samples: 227919000. Policy #0 lag: (min: 1.0, avg: 42.0, max: 90.0) [2024-03-21 00:05:25,522][03784] Avg episode reward: [(0, '0.413')] [2024-03-21 00:05:25,583][04017] Updated weights for policy 0, policy_version 6913 (0.0014) [2024-03-21 00:05:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44782.8, 300 sec: 45875.2). Total num frames: 226754560. Throughput: 0: 45302.1. Samples: 228060300. Policy #0 lag: (min: 1.0, avg: 23.2, max: 48.0) [2024-03-21 00:05:30,522][03784] Avg episode reward: [(0, '1.050')] [2024-03-21 00:05:31,543][04017] Updated weights for policy 0, policy_version 6923 (0.0011) [2024-03-21 00:05:35,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45329.0, 300 sec: 46097.3). Total num frames: 227082240. Throughput: 0: 45844.3. Samples: 228355600. Policy #0 lag: (min: 0.0, avg: 49.6, max: 107.0) [2024-03-21 00:05:35,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 00:05:37,578][04017] Updated weights for policy 0, policy_version 6933 (0.0017) [2024-03-21 00:05:37,636][03995] Signal inference workers to stop experience collection... (4600 times) [2024-03-21 00:05:37,637][03995] Signal inference workers to resume experience collection... 
(4600 times) [2024-03-21 00:05:37,743][04017] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-03-21 00:05:37,743][04017] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-03-21 00:05:40,521][03784] Fps is (10 sec: 58983.2, 60 sec: 45875.3, 300 sec: 45875.2). Total num frames: 227344384. Throughput: 0: 46453.4. Samples: 228634100. Policy #0 lag: (min: 0.0, avg: 49.6, max: 107.0) [2024-03-21 00:05:40,522][03784] Avg episode reward: [(0, '0.457')] [2024-03-21 00:05:41,952][04017] Updated weights for policy 0, policy_version 6943 (0.0016) [2024-03-21 00:05:45,521][03784] Fps is (10 sec: 58982.9, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 227672064. Throughput: 0: 46595.5. Samples: 228771400. Policy #0 lag: (min: 1.0, avg: 35.3, max: 67.0) [2024-03-21 00:05:45,522][03784] Avg episode reward: [(0, '0.457')] [2024-03-21 00:05:48,368][04017] Updated weights for policy 0, policy_version 6953 (0.0011) [2024-03-21 00:05:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 227868672. Throughput: 0: 46693.2. Samples: 229053800. Policy #0 lag: (min: 1.0, avg: 38.0, max: 68.0) [2024-03-21 00:05:50,522][03784] Avg episode reward: [(0, '0.749')] [2024-03-21 00:05:55,521][03784] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 228098048. Throughput: 0: 47408.9. Samples: 229343500. Policy #0 lag: (min: 0.0, avg: 49.8, max: 116.0) [2024-03-21 00:05:55,522][03784] Avg episode reward: [(0, '0.749')] [2024-03-21 00:05:56,208][04017] Updated weights for policy 0, policy_version 6963 (0.0011) [2024-03-21 00:06:00,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 228327424. Throughput: 0: 47380.0. Samples: 229490500. 
Policy #0 lag: (min: 2.0, avg: 42.0, max: 91.0) [2024-03-21 00:06:00,522][03784] Avg episode reward: [(0, '0.749')] [2024-03-21 00:06:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006968_228327424.pth... [2024-03-21 00:06:00,648][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006630_217251840.pth [2024-03-21 00:06:05,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 228425728. Throughput: 0: 47497.9. Samples: 229777200. Policy #0 lag: (min: 2.0, avg: 42.0, max: 91.0) [2024-03-21 00:06:05,522][03784] Avg episode reward: [(0, '0.684')] [2024-03-21 00:06:05,912][04017] Updated weights for policy 0, policy_version 6973 (0.0010) [2024-03-21 00:06:10,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46967.4, 300 sec: 45541.9). Total num frames: 228655104. Throughput: 0: 47266.6. Samples: 230046000. Policy #0 lag: (min: 0.0, avg: 32.5, max: 72.0) [2024-03-21 00:06:10,522][03784] Avg episode reward: [(0, '0.796')] [2024-03-21 00:06:14,256][04017] Updated weights for policy 0, policy_version 6983 (0.0009) [2024-03-21 00:06:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 228884480. Throughput: 0: 47386.8. Samples: 230192700. Policy #0 lag: (min: 2.0, avg: 44.4, max: 91.0) [2024-03-21 00:06:15,522][03784] Avg episode reward: [(0, '1.008')] [2024-03-21 00:06:19,101][04017] Updated weights for policy 0, policy_version 6993 (0.0020) [2024-03-21 00:06:20,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.6, 300 sec: 45319.8). Total num frames: 229146624. Throughput: 0: 46695.7. Samples: 230456900. Policy #0 lag: (min: 2.0, avg: 44.4, max: 91.0) [2024-03-21 00:06:20,522][03784] Avg episode reward: [(0, '0.311')] [2024-03-21 00:06:25,521][03784] Fps is (10 sec: 55705.3, 60 sec: 49152.0, 300 sec: 45986.3). Total num frames: 229441536. Throughput: 0: 47035.5. Samples: 230750700. 
Policy #0 lag: (min: 0.0, avg: 34.1, max: 73.0) [2024-03-21 00:06:25,522][03784] Avg episode reward: [(0, '0.311')] [2024-03-21 00:06:25,648][04017] Updated weights for policy 0, policy_version 7003 (0.0014) [2024-03-21 00:06:30,459][03995] Signal inference workers to stop experience collection... (4650 times) [2024-03-21 00:06:30,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 45542.0). Total num frames: 229638144. Throughput: 0: 47273.4. Samples: 230898700. Policy #0 lag: (min: 0.0, avg: 52.1, max: 116.0) [2024-03-21 00:06:30,522][03995] Signal inference workers to resume experience collection... (4650 times) [2024-03-21 00:06:30,522][03784] Avg episode reward: [(0, '0.311')] [2024-03-21 00:06:30,561][04017] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-03-21 00:06:30,611][04017] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-03-21 00:06:32,099][04017] Updated weights for policy 0, policy_version 7013 (0.0015) [2024-03-21 00:06:35,521][03784] Fps is (10 sec: 42598.8, 60 sec: 46421.5, 300 sec: 46097.4). Total num frames: 229867520. Throughput: 0: 47231.1. Samples: 231179200. Policy #0 lag: (min: 0.0, avg: 52.1, max: 116.0) [2024-03-21 00:06:35,522][03784] Avg episode reward: [(0, '0.706')] [2024-03-21 00:06:38,930][04017] Updated weights for policy 0, policy_version 7023 (0.0015) [2024-03-21 00:06:40,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 230195200. Throughput: 0: 47011.1. Samples: 231459000. Policy #0 lag: (min: 0.0, avg: 33.4, max: 65.0) [2024-03-21 00:06:40,522][03784] Avg episode reward: [(0, '0.328')] [2024-03-21 00:06:45,381][04017] Updated weights for policy 0, policy_version 7033 (0.0010) [2024-03-21 00:06:45,521][03784] Fps is (10 sec: 58982.1, 60 sec: 46421.4, 300 sec: 46097.3). Total num frames: 230457344. Throughput: 0: 47162.2. Samples: 231612800. 
Policy #0 lag: (min: 0.0, avg: 35.2, max: 67.0) [2024-03-21 00:06:45,522][03784] Avg episode reward: [(0, '0.328')] [2024-03-21 00:06:50,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 230686720. Throughput: 0: 46977.8. Samples: 231891200. Policy #0 lag: (min: 0.0, avg: 43.9, max: 103.0) [2024-03-21 00:06:50,522][03784] Avg episode reward: [(0, '0.894')] [2024-03-21 00:06:53,604][04017] Updated weights for policy 0, policy_version 7043 (0.0016) [2024-03-21 00:06:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 230916096. Throughput: 0: 47037.8. Samples: 232162700. Policy #0 lag: (min: 0.0, avg: 43.9, max: 103.0) [2024-03-21 00:06:55,522][03784] Avg episode reward: [(0, '0.730')] [2024-03-21 00:07:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 231079936. Throughput: 0: 46551.0. Samples: 232287500. Policy #0 lag: (min: 2.0, avg: 43.6, max: 93.0) [2024-03-21 00:07:00,522][03784] Avg episode reward: [(0, '0.929')] [2024-03-21 00:07:00,721][04017] Updated weights for policy 0, policy_version 7053 (0.0017) [2024-03-21 00:07:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48059.7, 300 sec: 45653.0). Total num frames: 231309312. Throughput: 0: 46044.4. Samples: 232528900. Policy #0 lag: (min: 1.0, avg: 43.0, max: 92.0) [2024-03-21 00:07:05,522][03784] Avg episode reward: [(0, '0.608')] [2024-03-21 00:07:10,235][04017] Updated weights for policy 0, policy_version 7063 (0.0014) [2024-03-21 00:07:10,521][03784] Fps is (10 sec: 39322.8, 60 sec: 46967.7, 300 sec: 44986.6). Total num frames: 231473152. Throughput: 0: 46238.0. Samples: 232831400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 78.0) [2024-03-21 00:07:10,521][03784] Avg episode reward: [(0, '0.934')] [2024-03-21 00:07:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46421.4, 300 sec: 44986.6). Total num frames: 231669760. Throughput: 0: 45944.4. Samples: 232966200. 
Policy #0 lag: (min: 0.0, avg: 34.9, max: 78.0) [2024-03-21 00:07:15,522][03784] Avg episode reward: [(0, '0.632')] [2024-03-21 00:07:16,472][04017] Updated weights for policy 0, policy_version 7073 (0.0014) [2024-03-21 00:07:20,521][03784] Fps is (10 sec: 55704.6, 60 sec: 48059.7, 300 sec: 45875.2). Total num frames: 232030208. Throughput: 0: 45702.2. Samples: 233235800. Policy #0 lag: (min: 1.0, avg: 34.5, max: 82.0) [2024-03-21 00:07:20,522][03784] Avg episode reward: [(0, '1.135')] [2024-03-21 00:07:22,508][04017] Updated weights for policy 0, policy_version 7083 (0.0011) [2024-03-21 00:07:22,863][03995] Signal inference workers to stop experience collection... (4700 times) [2024-03-21 00:07:22,864][03995] Signal inference workers to resume experience collection... (4700 times) [2024-03-21 00:07:23,046][04017] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-03-21 00:07:23,047][04017] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-03-21 00:07:25,521][03784] Fps is (10 sec: 49151.7, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 232161280. Throughput: 0: 45668.8. Samples: 233514100. Policy #0 lag: (min: 0.0, avg: 38.0, max: 78.0) [2024-03-21 00:07:25,522][03784] Avg episode reward: [(0, '0.938')] [2024-03-21 00:07:28,526][04017] Updated weights for policy 0, policy_version 7093 (0.0015) [2024-03-21 00:07:30,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 232423424. Throughput: 0: 44706.6. Samples: 233624600. Policy #0 lag: (min: 0.0, avg: 38.0, max: 78.0) [2024-03-21 00:07:30,522][03784] Avg episode reward: [(0, '0.732')] [2024-03-21 00:07:35,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46421.2, 300 sec: 45430.9). Total num frames: 232652800. Throughput: 0: 44886.6. Samples: 233911100. 
Policy #0 lag: (min: 0.0, avg: 37.8, max: 74.0) [2024-03-21 00:07:35,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 00:07:36,402][04017] Updated weights for policy 0, policy_version 7103 (0.0024) [2024-03-21 00:07:40,521][03784] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 232882176. Throughput: 0: 45406.7. Samples: 234206000. Policy #0 lag: (min: 0.0, avg: 44.7, max: 93.0) [2024-03-21 00:07:40,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 00:07:43,932][04017] Updated weights for policy 0, policy_version 7113 (0.0011) [2024-03-21 00:07:45,521][03784] Fps is (10 sec: 49152.5, 60 sec: 44783.0, 300 sec: 45986.3). Total num frames: 233144320. Throughput: 0: 45857.9. Samples: 234351100. Policy #0 lag: (min: 3.0, avg: 32.9, max: 92.0) [2024-03-21 00:07:45,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 00:07:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 45986.3). Total num frames: 233242624. Throughput: 0: 46489.0. Samples: 234620900. Policy #0 lag: (min: 3.0, avg: 32.9, max: 92.0) [2024-03-21 00:07:50,522][03784] Avg episode reward: [(0, '0.624')] [2024-03-21 00:07:54,726][04017] Updated weights for policy 0, policy_version 7123 (0.0011) [2024-03-21 00:07:55,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42598.4, 300 sec: 45653.0). Total num frames: 233472000. Throughput: 0: 45610.9. Samples: 234883900. Policy #0 lag: (min: 0.0, avg: 34.8, max: 100.0) [2024-03-21 00:07:55,522][03784] Avg episode reward: [(0, '0.680')] [2024-03-21 00:08:00,521][03784] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 233701376. Throughput: 0: 45044.4. Samples: 234993200. 
Policy #0 lag: (min: 0.0, avg: 34.8, max: 100.0) [2024-03-21 00:08:00,522][03784] Avg episode reward: [(0, '0.503')] [2024-03-21 00:08:00,620][04017] Updated weights for policy 0, policy_version 7133 (0.0012) [2024-03-21 00:08:00,925][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007134_233766912.pth... [2024-03-21 00:08:01,027][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006800_222822400.pth [2024-03-21 00:08:05,521][03784] Fps is (10 sec: 49151.5, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 233963520. Throughput: 0: 45075.5. Samples: 235264200. Policy #0 lag: (min: 0.0, avg: 49.9, max: 97.0) [2024-03-21 00:08:05,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 00:08:07,489][04017] Updated weights for policy 0, policy_version 7143 (0.0012) [2024-03-21 00:08:10,521][03784] Fps is (10 sec: 58982.1, 60 sec: 46967.3, 300 sec: 45319.8). Total num frames: 234291200. Throughput: 0: 44695.5. Samples: 235525400. Policy #0 lag: (min: 0.0, avg: 32.0, max: 81.0) [2024-03-21 00:08:10,522][03784] Avg episode reward: [(0, '1.046')] [2024-03-21 00:08:12,134][04017] Updated weights for policy 0, policy_version 7153 (0.0014) [2024-03-21 00:08:13,091][03995] Signal inference workers to stop experience collection... (4750 times) [2024-03-21 00:08:13,091][03995] Signal inference workers to resume experience collection... (4750 times) [2024-03-21 00:08:13,284][04017] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-03-21 00:08:13,284][04017] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-03-21 00:08:15,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48059.6, 300 sec: 45208.7). Total num frames: 234553344. Throughput: 0: 45199.9. Samples: 235658600. 
Policy #0 lag: (min: 0.0, avg: 39.7, max: 86.0) [2024-03-21 00:08:15,522][03784] Avg episode reward: [(0, '0.957')] [2024-03-21 00:08:18,312][04017] Updated weights for policy 0, policy_version 7163 (0.0013) [2024-03-21 00:08:20,521][03784] Fps is (10 sec: 58982.8, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 234881024. Throughput: 0: 43973.4. Samples: 235889900. Policy #0 lag: (min: 0.0, avg: 39.7, max: 86.0) [2024-03-21 00:08:20,522][03784] Avg episode reward: [(0, '0.415')] [2024-03-21 00:08:25,149][04017] Updated weights for policy 0, policy_version 7173 (0.0013) [2024-03-21 00:08:25,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 235044864. Throughput: 0: 43197.8. Samples: 236149900. Policy #0 lag: (min: 0.0, avg: 38.7, max: 75.0) [2024-03-21 00:08:25,522][03784] Avg episode reward: [(0, '0.785')] [2024-03-21 00:08:30,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 235208704. Throughput: 0: 43097.8. Samples: 236290500. Policy #0 lag: (min: 2.0, avg: 44.0, max: 78.0) [2024-03-21 00:08:30,522][03784] Avg episode reward: [(0, '0.860')] [2024-03-21 00:08:35,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.9, 300 sec: 45653.1). Total num frames: 235307008. Throughput: 0: 43255.5. Samples: 236567400. Policy #0 lag: (min: 2.0, avg: 44.0, max: 78.0) [2024-03-21 00:08:35,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 00:08:37,287][04017] Updated weights for policy 0, policy_version 7183 (0.0010) [2024-03-21 00:08:40,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.6, 300 sec: 45430.9). Total num frames: 235503616. Throughput: 0: 43708.8. Samples: 236850800. Policy #0 lag: (min: 0.0, avg: 41.3, max: 87.0) [2024-03-21 00:08:40,522][03784] Avg episode reward: [(0, '1.028')] [2024-03-21 00:08:45,521][03784] Fps is (10 sec: 29491.1, 60 sec: 40960.0, 300 sec: 44875.5). Total num frames: 235601920. Throughput: 0: 44255.6. Samples: 236984700. 
Policy #0 lag: (min: 0.0, avg: 33.3, max: 102.0) [2024-03-21 00:08:45,522][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 00:08:48,108][04017] Updated weights for policy 0, policy_version 7193 (0.0009) [2024-03-21 00:08:50,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 44986.6). Total num frames: 235864064. Throughput: 0: 44193.3. Samples: 237252900. Policy #0 lag: (min: 4.0, avg: 41.3, max: 95.0) [2024-03-21 00:08:50,522][03784] Avg episode reward: [(0, '0.404')] [2024-03-21 00:08:53,077][04017] Updated weights for policy 0, policy_version 7203 (0.0016) [2024-03-21 00:08:55,521][03784] Fps is (10 sec: 55705.9, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 236158976. Throughput: 0: 44015.6. Samples: 237506100. Policy #0 lag: (min: 4.0, avg: 41.3, max: 95.0) [2024-03-21 00:08:55,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 00:08:59,127][04017] Updated weights for policy 0, policy_version 7213 (0.0010) [2024-03-21 00:09:00,521][03784] Fps is (10 sec: 62260.1, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 236486656. Throughput: 0: 44257.9. Samples: 237650200. Policy #0 lag: (min: 0.0, avg: 36.3, max: 79.0) [2024-03-21 00:09:00,522][03784] Avg episode reward: [(0, '0.312')] [2024-03-21 00:09:05,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44236.9, 300 sec: 45208.7). Total num frames: 236617728. Throughput: 0: 45017.8. Samples: 237915700. Policy #0 lag: (min: 0.0, avg: 36.3, max: 79.0) [2024-03-21 00:09:05,522][03784] Avg episode reward: [(0, '0.486')] [2024-03-21 00:09:07,508][04017] Updated weights for policy 0, policy_version 7223 (0.0010) [2024-03-21 00:09:10,521][03784] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 45986.3). Total num frames: 236912640. Throughput: 0: 45735.4. Samples: 238208000. Policy #0 lag: (min: 0.0, avg: 50.3, max: 111.0) [2024-03-21 00:09:10,522][03784] Avg episode reward: [(0, '0.923')] [2024-03-21 00:09:11,307][03995] Signal inference workers to stop experience collection... 
(4800 times) [2024-03-21 00:09:11,395][04017] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-03-21 00:09:11,546][03995] Signal inference workers to resume experience collection... (4800 times) [2024-03-21 00:09:11,546][04017] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-03-21 00:09:11,547][04017] Updated weights for policy 0, policy_version 7233 (0.0015) [2024-03-21 00:09:15,521][03784] Fps is (10 sec: 65535.4, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 237273088. Throughput: 0: 45453.3. Samples: 238335900. Policy #0 lag: (min: 2.0, avg: 45.9, max: 117.0) [2024-03-21 00:09:15,522][03784] Avg episode reward: [(0, '1.065')] [2024-03-21 00:09:15,993][04017] Updated weights for policy 0, policy_version 7243 (0.0015) [2024-03-21 00:09:20,521][03784] Fps is (10 sec: 58983.2, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 237502464. Throughput: 0: 45553.3. Samples: 238617300. Policy #0 lag: (min: 2.0, avg: 45.9, max: 117.0) [2024-03-21 00:09:20,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 00:09:25,480][04017] Updated weights for policy 0, policy_version 7253 (0.0016) [2024-03-21 00:09:25,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 46097.3). Total num frames: 237666304. Throughput: 0: 45900.0. Samples: 238916300. Policy #0 lag: (min: 0.0, avg: 41.5, max: 75.0) [2024-03-21 00:09:25,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 00:09:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 237764608. Throughput: 0: 46173.4. Samples: 239062500. Policy #0 lag: (min: 0.0, avg: 33.0, max: 69.0) [2024-03-21 00:09:30,522][03784] Avg episode reward: [(0, '0.760')] [2024-03-21 00:09:34,367][04017] Updated weights for policy 0, policy_version 7263 (0.0011) [2024-03-21 00:09:35,521][03784] Fps is (10 sec: 39321.9, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 238059520. Throughput: 0: 46400.1. Samples: 239340900. 
Policy #0 lag: (min: 2.0, avg: 31.1, max: 73.0) [2024-03-21 00:09:35,522][03784] Avg episode reward: [(0, '0.581')] [2024-03-21 00:09:40,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 238256128. Throughput: 0: 46451.0. Samples: 239596400. Policy #0 lag: (min: 2.0, avg: 31.1, max: 73.0) [2024-03-21 00:09:40,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 00:09:41,215][04017] Updated weights for policy 0, policy_version 7273 (0.0015) [2024-03-21 00:09:45,521][03784] Fps is (10 sec: 55705.2, 60 sec: 50244.2, 300 sec: 45764.1). Total num frames: 238616576. Throughput: 0: 46039.9. Samples: 239722000. Policy #0 lag: (min: 2.0, avg: 31.9, max: 69.0) [2024-03-21 00:09:45,522][03784] Avg episode reward: [(0, '0.565')] [2024-03-21 00:09:45,759][04017] Updated weights for policy 0, policy_version 7283 (0.0018) [2024-03-21 00:09:50,521][03784] Fps is (10 sec: 68813.1, 60 sec: 51336.6, 300 sec: 45764.1). Total num frames: 238944256. Throughput: 0: 45555.5. Samples: 239965700. Policy #0 lag: (min: 0.0, avg: 33.2, max: 69.0) [2024-03-21 00:09:50,522][03784] Avg episode reward: [(0, '0.544')] [2024-03-21 00:09:52,353][04017] Updated weights for policy 0, policy_version 7293 (0.0012) [2024-03-21 00:09:55,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48059.7, 300 sec: 45653.1). Total num frames: 239042560. Throughput: 0: 45629.0. Samples: 240261300. Policy #0 lag: (min: 0.0, avg: 33.2, max: 69.0) [2024-03-21 00:09:55,522][03784] Avg episode reward: [(0, '0.709')] [2024-03-21 00:10:00,521][03784] Fps is (10 sec: 19660.7, 60 sec: 44236.7, 300 sec: 45875.2). Total num frames: 239140864. Throughput: 0: 46240.0. Samples: 240416700. Policy #0 lag: (min: 0.0, avg: 35.3, max: 94.0) [2024-03-21 00:10:00,522][03784] Avg episode reward: [(0, '0.354')] [2024-03-21 00:10:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007298_239140864.pth... 
[2024-03-21 00:10:00,668][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000006968_228327424.pth [2024-03-21 00:10:03,333][04017] Updated weights for policy 0, policy_version 7303 (0.0010) [2024-03-21 00:10:05,521][03784] Fps is (10 sec: 32767.5, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 239370240. Throughput: 0: 46579.9. Samples: 240713400. Policy #0 lag: (min: 0.0, avg: 32.3, max: 86.0) [2024-03-21 00:10:05,522][03784] Avg episode reward: [(0, '0.844')] [2024-03-21 00:10:08,272][04017] Updated weights for policy 0, policy_version 7313 (0.0020) [2024-03-21 00:10:10,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 239697920. Throughput: 0: 45937.8. Samples: 240983500. Policy #0 lag: (min: 0.0, avg: 32.3, max: 86.0) [2024-03-21 00:10:10,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 00:10:11,364][03995] Signal inference workers to stop experience collection... (4850 times) [2024-03-21 00:10:11,365][03995] Signal inference workers to resume experience collection... (4850 times) [2024-03-21 00:10:11,421][04017] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-03-21 00:10:11,421][04017] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-03-21 00:10:15,521][03784] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 45653.1). Total num frames: 239796224. Throughput: 0: 45915.6. Samples: 241128700. Policy #0 lag: (min: 1.0, avg: 33.6, max: 66.0) [2024-03-21 00:10:15,522][03784] Avg episode reward: [(0, '0.404')] [2024-03-21 00:10:16,958][04017] Updated weights for policy 0, policy_version 7323 (0.0013) [2024-03-21 00:10:20,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42052.2, 300 sec: 45875.2). Total num frames: 240025600. Throughput: 0: 45755.5. Samples: 241399900. 
Policy #0 lag: (min: 1.0, avg: 33.6, max: 66.0) [2024-03-21 00:10:20,522][03784] Avg episode reward: [(0, '0.541')] [2024-03-21 00:10:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 45764.1). Total num frames: 240254976. Throughput: 0: 46248.9. Samples: 241677600. Policy #0 lag: (min: 0.0, avg: 48.1, max: 114.0) [2024-03-21 00:10:25,522][03784] Avg episode reward: [(0, '0.445')] [2024-03-21 00:10:26,139][04017] Updated weights for policy 0, policy_version 7333 (0.0017) [2024-03-21 00:10:30,521][03784] Fps is (10 sec: 42598.9, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 240451584. Throughput: 0: 46857.9. Samples: 241830600. Policy #0 lag: (min: 0.0, avg: 46.9, max: 112.0) [2024-03-21 00:10:30,522][03784] Avg episode reward: [(0, '0.399')] [2024-03-21 00:10:32,377][04017] Updated weights for policy 0, policy_version 7343 (0.0011) [2024-03-21 00:10:35,521][03784] Fps is (10 sec: 58982.7, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 240844800. Throughput: 0: 47757.8. Samples: 242114800. Policy #0 lag: (min: 4.0, avg: 44.7, max: 93.0) [2024-03-21 00:10:35,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 00:10:36,764][04017] Updated weights for policy 0, policy_version 7353 (0.0017) [2024-03-21 00:10:40,521][03784] Fps is (10 sec: 72088.9, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 241172480. Throughput: 0: 47215.5. Samples: 242386000. Policy #0 lag: (min: 4.0, avg: 44.7, max: 93.0) [2024-03-21 00:10:40,522][03784] Avg episode reward: [(0, '0.509')] [2024-03-21 00:10:41,656][04017] Updated weights for policy 0, policy_version 7363 (0.0017) [2024-03-21 00:10:45,521][03784] Fps is (10 sec: 65535.9, 60 sec: 48059.8, 300 sec: 46208.4). Total num frames: 241500160. Throughput: 0: 46766.7. Samples: 242521200. 
Policy #0 lag: (min: 1.0, avg: 45.6, max: 82.0) [2024-03-21 00:10:45,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 00:10:48,811][04017] Updated weights for policy 0, policy_version 7373 (0.0015) [2024-03-21 00:10:50,521][03784] Fps is (10 sec: 58982.5, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 241762304. Throughput: 0: 46949.0. Samples: 242826100. Policy #0 lag: (min: 0.0, avg: 37.4, max: 75.0) [2024-03-21 00:10:50,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 00:10:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 241860608. Throughput: 0: 47251.1. Samples: 243109800. Policy #0 lag: (min: 0.0, avg: 37.4, max: 75.0) [2024-03-21 00:10:55,522][03784] Avg episode reward: [(0, '0.297')] [2024-03-21 00:10:56,468][04017] Updated weights for policy 0, policy_version 7383 (0.0015) [2024-03-21 00:10:59,333][03995] Signal inference workers to stop experience collection... (4900 times) [2024-03-21 00:10:59,376][04017] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-03-21 00:10:59,582][03995] Signal inference workers to resume experience collection... (4900 times) [2024-03-21 00:10:59,582][04017] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-03-21 00:11:00,521][03784] Fps is (10 sec: 36044.9, 60 sec: 49698.2, 300 sec: 46430.6). Total num frames: 242122752. Throughput: 0: 46804.4. Samples: 243234900. Policy #0 lag: (min: 0.0, avg: 45.8, max: 89.0) [2024-03-21 00:11:00,522][03784] Avg episode reward: [(0, '0.923')] [2024-03-21 00:11:05,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.7, 300 sec: 45986.3). Total num frames: 242221056. Throughput: 0: 47122.3. Samples: 243520400. 
Policy #0 lag: (min: 0.0, avg: 38.6, max: 87.0) [2024-03-21 00:11:05,522][03784] Avg episode reward: [(0, '0.577')] [2024-03-21 00:11:06,465][04017] Updated weights for policy 0, policy_version 7393 (0.0009) [2024-03-21 00:11:10,521][03784] Fps is (10 sec: 32767.8, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 242450432. Throughput: 0: 47137.8. Samples: 243798800. Policy #0 lag: (min: 0.0, avg: 38.6, max: 87.0) [2024-03-21 00:11:10,522][03784] Avg episode reward: [(0, '1.192')] [2024-03-21 00:11:10,688][03995] Saving new best policy, reward=1.192! [2024-03-21 00:11:13,664][04017] Updated weights for policy 0, policy_version 7403 (0.0014) [2024-03-21 00:11:15,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 242679808. Throughput: 0: 46906.7. Samples: 243941400. Policy #0 lag: (min: 1.0, avg: 41.0, max: 81.0) [2024-03-21 00:11:15,522][03784] Avg episode reward: [(0, '0.754')] [2024-03-21 00:11:20,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 242810880. Throughput: 0: 46675.5. Samples: 244215200. Policy #0 lag: (min: 0.0, avg: 30.9, max: 90.0) [2024-03-21 00:11:20,522][03784] Avg episode reward: [(0, '0.867')] [2024-03-21 00:11:23,970][04017] Updated weights for policy 0, policy_version 7413 (0.0011) [2024-03-21 00:11:25,521][03784] Fps is (10 sec: 39321.0, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 243073024. Throughput: 0: 47204.4. Samples: 244510200. Policy #0 lag: (min: 0.0, avg: 30.9, max: 90.0) [2024-03-21 00:11:25,522][03784] Avg episode reward: [(0, '0.867')] [2024-03-21 00:11:27,432][04017] Updated weights for policy 0, policy_version 7423 (0.0020) [2024-03-21 00:11:30,521][03784] Fps is (10 sec: 52428.2, 60 sec: 48059.5, 300 sec: 45653.0). Total num frames: 243335168. Throughput: 0: 47195.4. Samples: 244645000. 
Policy #0 lag: (min: 0.0, avg: 37.8, max: 93.0) [2024-03-21 00:11:30,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 00:11:34,548][04017] Updated weights for policy 0, policy_version 7433 (0.0021) [2024-03-21 00:11:35,521][03784] Fps is (10 sec: 58982.9, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 243662848. Throughput: 0: 46800.1. Samples: 244932100. Policy #0 lag: (min: 0.0, avg: 42.8, max: 102.0) [2024-03-21 00:11:35,522][03784] Avg episode reward: [(0, '0.659')] [2024-03-21 00:11:37,744][04017] Updated weights for policy 0, policy_version 7443 (0.0018) [2024-03-21 00:11:40,521][03784] Fps is (10 sec: 78645.0, 60 sec: 49152.1, 300 sec: 46319.5). Total num frames: 244121600. Throughput: 0: 45393.4. Samples: 245152500. Policy #0 lag: (min: 0.0, avg: 42.8, max: 102.0) [2024-03-21 00:11:40,522][03784] Avg episode reward: [(0, '0.572')] [2024-03-21 00:11:45,011][04017] Updated weights for policy 0, policy_version 7453 (0.0010) [2024-03-21 00:11:45,521][03784] Fps is (10 sec: 58982.0, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 244252672. Throughput: 0: 45991.1. Samples: 245304500. Policy #0 lag: (min: 0.0, avg: 41.7, max: 75.0) [2024-03-21 00:11:45,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 00:11:47,661][03995] Signal inference workers to stop experience collection... (4950 times) [2024-03-21 00:11:47,728][04017] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-03-21 00:11:47,928][03995] Signal inference workers to resume experience collection... (4950 times) [2024-03-21 00:11:47,928][04017] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-03-21 00:11:50,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45875.2, 300 sec: 46097.3). Total num frames: 244514816. Throughput: 0: 45711.0. Samples: 245577400. 
Policy #0 lag: (min: 4.0, avg: 46.5, max: 87.0) [2024-03-21 00:11:50,522][03784] Avg episode reward: [(0, '0.561')] [2024-03-21 00:11:52,155][04017] Updated weights for policy 0, policy_version 7463 (0.0015) [2024-03-21 00:11:55,521][03784] Fps is (10 sec: 42597.5, 60 sec: 46967.3, 300 sec: 46097.3). Total num frames: 244678656. Throughput: 0: 45835.4. Samples: 245861400. Policy #0 lag: (min: 4.0, avg: 46.5, max: 87.0) [2024-03-21 00:11:55,522][03784] Avg episode reward: [(0, '1.134')] [2024-03-21 00:12:00,521][03784] Fps is (10 sec: 29491.4, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 244809728. Throughput: 0: 45855.5. Samples: 246004900. Policy #0 lag: (min: 0.0, avg: 39.5, max: 78.0) [2024-03-21 00:12:00,522][03784] Avg episode reward: [(0, '0.853')] [2024-03-21 00:12:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007471_244809728.pth... [2024-03-21 00:12:00,695][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007134_233766912.pth [2024-03-21 00:12:03,379][04017] Updated weights for policy 0, policy_version 7473 (0.0016) [2024-03-21 00:12:05,521][03784] Fps is (10 sec: 26215.0, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 244940800. Throughput: 0: 45789.0. Samples: 246275700. Policy #0 lag: (min: 0.0, avg: 45.1, max: 90.0) [2024-03-21 00:12:05,522][03784] Avg episode reward: [(0, '0.410')] [2024-03-21 00:12:10,521][03784] Fps is (10 sec: 32767.8, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 245137408. Throughput: 0: 45693.3. Samples: 246566400. Policy #0 lag: (min: 2.0, avg: 45.9, max: 106.0) [2024-03-21 00:12:10,522][03784] Avg episode reward: [(0, '1.085')] [2024-03-21 00:12:11,244][04017] Updated weights for policy 0, policy_version 7483 (0.0010) [2024-03-21 00:12:15,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 45430.9). Total num frames: 245432320. Throughput: 0: 45751.2. Samples: 246703800. 
Policy #0 lag: (min: 2.0, avg: 45.9, max: 106.0) [2024-03-21 00:12:15,522][03784] Avg episode reward: [(0, '0.586')] [2024-03-21 00:12:18,382][04017] Updated weights for policy 0, policy_version 7493 (0.0014) [2024-03-21 00:12:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 245596160. Throughput: 0: 44948.8. Samples: 246954800. Policy #0 lag: (min: 0.0, avg: 29.8, max: 71.0) [2024-03-21 00:12:20,522][03784] Avg episode reward: [(0, '0.342')] [2024-03-21 00:12:22,973][04017] Updated weights for policy 0, policy_version 7503 (0.0011) [2024-03-21 00:12:25,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 245989376. Throughput: 0: 46291.0. Samples: 247235600. Policy #0 lag: (min: 2.0, avg: 34.9, max: 72.0) [2024-03-21 00:12:25,522][03784] Avg episode reward: [(0, '0.953')] [2024-03-21 00:12:29,194][04017] Updated weights for policy 0, policy_version 7513 (0.0011) [2024-03-21 00:12:30,521][03784] Fps is (10 sec: 68814.1, 60 sec: 49152.2, 300 sec: 46208.5). Total num frames: 246284288. Throughput: 0: 46104.6. Samples: 247379200. Policy #0 lag: (min: 2.0, avg: 34.9, max: 72.0) [2024-03-21 00:12:30,521][03784] Avg episode reward: [(0, '0.596')] [2024-03-21 00:12:35,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 246415360. Throughput: 0: 45833.4. Samples: 247639900. Policy #0 lag: (min: 0.0, avg: 35.1, max: 72.0) [2024-03-21 00:12:35,522][03784] Avg episode reward: [(0, '0.784')] [2024-03-21 00:12:38,238][04017] Updated weights for policy 0, policy_version 7523 (0.0019) [2024-03-21 00:12:40,521][03784] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 45875.2). Total num frames: 246677504. Throughput: 0: 45615.8. Samples: 247914100. Policy #0 lag: (min: 0.0, avg: 35.1, max: 72.0) [2024-03-21 00:12:40,522][03784] Avg episode reward: [(0, '0.754')] [2024-03-21 00:12:42,015][03995] Signal inference workers to stop experience collection... 
(5000 times) [2024-03-21 00:12:42,016][03995] Signal inference workers to resume experience collection... (5000 times) [2024-03-21 00:12:42,073][04017] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-03-21 00:12:42,074][04017] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-03-21 00:12:42,363][04017] Updated weights for policy 0, policy_version 7533 (0.0010) [2024-03-21 00:12:45,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 246906880. Throughput: 0: 45275.5. Samples: 248042300. Policy #0 lag: (min: 0.0, avg: 37.0, max: 73.0) [2024-03-21 00:12:45,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 00:12:50,521][03784] Fps is (10 sec: 36044.6, 60 sec: 42052.3, 300 sec: 45986.3). Total num frames: 247037952. Throughput: 0: 45406.6. Samples: 248319000. Policy #0 lag: (min: 0.0, avg: 38.1, max: 68.0) [2024-03-21 00:12:50,522][03784] Avg episode reward: [(0, '0.810')] [2024-03-21 00:12:53,176][04017] Updated weights for policy 0, policy_version 7543 (0.0011) [2024-03-21 00:12:55,521][03784] Fps is (10 sec: 45875.4, 60 sec: 44783.1, 300 sec: 46319.5). Total num frames: 247365632. Throughput: 0: 44457.9. Samples: 248567000. Policy #0 lag: (min: 4.0, avg: 35.5, max: 76.0) [2024-03-21 00:12:55,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-21 00:12:57,569][04017] Updated weights for policy 0, policy_version 7553 (0.0014) [2024-03-21 00:13:00,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 247529472. Throughput: 0: 43895.6. Samples: 248679100. Policy #0 lag: (min: 4.0, avg: 35.5, max: 76.0) [2024-03-21 00:13:00,522][03784] Avg episode reward: [(0, '0.958')] [2024-03-21 00:13:05,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 247660544. Throughput: 0: 44320.0. Samples: 248949200. 
Policy #0 lag: (min: 0.0, avg: 31.7, max: 76.0) [2024-03-21 00:13:05,522][03784] Avg episode reward: [(0, '0.447')] [2024-03-21 00:13:07,165][04017] Updated weights for policy 0, policy_version 7563 (0.0015) [2024-03-21 00:13:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 248053760. Throughput: 0: 43455.5. Samples: 249191100. Policy #0 lag: (min: 0.0, avg: 31.7, max: 76.0) [2024-03-21 00:13:10,522][03784] Avg episode reward: [(0, '0.526')] [2024-03-21 00:13:13,380][04017] Updated weights for policy 0, policy_version 7573 (0.0010) [2024-03-21 00:13:15,521][03784] Fps is (10 sec: 65536.6, 60 sec: 48059.8, 300 sec: 45542.0). Total num frames: 248315904. Throughput: 0: 43422.2. Samples: 249333200. Policy #0 lag: (min: 0.0, avg: 48.1, max: 100.0) [2024-03-21 00:13:15,522][03784] Avg episode reward: [(0, '0.587')] [2024-03-21 00:13:20,152][04017] Updated weights for policy 0, policy_version 7583 (0.0015) [2024-03-21 00:13:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 45653.0). Total num frames: 248512512. Throughput: 0: 43622.2. Samples: 249602900. Policy #0 lag: (min: 0.0, avg: 39.6, max: 83.0) [2024-03-21 00:13:20,522][03784] Avg episode reward: [(0, '0.861')] [2024-03-21 00:13:25,521][03784] Fps is (10 sec: 36044.4, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 248676352. Throughput: 0: 43837.7. Samples: 249886800. Policy #0 lag: (min: 0.0, avg: 39.6, max: 83.0) [2024-03-21 00:13:25,523][03784] Avg episode reward: [(0, '0.919')] [2024-03-21 00:13:28,064][04017] Updated weights for policy 0, policy_version 7593 (0.0015) [2024-03-21 00:13:30,521][03784] Fps is (10 sec: 36045.0, 60 sec: 43144.5, 300 sec: 45986.3). Total num frames: 248872960. Throughput: 0: 43966.7. Samples: 250020800. Policy #0 lag: (min: 0.0, avg: 37.6, max: 94.0) [2024-03-21 00:13:30,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 00:13:35,521][03784] Fps is (10 sec: 22937.8, 60 sec: 41506.2, 300 sec: 45430.9). 
Total num frames: 248905728. Throughput: 0: 44124.5. Samples: 250304600. Policy #0 lag: (min: 0.0, avg: 32.9, max: 69.0) [2024-03-21 00:13:35,522][03784] Avg episode reward: [(0, '0.670')] [2024-03-21 00:13:39,625][03995] Signal inference workers to stop experience collection... (5050 times) [2024-03-21 00:13:39,693][04017] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-03-21 00:13:39,853][03995] Signal inference workers to resume experience collection... (5050 times) [2024-03-21 00:13:39,853][04017] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-03-21 00:13:39,855][04017] Updated weights for policy 0, policy_version 7603 (0.0025) [2024-03-21 00:13:40,521][03784] Fps is (10 sec: 29491.2, 60 sec: 41506.1, 300 sec: 45986.3). Total num frames: 249167872. Throughput: 0: 44244.4. Samples: 250558000. Policy #0 lag: (min: 2.0, avg: 38.5, max: 95.0) [2024-03-21 00:13:40,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 00:13:45,521][03784] Fps is (10 sec: 45874.9, 60 sec: 40960.0, 300 sec: 45764.1). Total num frames: 249364480. Throughput: 0: 44968.9. Samples: 250702700. Policy #0 lag: (min: 2.0, avg: 38.5, max: 95.0) [2024-03-21 00:13:45,522][03784] Avg episode reward: [(0, '1.172')] [2024-03-21 00:13:46,545][04017] Updated weights for policy 0, policy_version 7613 (0.0016) [2024-03-21 00:13:50,521][03784] Fps is (10 sec: 52428.1, 60 sec: 44236.7, 300 sec: 45875.2). Total num frames: 249692160. Throughput: 0: 44599.9. Samples: 250956200. Policy #0 lag: (min: 2.0, avg: 44.2, max: 102.0) [2024-03-21 00:13:50,522][03784] Avg episode reward: [(0, '0.566')] [2024-03-21 00:13:51,597][04017] Updated weights for policy 0, policy_version 7623 (0.0013) [2024-03-21 00:13:55,521][03784] Fps is (10 sec: 52428.6, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 249888768. Throughput: 0: 45700.0. Samples: 251247600. 
Policy #0 lag: (min: 0.0, avg: 33.6, max: 72.0) [2024-03-21 00:13:55,522][03784] Avg episode reward: [(0, '1.028')] [2024-03-21 00:13:58,248][04017] Updated weights for policy 0, policy_version 7633 (0.0010) [2024-03-21 00:14:00,521][03784] Fps is (10 sec: 52429.2, 60 sec: 44782.9, 300 sec: 46097.3). Total num frames: 250216448. Throughput: 0: 45508.8. Samples: 251381100. Policy #0 lag: (min: 0.0, avg: 33.6, max: 72.0) [2024-03-21 00:14:00,522][03784] Avg episode reward: [(0, '0.493')] [2024-03-21 00:14:00,648][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007637_250249216.pth... [2024-03-21 00:14:00,762][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007298_239140864.pth [2024-03-21 00:14:03,822][04017] Updated weights for policy 0, policy_version 7643 (0.0039) [2024-03-21 00:14:05,521][03784] Fps is (10 sec: 72090.1, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 250609664. Throughput: 0: 45615.6. Samples: 251655600. Policy #0 lag: (min: 0.0, avg: 40.9, max: 108.0) [2024-03-21 00:14:05,522][03784] Avg episode reward: [(0, '0.493')] [2024-03-21 00:14:08,776][04017] Updated weights for policy 0, policy_version 7653 (0.0015) [2024-03-21 00:14:10,521][03784] Fps is (10 sec: 58981.9, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 250806272. Throughput: 0: 45506.6. Samples: 251934600. Policy #0 lag: (min: 0.0, avg: 44.3, max: 79.0) [2024-03-21 00:14:10,522][03784] Avg episode reward: [(0, '0.701')] [2024-03-21 00:14:15,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 251068416. Throughput: 0: 45168.8. Samples: 252053400. Policy #0 lag: (min: 0.0, avg: 44.3, max: 79.0) [2024-03-21 00:14:15,522][03784] Avg episode reward: [(0, '1.052')] [2024-03-21 00:14:17,773][04017] Updated weights for policy 0, policy_version 7663 (0.0011) [2024-03-21 00:14:20,521][03784] Fps is (10 sec: 32768.5, 60 sec: 43690.7, 300 sec: 45653.1). 
Total num frames: 251133952. Throughput: 0: 44548.9. Samples: 252309300. Policy #0 lag: (min: 0.0, avg: 40.9, max: 73.0) [2024-03-21 00:14:20,522][03784] Avg episode reward: [(0, '0.424')] [2024-03-21 00:14:25,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44236.9, 300 sec: 45986.3). Total num frames: 251330560. Throughput: 0: 45177.8. Samples: 252591000. Policy #0 lag: (min: 0.0, avg: 29.5, max: 73.0) [2024-03-21 00:14:25,522][03784] Avg episode reward: [(0, '0.946')] [2024-03-21 00:14:28,607][04017] Updated weights for policy 0, policy_version 7673 (0.0019) [2024-03-21 00:14:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 45542.0). Total num frames: 251494400. Throughput: 0: 45113.3. Samples: 252732800. Policy #0 lag: (min: 0.0, avg: 29.5, max: 73.0) [2024-03-21 00:14:30,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-21 00:14:33,477][03995] Signal inference workers to stop experience collection... (5100 times) [2024-03-21 00:14:33,528][04017] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-03-21 00:14:33,732][03995] Signal inference workers to resume experience collection... (5100 times) [2024-03-21 00:14:33,733][04017] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-03-21 00:14:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46967.5, 300 sec: 45653.1). Total num frames: 251723776. Throughput: 0: 45335.7. Samples: 252996300. Policy #0 lag: (min: 0.0, avg: 37.3, max: 106.0) [2024-03-21 00:14:35,522][03784] Avg episode reward: [(0, '0.432')] [2024-03-21 00:14:36,048][04017] Updated weights for policy 0, policy_version 7683 (0.0012) [2024-03-21 00:14:40,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 251953152. Throughput: 0: 45022.3. Samples: 253273600. 
Policy #0 lag: (min: 0.0, avg: 37.3, max: 106.0) [2024-03-21 00:14:40,522][03784] Avg episode reward: [(0, '0.453')] [2024-03-21 00:14:45,087][04017] Updated weights for policy 0, policy_version 7693 (0.0012) [2024-03-21 00:14:45,521][03784] Fps is (10 sec: 39320.6, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 252116992. Throughput: 0: 45402.1. Samples: 253424200. Policy #0 lag: (min: 0.0, avg: 36.9, max: 110.0) [2024-03-21 00:14:45,522][03784] Avg episode reward: [(0, '0.761')] [2024-03-21 00:14:50,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 252379136. Throughput: 0: 45817.8. Samples: 253717400. Policy #0 lag: (min: 0.0, avg: 32.3, max: 77.0) [2024-03-21 00:14:50,522][03784] Avg episode reward: [(0, '0.937')] [2024-03-21 00:14:50,756][04017] Updated weights for policy 0, policy_version 7703 (0.0011) [2024-03-21 00:14:55,521][03784] Fps is (10 sec: 52430.2, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 252641280. Throughput: 0: 45551.3. Samples: 253984400. Policy #0 lag: (min: 0.0, avg: 32.3, max: 77.0) [2024-03-21 00:14:55,522][03784] Avg episode reward: [(0, '0.522')] [2024-03-21 00:14:56,873][04017] Updated weights for policy 0, policy_version 7713 (0.0019) [2024-03-21 00:15:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 252870656. Throughput: 0: 45777.8. Samples: 254113400. Policy #0 lag: (min: 0.0, avg: 35.3, max: 114.0) [2024-03-21 00:15:00,522][03784] Avg episode reward: [(0, '0.522')] [2024-03-21 00:15:02,560][04017] Updated weights for policy 0, policy_version 7723 (0.0019) [2024-03-21 00:15:05,521][03784] Fps is (10 sec: 62258.6, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 253263872. Throughput: 0: 46484.4. Samples: 254401100. 
Policy #0 lag: (min: 5.0, avg: 39.4, max: 68.0) [2024-03-21 00:15:05,522][03784] Avg episode reward: [(0, '0.389')] [2024-03-21 00:15:09,386][04017] Updated weights for policy 0, policy_version 7733 (0.0011) [2024-03-21 00:15:10,521][03784] Fps is (10 sec: 55705.3, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 253427712. Throughput: 0: 46591.0. Samples: 254687600. Policy #0 lag: (min: 5.0, avg: 39.4, max: 68.0) [2024-03-21 00:15:10,522][03784] Avg episode reward: [(0, '0.656')] [2024-03-21 00:15:15,521][03784] Fps is (10 sec: 26214.6, 60 sec: 40960.0, 300 sec: 45764.1). Total num frames: 253526016. Throughput: 0: 46424.5. Samples: 254821900. Policy #0 lag: (min: 0.0, avg: 40.5, max: 93.0) [2024-03-21 00:15:15,530][03784] Avg episode reward: [(0, '0.494')] [2024-03-21 00:15:17,936][04017] Updated weights for policy 0, policy_version 7743 (0.0015) [2024-03-21 00:15:20,521][03784] Fps is (10 sec: 55706.3, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 253984768. Throughput: 0: 46233.3. Samples: 255076800. Policy #0 lag: (min: 3.0, avg: 42.2, max: 81.0) [2024-03-21 00:15:20,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 00:15:25,521][03784] Fps is (10 sec: 45874.8, 60 sec: 44236.7, 300 sec: 45875.2). Total num frames: 253984768. Throughput: 0: 46928.8. Samples: 255385400. Policy #0 lag: (min: 3.0, avg: 42.2, max: 81.0) [2024-03-21 00:15:25,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 00:15:26,935][04017] Updated weights for policy 0, policy_version 7754 (0.0014) [2024-03-21 00:15:28,534][03995] Signal inference workers to stop experience collection... (5150 times) [2024-03-21 00:15:28,535][03995] Signal inference workers to resume experience collection... 
(5150 times) [2024-03-21 00:15:28,602][04017] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-03-21 00:15:28,602][04017] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-03-21 00:15:30,521][03784] Fps is (10 sec: 19660.7, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 254181376. Throughput: 0: 46689.1. Samples: 255525200. Policy #0 lag: (min: 1.0, avg: 25.1, max: 56.0) [2024-03-21 00:15:30,522][03784] Avg episode reward: [(0, '0.778')] [2024-03-21 00:15:34,240][04017] Updated weights for policy 0, policy_version 7764 (0.0013) [2024-03-21 00:15:35,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 254443520. Throughput: 0: 46104.5. Samples: 255792100. Policy #0 lag: (min: 0.0, avg: 37.3, max: 89.0) [2024-03-21 00:15:35,522][03784] Avg episode reward: [(0, '0.672')] [2024-03-21 00:15:38,283][04017] Updated weights for policy 0, policy_version 7774 (0.0012) [2024-03-21 00:15:40,521][03784] Fps is (10 sec: 62259.0, 60 sec: 47513.6, 300 sec: 45097.6). Total num frames: 254803968. Throughput: 0: 46264.3. Samples: 256066300. Policy #0 lag: (min: 0.0, avg: 37.3, max: 89.0) [2024-03-21 00:15:40,522][03784] Avg episode reward: [(0, '0.362')] [2024-03-21 00:15:44,908][04017] Updated weights for policy 0, policy_version 7784 (0.0010) [2024-03-21 00:15:45,521][03784] Fps is (10 sec: 65536.0, 60 sec: 49698.3, 300 sec: 45208.7). Total num frames: 255098880. Throughput: 0: 46566.7. Samples: 256208900. Policy #0 lag: (min: 0.0, avg: 32.9, max: 71.0) [2024-03-21 00:15:45,522][03784] Avg episode reward: [(0, '0.924')] [2024-03-21 00:15:50,521][03784] Fps is (10 sec: 49152.8, 60 sec: 48606.0, 300 sec: 45542.0). Total num frames: 255295488. Throughput: 0: 46606.8. Samples: 256498400. 
Policy #0 lag: (min: 0.0, avg: 38.8, max: 75.0) [2024-03-21 00:15:50,521][03784] Avg episode reward: [(0, '1.037')] [2024-03-21 00:15:52,050][04017] Updated weights for policy 0, policy_version 7794 (0.0014) [2024-03-21 00:15:55,521][03784] Fps is (10 sec: 49151.7, 60 sec: 49151.9, 300 sec: 45653.0). Total num frames: 255590400. Throughput: 0: 46248.9. Samples: 256768800. Policy #0 lag: (min: 0.0, avg: 38.8, max: 75.0) [2024-03-21 00:15:55,522][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 00:15:59,291][04017] Updated weights for policy 0, policy_version 7804 (0.0016) [2024-03-21 00:16:00,521][03784] Fps is (10 sec: 49151.3, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 255787008. Throughput: 0: 46546.6. Samples: 256916500. Policy #0 lag: (min: 0.0, avg: 47.6, max: 97.0) [2024-03-21 00:16:00,522][03784] Avg episode reward: [(0, '0.871')] [2024-03-21 00:16:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007806_255787008.pth... [2024-03-21 00:16:00,648][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007471_244809728.pth [2024-03-21 00:16:05,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 255918080. Throughput: 0: 47277.7. Samples: 257204300. Policy #0 lag: (min: 0.0, avg: 40.9, max: 91.0) [2024-03-21 00:16:05,522][03784] Avg episode reward: [(0, '0.757')] [2024-03-21 00:16:08,176][04017] Updated weights for policy 0, policy_version 7814 (0.0015) [2024-03-21 00:16:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.2, 300 sec: 45653.0). Total num frames: 256147456. Throughput: 0: 46715.7. Samples: 257487600. Policy #0 lag: (min: 0.0, avg: 40.9, max: 91.0) [2024-03-21 00:16:10,522][03784] Avg episode reward: [(0, '1.247')] [2024-03-21 00:16:10,533][03995] Saving new best policy, reward=1.247! [2024-03-21 00:16:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 256344064. 
Throughput: 0: 46760.0. Samples: 257629400. Policy #0 lag: (min: 0.0, avg: 42.0, max: 88.0) [2024-03-21 00:16:15,522][03784] Avg episode reward: [(0, '0.558')] [2024-03-21 00:16:15,898][04017] Updated weights for policy 0, policy_version 7824 (0.0011) [2024-03-21 00:16:20,521][03784] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 256606208. Throughput: 0: 46562.1. Samples: 257887400. Policy #0 lag: (min: 0.0, avg: 28.8, max: 71.0) [2024-03-21 00:16:20,522][03784] Avg episode reward: [(0, '0.899')] [2024-03-21 00:16:21,925][03995] Signal inference workers to stop experience collection... (5200 times) [2024-03-21 00:16:21,972][04017] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-03-21 00:16:22,195][03995] Signal inference workers to resume experience collection... (5200 times) [2024-03-21 00:16:22,195][04017] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-03-21 00:16:22,804][04017] Updated weights for policy 0, policy_version 7834 (0.0010) [2024-03-21 00:16:25,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 256868352. Throughput: 0: 46404.5. Samples: 258154500. Policy #0 lag: (min: 0.0, avg: 28.8, max: 71.0) [2024-03-21 00:16:25,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 00:16:28,258][04017] Updated weights for policy 0, policy_version 7844 (0.0023) [2024-03-21 00:16:30,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 45542.0). Total num frames: 257097728. Throughput: 0: 46440.0. Samples: 258298700. Policy #0 lag: (min: 0.0, avg: 42.5, max: 99.0) [2024-03-21 00:16:30,522][03784] Avg episode reward: [(0, '0.553')] [2024-03-21 00:16:35,058][04017] Updated weights for policy 0, policy_version 7854 (0.0017) [2024-03-21 00:16:35,521][03784] Fps is (10 sec: 52429.2, 60 sec: 49152.0, 300 sec: 44986.6). Total num frames: 257392640. Throughput: 0: 46093.3. Samples: 258572600. 
Policy #0 lag: (min: 0.0, avg: 43.3, max: 92.0) [2024-03-21 00:16:35,522][03784] Avg episode reward: [(0, '0.553')] [2024-03-21 00:16:40,521][03784] Fps is (10 sec: 52428.5, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 257622016. Throughput: 0: 45780.0. Samples: 258828900. Policy #0 lag: (min: 0.0, avg: 43.3, max: 92.0) [2024-03-21 00:16:40,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 00:16:41,087][04017] Updated weights for policy 0, policy_version 7864 (0.0012) [2024-03-21 00:16:45,271][04017] Updated weights for policy 0, policy_version 7874 (0.0015) [2024-03-21 00:16:45,521][03784] Fps is (10 sec: 62258.8, 60 sec: 48605.8, 300 sec: 45764.1). Total num frames: 258015232. Throughput: 0: 45635.6. Samples: 258970100. Policy #0 lag: (min: 1.0, avg: 39.8, max: 76.0) [2024-03-21 00:16:45,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 00:16:50,521][03784] Fps is (10 sec: 55706.1, 60 sec: 48059.7, 300 sec: 45764.2). Total num frames: 258179072. Throughput: 0: 45311.1. Samples: 259243300. Policy #0 lag: (min: 0.0, avg: 44.6, max: 100.0) [2024-03-21 00:16:50,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 00:16:54,568][04017] Updated weights for policy 0, policy_version 7884 (0.0010) [2024-03-21 00:16:55,521][03784] Fps is (10 sec: 32768.2, 60 sec: 45875.3, 300 sec: 45875.2). Total num frames: 258342912. Throughput: 0: 45593.4. Samples: 259539300. Policy #0 lag: (min: 0.0, avg: 44.6, max: 100.0) [2024-03-21 00:16:55,522][03784] Avg episode reward: [(0, '0.463')] [2024-03-21 00:17:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 258506752. Throughput: 0: 45677.8. Samples: 259684900. Policy #0 lag: (min: 0.0, avg: 32.6, max: 69.0) [2024-03-21 00:17:00,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 00:17:05,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 258637824. Throughput: 0: 45780.0. Samples: 259947500. 
Policy #0 lag: (min: 0.0, avg: 21.7, max: 64.0) [2024-03-21 00:17:05,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 00:17:06,428][04017] Updated weights for policy 0, policy_version 7894 (0.0011) [2024-03-21 00:17:10,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 258899968. Throughput: 0: 45828.9. Samples: 260216800. Policy #0 lag: (min: 0.0, avg: 21.7, max: 64.0) [2024-03-21 00:17:10,522][03784] Avg episode reward: [(0, '1.078')] [2024-03-21 00:17:14,437][04017] Updated weights for policy 0, policy_version 7904 (0.0023) [2024-03-21 00:17:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 258998272. Throughput: 0: 45800.0. Samples: 260359700. Policy #0 lag: (min: 0.0, avg: 32.1, max: 73.0) [2024-03-21 00:17:15,522][03784] Avg episode reward: [(0, '0.726')] [2024-03-21 00:17:18,729][03995] Signal inference workers to stop experience collection... (5250 times) [2024-03-21 00:17:18,783][04017] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-03-21 00:17:18,977][03995] Signal inference workers to resume experience collection... (5250 times) [2024-03-21 00:17:18,978][04017] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-03-21 00:17:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 259260416. Throughput: 0: 45653.2. Samples: 260627000. Policy #0 lag: (min: 3.0, avg: 39.9, max: 93.0) [2024-03-21 00:17:20,522][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 00:17:22,023][04017] Updated weights for policy 0, policy_version 7914 (0.0012) [2024-03-21 00:17:25,521][03784] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 44653.3). Total num frames: 259457024. Throughput: 0: 46220.1. Samples: 260908800. 
Policy #0 lag: (min: 3.0, avg: 39.9, max: 93.0) [2024-03-21 00:17:25,522][03784] Avg episode reward: [(0, '0.659')] [2024-03-21 00:17:28,298][04017] Updated weights for policy 0, policy_version 7924 (0.0015) [2024-03-21 00:17:30,521][03784] Fps is (10 sec: 55706.2, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 259817472. Throughput: 0: 45780.1. Samples: 261030200. Policy #0 lag: (min: 0.0, avg: 40.3, max: 109.0) [2024-03-21 00:17:30,522][03784] Avg episode reward: [(0, '0.353')] [2024-03-21 00:17:32,462][04017] Updated weights for policy 0, policy_version 7934 (0.0012) [2024-03-21 00:17:35,521][03784] Fps is (10 sec: 72088.9, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 260177920. Throughput: 0: 45135.5. Samples: 261274400. Policy #0 lag: (min: 0.0, avg: 48.4, max: 113.0) [2024-03-21 00:17:35,522][03784] Avg episode reward: [(0, '0.556')] [2024-03-21 00:17:36,733][04017] Updated weights for policy 0, policy_version 7944 (0.0016) [2024-03-21 00:17:40,521][03784] Fps is (10 sec: 68812.7, 60 sec: 48059.8, 300 sec: 46097.4). Total num frames: 260505600. Throughput: 0: 44273.3. Samples: 261531600. Policy #0 lag: (min: 0.0, avg: 48.4, max: 113.0) [2024-03-21 00:17:40,522][03784] Avg episode reward: [(0, '1.091')] [2024-03-21 00:17:43,404][04017] Updated weights for policy 0, policy_version 7954 (0.0016) [2024-03-21 00:17:45,521][03784] Fps is (10 sec: 55706.1, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 260734976. Throughput: 0: 43948.9. Samples: 261662600. Policy #0 lag: (min: 0.0, avg: 47.8, max: 87.0) [2024-03-21 00:17:45,522][03784] Avg episode reward: [(0, '0.760')] [2024-03-21 00:17:50,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 260898816. Throughput: 0: 43866.7. Samples: 261921500. 
Policy #0 lag: (min: 0.0, avg: 47.9, max: 111.0) [2024-03-21 00:17:50,522][03784] Avg episode reward: [(0, '0.308')] [2024-03-21 00:17:53,099][04017] Updated weights for policy 0, policy_version 7964 (0.0019) [2024-03-21 00:17:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 261095424. Throughput: 0: 43522.3. Samples: 262175300. Policy #0 lag: (min: 0.0, avg: 47.9, max: 111.0) [2024-03-21 00:17:55,522][03784] Avg episode reward: [(0, '0.905')] [2024-03-21 00:18:00,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 261226496. Throughput: 0: 43388.8. Samples: 262312200. Policy #0 lag: (min: 0.0, avg: 39.1, max: 79.0) [2024-03-21 00:18:00,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 00:18:00,594][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007973_261259264.pth... [2024-03-21 00:18:00,696][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007637_250249216.pth [2024-03-21 00:18:01,706][04017] Updated weights for policy 0, policy_version 7974 (0.0015) [2024-03-21 00:18:05,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 261423104. Throughput: 0: 42964.5. Samples: 262560400. Policy #0 lag: (min: 0.0, avg: 39.1, max: 79.0) [2024-03-21 00:18:05,522][03784] Avg episode reward: [(0, '0.533')] [2024-03-21 00:18:10,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 261554176. Throughput: 0: 42746.6. Samples: 262832400. Policy #0 lag: (min: 0.0, avg: 45.5, max: 113.0) [2024-03-21 00:18:10,522][03784] Avg episode reward: [(0, '0.505')] [2024-03-21 00:18:11,861][04017] Updated weights for policy 0, policy_version 7984 (0.0015) [2024-03-21 00:18:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 261718016. Throughput: 0: 43433.3. Samples: 262984700. 
Policy #0 lag: (min: 0.0, avg: 34.6, max: 103.0) [2024-03-21 00:18:15,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 00:18:20,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42598.4, 300 sec: 44542.3). Total num frames: 261816320. Throughput: 0: 44351.2. Samples: 263270200. Policy #0 lag: (min: 0.0, avg: 34.6, max: 103.0) [2024-03-21 00:18:20,522][03784] Avg episode reward: [(0, '1.010')] [2024-03-21 00:18:22,301][03995] Signal inference workers to stop experience collection... (5300 times) [2024-03-21 00:18:22,363][04017] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-03-21 00:18:22,571][03995] Signal inference workers to resume experience collection... (5300 times) [2024-03-21 00:18:22,572][04017] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-03-21 00:18:22,574][04017] Updated weights for policy 0, policy_version 7994 (0.0015) [2024-03-21 00:18:25,521][03784] Fps is (10 sec: 32768.0, 60 sec: 43144.5, 300 sec: 44653.3). Total num frames: 262045696. Throughput: 0: 44931.1. Samples: 263553500. Policy #0 lag: (min: 0.0, avg: 28.3, max: 76.0) [2024-03-21 00:18:25,523][03784] Avg episode reward: [(0, '0.700')] [2024-03-21 00:18:27,739][04017] Updated weights for policy 0, policy_version 8004 (0.0018) [2024-03-21 00:18:30,521][03784] Fps is (10 sec: 68812.7, 60 sec: 44782.9, 300 sec: 46097.3). Total num frames: 262504448. Throughput: 0: 44728.8. Samples: 263675400. Policy #0 lag: (min: 6.0, avg: 44.2, max: 102.0) [2024-03-21 00:18:30,522][03784] Avg episode reward: [(0, '1.005')] [2024-03-21 00:18:32,276][04017] Updated weights for policy 0, policy_version 8014 (0.0015) [2024-03-21 00:18:35,521][03784] Fps is (10 sec: 81918.6, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 262864896. Throughput: 0: 44782.0. Samples: 263936700. 
Policy #0 lag: (min: 6.0, avg: 44.2, max: 102.0) [2024-03-21 00:18:35,522][03784] Avg episode reward: [(0, '1.005')] [2024-03-21 00:18:37,581][04017] Updated weights for policy 0, policy_version 8024 (0.0011) [2024-03-21 00:18:40,521][03784] Fps is (10 sec: 55705.9, 60 sec: 42598.4, 300 sec: 46430.6). Total num frames: 263061504. Throughput: 0: 44820.0. Samples: 264192200. Policy #0 lag: (min: 2.0, avg: 37.7, max: 72.0) [2024-03-21 00:18:40,522][03784] Avg episode reward: [(0, '0.393')] [2024-03-21 00:18:44,520][04017] Updated weights for policy 0, policy_version 8034 (0.0015) [2024-03-21 00:18:45,521][03784] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 46208.4). Total num frames: 263323648. Throughput: 0: 44788.9. Samples: 264327700. Policy #0 lag: (min: 2.0, avg: 31.4, max: 55.0) [2024-03-21 00:18:45,522][03784] Avg episode reward: [(0, '0.542')] [2024-03-21 00:18:50,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 46208.4). Total num frames: 263520256. Throughput: 0: 45266.7. Samples: 264597400. Policy #0 lag: (min: 0.0, avg: 38.8, max: 75.0) [2024-03-21 00:18:50,522][03784] Avg episode reward: [(0, '0.655')] [2024-03-21 00:18:52,414][04017] Updated weights for policy 0, policy_version 8044 (0.0010) [2024-03-21 00:18:55,521][03784] Fps is (10 sec: 32767.7, 60 sec: 42598.3, 300 sec: 45542.0). Total num frames: 263651328. Throughput: 0: 45097.7. Samples: 264861800. Policy #0 lag: (min: 0.0, avg: 38.8, max: 75.0) [2024-03-21 00:18:55,522][03784] Avg episode reward: [(0, '0.966')] [2024-03-21 00:19:00,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43144.6, 300 sec: 44764.4). Total num frames: 263815168. Throughput: 0: 44837.8. Samples: 265002400. Policy #0 lag: (min: 1.0, avg: 28.2, max: 63.0) [2024-03-21 00:19:00,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 00:19:01,205][04017] Updated weights for policy 0, policy_version 8054 (0.0010) [2024-03-21 00:19:05,521][03784] Fps is (10 sec: 32768.2, 60 sec: 42598.3, 300 sec: 44653.4). 
Total num frames: 263979008. Throughput: 0: 44717.7. Samples: 265282500. Policy #0 lag: (min: 1.0, avg: 28.2, max: 63.0) [2024-03-21 00:19:05,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 00:19:10,521][03784] Fps is (10 sec: 39321.3, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 264208384. Throughput: 0: 44579.9. Samples: 265559600. Policy #0 lag: (min: 0.0, avg: 44.4, max: 119.0) [2024-03-21 00:19:10,522][03784] Avg episode reward: [(0, '0.567')] [2024-03-21 00:19:10,628][04017] Updated weights for policy 0, policy_version 8064 (0.0011) [2024-03-21 00:19:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 264339456. Throughput: 0: 44926.6. Samples: 265697100. Policy #0 lag: (min: 0.0, avg: 44.4, max: 119.0) [2024-03-21 00:19:15,522][03784] Avg episode reward: [(0, '0.567')] [2024-03-21 00:19:18,746][03995] Signal inference workers to stop experience collection... (5350 times) [2024-03-21 00:19:18,747][03995] Signal inference workers to resume experience collection... (5350 times) [2024-03-21 00:19:18,789][04017] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-03-21 00:19:18,789][04017] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-03-21 00:19:19,464][04017] Updated weights for policy 0, policy_version 8074 (0.0016) [2024-03-21 00:19:20,521][03784] Fps is (10 sec: 45875.5, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 264667136. Throughput: 0: 45109.1. Samples: 265966600. Policy #0 lag: (min: 0.0, avg: 36.0, max: 85.0) [2024-03-21 00:19:20,522][03784] Avg episode reward: [(0, '1.126')] [2024-03-21 00:19:23,335][04017] Updated weights for policy 0, policy_version 8084 (0.0022) [2024-03-21 00:19:25,521][03784] Fps is (10 sec: 68813.5, 60 sec: 49698.2, 300 sec: 45875.2). Total num frames: 265027584. Throughput: 0: 45264.5. Samples: 266229100. 
Policy #0 lag: (min: 0.0, avg: 40.1, max: 88.0) [2024-03-21 00:19:25,522][03784] Avg episode reward: [(0, '1.436')] [2024-03-21 00:19:25,579][03995] Saving new best policy, reward=1.436! [2024-03-21 00:19:28,086][04017] Updated weights for policy 0, policy_version 8094 (0.0016) [2024-03-21 00:19:30,521][03784] Fps is (10 sec: 62259.6, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 265289728. Throughput: 0: 45375.7. Samples: 266369600. Policy #0 lag: (min: 0.0, avg: 40.1, max: 88.0) [2024-03-21 00:19:30,521][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 00:19:33,448][04017] Updated weights for policy 0, policy_version 8104 (0.0016) [2024-03-21 00:19:35,521][03784] Fps is (10 sec: 62258.5, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 265650176. Throughput: 0: 45431.0. Samples: 266641800. Policy #0 lag: (min: 1.0, avg: 49.8, max: 102.0) [2024-03-21 00:19:35,522][03784] Avg episode reward: [(0, '0.889')] [2024-03-21 00:19:40,521][03784] Fps is (10 sec: 45874.5, 60 sec: 44782.9, 300 sec: 46208.5). Total num frames: 265748480. Throughput: 0: 45824.5. Samples: 266923900. Policy #0 lag: (min: 0.0, avg: 46.2, max: 118.0) [2024-03-21 00:19:40,522][03784] Avg episode reward: [(0, '0.394')] [2024-03-21 00:19:45,521][03784] Fps is (10 sec: 16384.1, 60 sec: 41506.2, 300 sec: 45542.0). Total num frames: 265814016. Throughput: 0: 46162.2. Samples: 267079700. Policy #0 lag: (min: 0.0, avg: 46.2, max: 118.0) [2024-03-21 00:19:45,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 00:19:46,733][04017] Updated weights for policy 0, policy_version 8114 (0.0016) [2024-03-21 00:19:50,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43144.5, 300 sec: 45653.0). Total num frames: 266108928. Throughput: 0: 46097.8. Samples: 267356900. 
Policy #0 lag: (min: 2.0, avg: 43.1, max: 116.0) [2024-03-21 00:19:50,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-21 00:19:53,216][04017] Updated weights for policy 0, policy_version 8124 (0.0012) [2024-03-21 00:19:55,521][03784] Fps is (10 sec: 52429.0, 60 sec: 44783.1, 300 sec: 45653.1). Total num frames: 266338304. Throughput: 0: 46595.7. Samples: 267656400. Policy #0 lag: (min: 4.0, avg: 39.8, max: 80.0) [2024-03-21 00:19:55,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-21 00:20:00,521][03784] Fps is (10 sec: 36045.1, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 266469376. Throughput: 0: 47044.5. Samples: 267814100. Policy #0 lag: (min: 4.0, avg: 39.8, max: 80.0) [2024-03-21 00:20:00,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 00:20:00,532][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008132_266469376.pth... [2024-03-21 00:20:00,648][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007806_255787008.pth [2024-03-21 00:20:02,146][04017] Updated weights for policy 0, policy_version 8134 (0.0015) [2024-03-21 00:20:05,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 266633216. Throughput: 0: 47695.5. Samples: 268112900. Policy #0 lag: (min: 0.0, avg: 30.9, max: 72.0) [2024-03-21 00:20:05,522][03784] Avg episode reward: [(0, '1.189')] [2024-03-21 00:20:08,975][04017] Updated weights for policy 0, policy_version 8144 (0.0020) [2024-03-21 00:20:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 266895360. Throughput: 0: 48131.1. Samples: 268395000. Policy #0 lag: (min: 0.0, avg: 27.2, max: 65.0) [2024-03-21 00:20:10,522][03784] Avg episode reward: [(0, '0.620')] [2024-03-21 00:20:12,049][03995] Signal inference workers to stop experience collection... 
(5400 times) [2024-03-21 00:20:12,104][04017] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-03-21 00:20:12,118][03995] Signal inference workers to resume experience collection... (5400 times) [2024-03-21 00:20:12,155][04017] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-03-21 00:20:13,701][04017] Updated weights for policy 0, policy_version 8154 (0.0019) [2024-03-21 00:20:15,521][03784] Fps is (10 sec: 68812.3, 60 sec: 49698.1, 300 sec: 45208.7). Total num frames: 267321344. Throughput: 0: 48175.4. Samples: 268537500. Policy #0 lag: (min: 0.0, avg: 27.2, max: 65.0) [2024-03-21 00:20:15,522][03784] Avg episode reward: [(0, '0.620')] [2024-03-21 00:20:17,241][04017] Updated weights for policy 0, policy_version 8164 (0.0011) [2024-03-21 00:20:20,521][03784] Fps is (10 sec: 85194.8, 60 sec: 51336.3, 300 sec: 46652.7). Total num frames: 267747328. Throughput: 0: 47879.9. Samples: 268796400. Policy #0 lag: (min: 2.0, avg: 41.0, max: 84.0) [2024-03-21 00:20:20,523][03784] Avg episode reward: [(0, '0.713')] [2024-03-21 00:20:21,581][04017] Updated weights for policy 0, policy_version 8174 (0.0012) [2024-03-21 00:20:25,521][03784] Fps is (10 sec: 65536.8, 60 sec: 49152.0, 300 sec: 46763.8). Total num frames: 267976704. Throughput: 0: 47695.7. Samples: 269070200. Policy #0 lag: (min: 2.0, avg: 41.0, max: 84.0) [2024-03-21 00:20:25,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 00:20:30,521][03784] Fps is (10 sec: 32768.6, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 268075008. Throughput: 0: 47553.3. Samples: 269219600. Policy #0 lag: (min: 0.0, avg: 47.3, max: 89.0) [2024-03-21 00:20:30,522][03784] Avg episode reward: [(0, '0.494')] [2024-03-21 00:20:33,418][04017] Updated weights for policy 0, policy_version 8184 (0.0012) [2024-03-21 00:20:35,521][03784] Fps is (10 sec: 26214.2, 60 sec: 43144.5, 300 sec: 45542.0). Total num frames: 268238848. Throughput: 0: 47773.3. Samples: 269506700. 
Policy #0 lag: (min: 1.0, avg: 44.5, max: 85.0) [2024-03-21 00:20:35,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 00:20:40,464][04017] Updated weights for policy 0, policy_version 8194 (0.0010) [2024-03-21 00:20:40,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 268500992. Throughput: 0: 47002.1. Samples: 269771500. Policy #0 lag: (min: 1.0, avg: 44.5, max: 85.0) [2024-03-21 00:20:40,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 00:20:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 45541.9). Total num frames: 268730368. Throughput: 0: 46593.3. Samples: 269910800. Policy #0 lag: (min: 1.0, avg: 41.0, max: 79.0) [2024-03-21 00:20:45,522][03784] Avg episode reward: [(0, '0.804')] [2024-03-21 00:20:48,001][04017] Updated weights for policy 0, policy_version 8204 (0.0018) [2024-03-21 00:20:50,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 268926976. Throughput: 0: 45775.4. Samples: 270172800. Policy #0 lag: (min: 0.0, avg: 32.5, max: 93.0) [2024-03-21 00:20:50,522][03784] Avg episode reward: [(0, '0.673')] [2024-03-21 00:20:54,532][04017] Updated weights for policy 0, policy_version 8214 (0.0012) [2024-03-21 00:20:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 269189120. Throughput: 0: 45673.3. Samples: 270450300. Policy #0 lag: (min: 0.0, avg: 32.5, max: 93.0) [2024-03-21 00:20:55,522][03784] Avg episode reward: [(0, '0.828')] [2024-03-21 00:21:00,521][03784] Fps is (10 sec: 49152.2, 60 sec: 49151.9, 300 sec: 45764.1). Total num frames: 269418496. Throughput: 0: 45511.1. Samples: 270585500. Policy #0 lag: (min: 0.0, avg: 39.6, max: 84.0) [2024-03-21 00:21:00,522][03784] Avg episode reward: [(0, '1.187')] [2024-03-21 00:21:00,620][03995] Signal inference workers to stop experience collection... 
(5450 times) [2024-03-21 00:21:00,675][04017] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-03-21 00:21:00,684][03995] Signal inference workers to resume experience collection... (5450 times) [2024-03-21 00:21:00,725][04017] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-03-21 00:21:01,048][04017] Updated weights for policy 0, policy_version 8224 (0.0010) [2024-03-21 00:21:05,521][03784] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 45875.2). Total num frames: 269680640. Throughput: 0: 46164.7. Samples: 270873800. Policy #0 lag: (min: 1.0, avg: 38.1, max: 81.0) [2024-03-21 00:21:05,522][03784] Avg episode reward: [(0, '0.639')] [2024-03-21 00:21:08,665][04017] Updated weights for policy 0, policy_version 8234 (0.0014) [2024-03-21 00:21:10,521][03784] Fps is (10 sec: 45875.7, 60 sec: 49698.1, 300 sec: 45875.2). Total num frames: 269877248. Throughput: 0: 46417.8. Samples: 271159000. Policy #0 lag: (min: 1.0, avg: 44.7, max: 100.0) [2024-03-21 00:21:10,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 00:21:14,635][04017] Updated weights for policy 0, policy_version 8244 (0.0012) [2024-03-21 00:21:15,521][03784] Fps is (10 sec: 49151.4, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 270172160. Throughput: 0: 46231.0. Samples: 271300000. Policy #0 lag: (min: 1.0, avg: 44.7, max: 100.0) [2024-03-21 00:21:15,522][03784] Avg episode reward: [(0, '0.392')] [2024-03-21 00:21:20,521][03784] Fps is (10 sec: 52428.6, 60 sec: 44236.9, 300 sec: 45875.2). Total num frames: 270401536. Throughput: 0: 45440.0. Samples: 271551500. Policy #0 lag: (min: 0.0, avg: 44.8, max: 102.0) [2024-03-21 00:21:20,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 00:21:20,985][04017] Updated weights for policy 0, policy_version 8254 (0.0012) [2024-03-21 00:21:25,521][03784] Fps is (10 sec: 49152.7, 60 sec: 44783.0, 300 sec: 45986.3). Total num frames: 270663680. Throughput: 0: 45593.4. Samples: 271823200. 
Policy #0 lag: (min: 0.0, avg: 44.8, max: 102.0) [2024-03-21 00:21:25,522][03784] Avg episode reward: [(0, '0.591')] [2024-03-21 00:21:27,348][04017] Updated weights for policy 0, policy_version 8264 (0.0011) [2024-03-21 00:21:30,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 270893056. Throughput: 0: 45528.9. Samples: 271959600. Policy #0 lag: (min: 0.0, avg: 35.0, max: 75.0) [2024-03-21 00:21:30,522][03784] Avg episode reward: [(0, '1.177')] [2024-03-21 00:21:35,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.4, 300 sec: 45430.9). Total num frames: 271024128. Throughput: 0: 45797.9. Samples: 272233700. Policy #0 lag: (min: 1.0, avg: 38.0, max: 76.0) [2024-03-21 00:21:35,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 00:21:38,520][04017] Updated weights for policy 0, policy_version 8274 (0.0019) [2024-03-21 00:21:40,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 271220736. Throughput: 0: 45982.3. Samples: 272519500. Policy #0 lag: (min: 1.0, avg: 27.8, max: 68.0) [2024-03-21 00:21:40,522][03784] Avg episode reward: [(0, '0.578')] [2024-03-21 00:21:45,521][03784] Fps is (10 sec: 32767.8, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 271351808. Throughput: 0: 46120.0. Samples: 272660900. Policy #0 lag: (min: 1.0, avg: 27.8, max: 68.0) [2024-03-21 00:21:45,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 00:21:49,241][04017] Updated weights for policy 0, policy_version 8284 (0.0015) [2024-03-21 00:21:50,521][03784] Fps is (10 sec: 32767.8, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 271548416. Throughput: 0: 45902.1. Samples: 272939400. Policy #0 lag: (min: 0.0, avg: 29.6, max: 86.0) [2024-03-21 00:21:50,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 00:21:52,459][04017] Updated weights for policy 0, policy_version 8294 (0.0038) [2024-03-21 00:21:55,521][03784] Fps is (10 sec: 65536.8, 60 sec: 46967.5, 300 sec: 45764.1). 
Total num frames: 272007168. Throughput: 0: 44517.8. Samples: 273162300. Policy #0 lag: (min: 0.0, avg: 29.6, max: 86.0) [2024-03-21 00:21:55,522][03784] Avg episode reward: [(0, '0.642')] [2024-03-21 00:22:00,521][03784] Fps is (10 sec: 52428.6, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 272072704. Throughput: 0: 44782.2. Samples: 273315200. Policy #0 lag: (min: 0.0, avg: 35.0, max: 80.0) [2024-03-21 00:22:00,522][03784] Avg episode reward: [(0, '0.559')] [2024-03-21 00:22:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008303_272072704.pth... [2024-03-21 00:22:00,705][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000007973_261259264.pth [2024-03-21 00:22:00,837][03995] Signal inference workers to stop experience collection... (5500 times) [2024-03-21 00:22:00,889][04017] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-03-21 00:22:01,159][03995] Signal inference workers to resume experience collection... (5500 times) [2024-03-21 00:22:01,159][04017] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-03-21 00:22:01,161][04017] Updated weights for policy 0, policy_version 8304 (0.0015) [2024-03-21 00:22:05,521][03784] Fps is (10 sec: 26214.2, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 272269312. Throughput: 0: 45366.7. Samples: 273593000. Policy #0 lag: (min: 0.0, avg: 35.0, max: 80.0) [2024-03-21 00:22:05,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 00:22:07,998][04017] Updated weights for policy 0, policy_version 8314 (0.0015) [2024-03-21 00:22:10,521][03784] Fps is (10 sec: 49152.6, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 272564224. Throughput: 0: 45224.4. Samples: 273858300. 
Policy #0 lag: (min: 0.0, avg: 33.1, max: 68.0) [2024-03-21 00:22:10,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-21 00:22:13,502][04017] Updated weights for policy 0, policy_version 8324 (0.0011) [2024-03-21 00:22:15,521][03784] Fps is (10 sec: 65536.3, 60 sec: 45875.3, 300 sec: 46319.5). Total num frames: 272924672. Throughput: 0: 45095.6. Samples: 273988900. Policy #0 lag: (min: 2.0, avg: 29.9, max: 74.0) [2024-03-21 00:22:15,522][03784] Avg episode reward: [(0, '0.962')] [2024-03-21 00:22:16,802][04017] Updated weights for policy 0, policy_version 8334 (0.0012) [2024-03-21 00:22:20,521][03784] Fps is (10 sec: 72089.4, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 273285120. Throughput: 0: 44420.0. Samples: 274232600. Policy #0 lag: (min: 2.0, avg: 29.9, max: 74.0) [2024-03-21 00:22:20,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 00:22:24,853][04017] Updated weights for policy 0, policy_version 8344 (0.0011) [2024-03-21 00:22:25,521][03784] Fps is (10 sec: 52428.6, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 273448960. Throughput: 0: 44157.7. Samples: 274506600. Policy #0 lag: (min: 0.0, avg: 44.1, max: 80.0) [2024-03-21 00:22:25,522][03784] Avg episode reward: [(0, '0.535')] [2024-03-21 00:22:30,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 273645568. Throughput: 0: 44100.0. Samples: 274645400. Policy #0 lag: (min: 0.0, avg: 44.1, max: 80.0) [2024-03-21 00:22:30,522][03784] Avg episode reward: [(0, '1.530')] [2024-03-21 00:22:30,533][03995] Saving new best policy, reward=1.530! [2024-03-21 00:22:34,316][04017] Updated weights for policy 0, policy_version 8354 (0.0023) [2024-03-21 00:22:35,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.3, 300 sec: 45097.6). Total num frames: 273809408. Throughput: 0: 44220.0. Samples: 274929300. 
Policy #0 lag: (min: 0.0, avg: 34.5, max: 72.0) [2024-03-21 00:22:35,522][03784] Avg episode reward: [(0, '0.943')] [2024-03-21 00:22:40,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 273940480. Throughput: 0: 45059.8. Samples: 275190000. Policy #0 lag: (min: 0.0, avg: 34.5, max: 72.0) [2024-03-21 00:22:40,522][03784] Avg episode reward: [(0, '0.920')] [2024-03-21 00:22:43,570][04017] Updated weights for policy 0, policy_version 8364 (0.0012) [2024-03-21 00:22:45,521][03784] Fps is (10 sec: 29491.4, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 274104320. Throughput: 0: 44620.1. Samples: 275323100. Policy #0 lag: (min: 0.0, avg: 44.8, max: 109.0) [2024-03-21 00:22:45,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 00:22:50,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 274300928. Throughput: 0: 44342.2. Samples: 275588400. Policy #0 lag: (min: 0.0, avg: 33.5, max: 79.0) [2024-03-21 00:22:50,522][03784] Avg episode reward: [(0, '0.667')] [2024-03-21 00:22:52,373][04017] Updated weights for policy 0, policy_version 8374 (0.0011) [2024-03-21 00:22:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 45208.7). Total num frames: 274563072. Throughput: 0: 44900.0. Samples: 275878800. Policy #0 lag: (min: 0.0, avg: 24.4, max: 107.0) [2024-03-21 00:22:55,522][03784] Avg episode reward: [(0, '0.667')] [2024-03-21 00:22:58,122][04017] Updated weights for policy 0, policy_version 8384 (0.0010) [2024-03-21 00:23:00,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 274759680. Throughput: 0: 45024.4. Samples: 276015000. Policy #0 lag: (min: 0.0, avg: 24.4, max: 107.0) [2024-03-21 00:23:00,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 00:23:04,673][03995] Signal inference workers to stop experience collection... 
(5550 times) [2024-03-21 00:23:04,735][04017] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-03-21 00:23:04,735][03995] Signal inference workers to resume experience collection... (5550 times) [2024-03-21 00:23:04,791][04017] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-03-21 00:23:05,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 275054592. Throughput: 0: 46135.5. Samples: 276308700. Policy #0 lag: (min: 0.0, avg: 32.6, max: 70.0) [2024-03-21 00:23:05,522][03784] Avg episode reward: [(0, '0.709')] [2024-03-21 00:23:05,545][04017] Updated weights for policy 0, policy_version 8394 (0.0019) [2024-03-21 00:23:10,030][04017] Updated weights for policy 0, policy_version 8404 (0.0016) [2024-03-21 00:23:10,521][03784] Fps is (10 sec: 62259.4, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 275382272. Throughput: 0: 45871.2. Samples: 276570800. Policy #0 lag: (min: 0.0, avg: 34.9, max: 68.0) [2024-03-21 00:23:10,522][03784] Avg episode reward: [(0, '0.709')] [2024-03-21 00:23:15,521][03784] Fps is (10 sec: 52428.5, 60 sec: 44236.7, 300 sec: 46652.7). Total num frames: 275578880. Throughput: 0: 45988.8. Samples: 276714900. Policy #0 lag: (min: 0.0, avg: 34.9, max: 68.0) [2024-03-21 00:23:15,522][03784] Avg episode reward: [(0, '0.726')] [2024-03-21 00:23:17,071][04017] Updated weights for policy 0, policy_version 8414 (0.0013) [2024-03-21 00:23:20,521][03784] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 46986.0). Total num frames: 275906560. Throughput: 0: 45055.6. Samples: 276956800. Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 00:23:20,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 00:23:25,521][03784] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 45653.0). Total num frames: 275972096. Throughput: 0: 46220.1. Samples: 277269900. 
Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 00:23:25,522][03784] Avg episode reward: [(0, '1.137')] [2024-03-21 00:23:26,161][04017] Updated weights for policy 0, policy_version 8424 (0.0021) [2024-03-21 00:23:30,521][03784] Fps is (10 sec: 39322.0, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 276299776. Throughput: 0: 46346.7. Samples: 277408700. Policy #0 lag: (min: 1.0, avg: 41.2, max: 82.0) [2024-03-21 00:23:30,522][03784] Avg episode reward: [(0, '0.841')] [2024-03-21 00:23:31,861][04017] Updated weights for policy 0, policy_version 8434 (0.0015) [2024-03-21 00:23:35,521][03784] Fps is (10 sec: 49152.6, 60 sec: 44236.9, 300 sec: 45430.9). Total num frames: 276463616. Throughput: 0: 46460.1. Samples: 277679100. Policy #0 lag: (min: 0.0, avg: 30.7, max: 74.0) [2024-03-21 00:23:35,522][03784] Avg episode reward: [(0, '0.384')] [2024-03-21 00:23:39,027][04017] Updated weights for policy 0, policy_version 8444 (0.0010) [2024-03-21 00:23:40,521][03784] Fps is (10 sec: 45874.5, 60 sec: 46967.4, 300 sec: 45542.0). Total num frames: 276758528. Throughput: 0: 45962.1. Samples: 277947100. Policy #0 lag: (min: 0.0, avg: 30.7, max: 74.0) [2024-03-21 00:23:40,522][03784] Avg episode reward: [(0, '0.375')] [2024-03-21 00:23:45,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 276922368. Throughput: 0: 45860.0. Samples: 278078700. Policy #0 lag: (min: 2.0, avg: 35.2, max: 110.0) [2024-03-21 00:23:45,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 00:23:50,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 276987904. Throughput: 0: 45853.3. Samples: 278372100. 
Policy #0 lag: (min: 0.0, avg: 46.2, max: 108.0) [2024-03-21 00:23:50,522][03784] Avg episode reward: [(0, '0.540')] [2024-03-21 00:23:50,657][04017] Updated weights for policy 0, policy_version 8454 (0.0011) [2024-03-21 00:23:55,498][04017] Updated weights for policy 0, policy_version 8464 (0.0011) [2024-03-21 00:23:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 277348352. Throughput: 0: 45848.8. Samples: 278634000. Policy #0 lag: (min: 0.0, avg: 46.2, max: 108.0) [2024-03-21 00:23:55,522][03784] Avg episode reward: [(0, '0.506')] [2024-03-21 00:24:00,521][03784] Fps is (10 sec: 58982.6, 60 sec: 46967.4, 300 sec: 46097.4). Total num frames: 277577728. Throughput: 0: 45677.8. Samples: 278770400. Policy #0 lag: (min: 1.0, avg: 32.4, max: 74.0) [2024-03-21 00:24:00,522][03784] Avg episode reward: [(0, '0.742')] [2024-03-21 00:24:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008471_277577728.pth... [2024-03-21 00:24:00,715][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008132_266469376.pth [2024-03-21 00:24:01,992][04017] Updated weights for policy 0, policy_version 8474 (0.0012) [2024-03-21 00:24:04,806][03995] Signal inference workers to stop experience collection... (5600 times) [2024-03-21 00:24:04,807][03995] Signal inference workers to resume experience collection... (5600 times) [2024-03-21 00:24:04,846][04017] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-03-21 00:24:04,846][04017] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-03-21 00:24:05,521][03784] Fps is (10 sec: 49152.6, 60 sec: 46421.4, 300 sec: 46208.4). Total num frames: 277839872. Throughput: 0: 46620.1. Samples: 279054700. 
Policy #0 lag: (min: 0.0, avg: 31.9, max: 72.0) [2024-03-21 00:24:05,522][03784] Avg episode reward: [(0, '0.742')] [2024-03-21 00:24:09,593][04017] Updated weights for policy 0, policy_version 8484 (0.0027) [2024-03-21 00:24:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 46319.5). Total num frames: 278003712. Throughput: 0: 45973.3. Samples: 279338700. Policy #0 lag: (min: 0.0, avg: 31.9, max: 72.0) [2024-03-21 00:24:10,522][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 00:24:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 46097.4). Total num frames: 278265856. Throughput: 0: 45615.5. Samples: 279461400. Policy #0 lag: (min: 0.0, avg: 42.1, max: 89.0) [2024-03-21 00:24:15,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 00:24:16,901][04017] Updated weights for policy 0, policy_version 8494 (0.0016) [2024-03-21 00:24:20,250][04017] Updated weights for policy 0, policy_version 8504 (0.0016) [2024-03-21 00:24:20,521][03784] Fps is (10 sec: 65536.1, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 278659072. Throughput: 0: 45264.3. Samples: 279716000. Policy #0 lag: (min: 3.0, avg: 41.2, max: 71.0) [2024-03-21 00:24:20,522][03784] Avg episode reward: [(0, '0.449')] [2024-03-21 00:24:25,441][04017] Updated weights for policy 0, policy_version 8514 (0.0012) [2024-03-21 00:24:25,521][03784] Fps is (10 sec: 72089.3, 60 sec: 50244.3, 300 sec: 46430.6). Total num frames: 278986752. Throughput: 0: 45322.3. Samples: 279986600. Policy #0 lag: (min: 3.0, avg: 41.2, max: 71.0) [2024-03-21 00:24:25,522][03784] Avg episode reward: [(0, '1.274')] [2024-03-21 00:24:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 279085056. Throughput: 0: 45728.9. Samples: 280136500. Policy #0 lag: (min: 0.0, avg: 43.3, max: 82.0) [2024-03-21 00:24:30,522][03784] Avg episode reward: [(0, '0.782')] [2024-03-21 00:24:35,521][03784] Fps is (10 sec: 22937.7, 60 sec: 45875.2, 300 sec: 45653.1). 
Total num frames: 279216128. Throughput: 0: 45846.8. Samples: 280435200. Policy #0 lag: (min: 0.0, avg: 43.3, max: 82.0) [2024-03-21 00:24:35,522][03784] Avg episode reward: [(0, '0.680')] [2024-03-21 00:24:39,082][04017] Updated weights for policy 0, policy_version 8524 (0.0021) [2024-03-21 00:24:40,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43690.8, 300 sec: 45986.3). Total num frames: 279379968. Throughput: 0: 46544.5. Samples: 280728500. Policy #0 lag: (min: 0.0, avg: 32.2, max: 107.0) [2024-03-21 00:24:40,522][03784] Avg episode reward: [(0, '0.774')] [2024-03-21 00:24:45,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 279576576. Throughput: 0: 46420.0. Samples: 280859300. Policy #0 lag: (min: 0.0, avg: 32.2, max: 107.0) [2024-03-21 00:24:45,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 00:24:46,326][04017] Updated weights for policy 0, policy_version 8534 (0.0014) [2024-03-21 00:24:50,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 279805952. Throughput: 0: 46533.3. Samples: 281148700. Policy #0 lag: (min: 0.0, avg: 40.1, max: 112.0) [2024-03-21 00:24:50,522][03784] Avg episode reward: [(0, '0.828')] [2024-03-21 00:24:54,236][04017] Updated weights for policy 0, policy_version 8544 (0.0031) [2024-03-21 00:24:55,521][03784] Fps is (10 sec: 49152.4, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 280068096. Throughput: 0: 46249.0. Samples: 281419900. Policy #0 lag: (min: 0.0, avg: 33.2, max: 88.0) [2024-03-21 00:24:55,522][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 00:24:55,947][03995] Signal inference workers to stop experience collection... (5650 times) [2024-03-21 00:24:56,003][04017] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-03-21 00:24:56,016][03995] Signal inference workers to resume experience collection... 
(5650 times) [2024-03-21 00:24:56,049][04017] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-03-21 00:24:58,808][04017] Updated weights for policy 0, policy_version 8554 (0.0015) [2024-03-21 00:25:00,521][03784] Fps is (10 sec: 58981.8, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 280395776. Throughput: 0: 46739.9. Samples: 281564700. Policy #0 lag: (min: 0.0, avg: 33.2, max: 88.0) [2024-03-21 00:25:00,522][03784] Avg episode reward: [(0, '0.672')] [2024-03-21 00:25:03,469][04017] Updated weights for policy 0, policy_version 8564 (0.0011) [2024-03-21 00:25:05,521][03784] Fps is (10 sec: 55705.1, 60 sec: 46421.2, 300 sec: 46541.7). Total num frames: 280625152. Throughput: 0: 47020.0. Samples: 281831900. Policy #0 lag: (min: 1.0, avg: 42.1, max: 94.0) [2024-03-21 00:25:05,522][03784] Avg episode reward: [(0, '0.367')] [2024-03-21 00:25:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 280854528. Throughput: 0: 47317.7. Samples: 282115900. Policy #0 lag: (min: 1.0, avg: 43.3, max: 89.0) [2024-03-21 00:25:10,522][03784] Avg episode reward: [(0, '0.622')] [2024-03-21 00:25:12,653][04017] Updated weights for policy 0, policy_version 8574 (0.0011) [2024-03-21 00:25:15,521][03784] Fps is (10 sec: 45875.8, 60 sec: 46967.5, 300 sec: 45208.8). Total num frames: 281083904. Throughput: 0: 46949.0. Samples: 282249200. Policy #0 lag: (min: 1.0, avg: 43.3, max: 89.0) [2024-03-21 00:25:15,522][03784] Avg episode reward: [(0, '0.274')] [2024-03-21 00:25:20,521][03784] Fps is (10 sec: 36045.1, 60 sec: 42598.4, 300 sec: 44875.5). Total num frames: 281214976. Throughput: 0: 46713.3. Samples: 282537300. Policy #0 lag: (min: 0.0, avg: 37.0, max: 109.0) [2024-03-21 00:25:20,522][03784] Avg episode reward: [(0, '0.610')] [2024-03-21 00:25:21,552][04017] Updated weights for policy 0, policy_version 8584 (0.0016) [2024-03-21 00:25:25,521][03784] Fps is (10 sec: 39320.6, 60 sec: 41506.0, 300 sec: 45430.9). 
Total num frames: 281477120. Throughput: 0: 45964.2. Samples: 282796900. Policy #0 lag: (min: 0.0, avg: 37.0, max: 109.0) [2024-03-21 00:25:25,522][03784] Avg episode reward: [(0, '0.805')] [2024-03-21 00:25:28,501][04017] Updated weights for policy 0, policy_version 8594 (0.0011) [2024-03-21 00:25:30,521][03784] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 281640960. Throughput: 0: 46122.2. Samples: 282934800. Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 00:25:30,522][03784] Avg episode reward: [(0, '0.468')] [2024-03-21 00:25:35,521][03784] Fps is (10 sec: 42599.3, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 281903104. Throughput: 0: 46060.0. Samples: 283221400. Policy #0 lag: (min: 4.0, avg: 67.5, max: 116.0) [2024-03-21 00:25:35,522][03784] Avg episode reward: [(0, '0.468')] [2024-03-21 00:25:35,919][04017] Updated weights for policy 0, policy_version 8604 (0.0016) [2024-03-21 00:25:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 282034176. Throughput: 0: 46682.2. Samples: 283520600. Policy #0 lag: (min: 4.0, avg: 67.5, max: 116.0) [2024-03-21 00:25:40,522][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 00:25:43,025][04017] Updated weights for policy 0, policy_version 8614 (0.0011) [2024-03-21 00:25:45,264][03995] Signal inference workers to stop experience collection... (5700 times) [2024-03-21 00:25:45,357][04017] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-03-21 00:25:45,521][03784] Fps is (10 sec: 58982.8, 60 sec: 48606.0, 300 sec: 45986.3). Total num frames: 282492928. Throughput: 0: 46426.9. Samples: 283653900. Policy #0 lag: (min: 0.0, avg: 31.8, max: 72.0) [2024-03-21 00:25:45,521][03784] Avg episode reward: [(0, '0.339')] [2024-03-21 00:25:45,530][03995] Signal inference workers to resume experience collection... 
(5700 times) [2024-03-21 00:25:45,530][04017] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-03-21 00:25:47,895][04017] Updated weights for policy 0, policy_version 8624 (0.0011) [2024-03-21 00:25:50,521][03784] Fps is (10 sec: 68813.0, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 282722304. Throughput: 0: 46273.4. Samples: 283914200. Policy #0 lag: (min: 1.0, avg: 36.3, max: 64.0) [2024-03-21 00:25:50,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-21 00:25:55,521][03784] Fps is (10 sec: 26214.1, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 282755072. Throughput: 0: 47006.7. Samples: 284231200. Policy #0 lag: (min: 1.0, avg: 36.3, max: 64.0) [2024-03-21 00:25:55,522][03784] Avg episode reward: [(0, '0.684')] [2024-03-21 00:25:57,390][04017] Updated weights for policy 0, policy_version 8634 (0.0019) [2024-03-21 00:26:00,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 283017216. Throughput: 0: 46928.8. Samples: 284361000. Policy #0 lag: (min: 0.0, avg: 29.6, max: 76.0) [2024-03-21 00:26:00,522][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 00:26:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008637_283017216.pth... [2024-03-21 00:26:00,680][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008303_272072704.pth [2024-03-21 00:26:03,014][04017] Updated weights for policy 0, policy_version 8644 (0.0034) [2024-03-21 00:26:05,521][03784] Fps is (10 sec: 72089.7, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 283475968. Throughput: 0: 46673.3. Samples: 284637600. Policy #0 lag: (min: 4.0, avg: 38.5, max: 78.0) [2024-03-21 00:26:05,522][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 00:26:09,483][04017] Updated weights for policy 0, policy_version 8654 (0.0020) [2024-03-21 00:26:10,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45329.1, 300 sec: 45430.9). 
Total num frames: 283574272. Throughput: 0: 47033.5. Samples: 284913400. Policy #0 lag: (min: 4.0, avg: 38.5, max: 78.0) [2024-03-21 00:26:10,530][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 00:26:15,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 283770880. Throughput: 0: 47182.2. Samples: 285058000. Policy #0 lag: (min: 0.0, avg: 45.0, max: 92.0) [2024-03-21 00:26:15,522][03784] Avg episode reward: [(0, '1.048')] [2024-03-21 00:26:18,336][04017] Updated weights for policy 0, policy_version 8664 (0.0012) [2024-03-21 00:26:20,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 283934720. Throughput: 0: 46739.9. Samples: 285324700. Policy #0 lag: (min: 0.0, avg: 45.0, max: 92.0) [2024-03-21 00:26:20,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 00:26:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.2, 300 sec: 45097.6). Total num frames: 284196864. Throughput: 0: 46153.3. Samples: 285597500. Policy #0 lag: (min: 1.0, avg: 66.3, max: 109.0) [2024-03-21 00:26:25,522][03784] Avg episode reward: [(0, '1.050')] [2024-03-21 00:26:27,731][04017] Updated weights for policy 0, policy_version 8674 (0.0012) [2024-03-21 00:26:30,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 284327936. Throughput: 0: 46373.1. Samples: 285740700. Policy #0 lag: (min: 0.0, avg: 22.4, max: 52.0) [2024-03-21 00:26:30,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 00:26:34,981][04017] Updated weights for policy 0, policy_version 8684 (0.0017) [2024-03-21 00:26:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 284590080. Throughput: 0: 46866.6. Samples: 286023200. Policy #0 lag: (min: 0.0, avg: 22.4, max: 52.0) [2024-03-21 00:26:35,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 00:26:38,622][03995] Signal inference workers to stop experience collection... 
(5750 times) [2024-03-21 00:26:38,703][04017] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-03-21 00:26:38,928][03995] Signal inference workers to resume experience collection... (5750 times) [2024-03-21 00:26:38,928][04017] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-03-21 00:26:40,338][04017] Updated weights for policy 0, policy_version 8694 (0.0014) [2024-03-21 00:26:40,521][03784] Fps is (10 sec: 55705.8, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 284884992. Throughput: 0: 45275.5. Samples: 286268600. Policy #0 lag: (min: 0.0, avg: 67.9, max: 114.0) [2024-03-21 00:26:40,522][03784] Avg episode reward: [(0, '0.832')] [2024-03-21 00:26:45,521][03784] Fps is (10 sec: 58982.9, 60 sec: 44782.9, 300 sec: 46208.4). Total num frames: 285179904. Throughput: 0: 45246.7. Samples: 286397100. Policy #0 lag: (min: 0.0, avg: 67.9, max: 114.0) [2024-03-21 00:26:45,522][03784] Avg episode reward: [(0, '1.113')] [2024-03-21 00:26:45,838][04017] Updated weights for policy 0, policy_version 8704 (0.0016) [2024-03-21 00:26:50,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 285376512. Throughput: 0: 45286.6. Samples: 286675500. Policy #0 lag: (min: 3.0, avg: 60.6, max: 108.0) [2024-03-21 00:26:50,522][03784] Avg episode reward: [(0, '0.580')] [2024-03-21 00:26:53,418][04017] Updated weights for policy 0, policy_version 8714 (0.0011) [2024-03-21 00:26:55,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 285573120. Throughput: 0: 45426.7. Samples: 286957600. Policy #0 lag: (min: 1.0, avg: 36.7, max: 74.0) [2024-03-21 00:26:55,522][03784] Avg episode reward: [(0, '0.835')] [2024-03-21 00:27:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 285835264. Throughput: 0: 45168.9. Samples: 287090600. 
Policy #0 lag: (min: 1.0, avg: 36.7, max: 74.0) [2024-03-21 00:27:00,522][03784] Avg episode reward: [(0, '0.278')] [2024-03-21 00:27:00,625][04017] Updated weights for policy 0, policy_version 8724 (0.0015) [2024-03-21 00:27:03,972][04017] Updated weights for policy 0, policy_version 8734 (0.0042) [2024-03-21 00:27:05,521][03784] Fps is (10 sec: 65536.2, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 286228480. Throughput: 0: 44375.6. Samples: 287321600. Policy #0 lag: (min: 5.0, avg: 46.5, max: 114.0) [2024-03-21 00:27:05,522][03784] Avg episode reward: [(0, '0.725')] [2024-03-21 00:27:10,521][03784] Fps is (10 sec: 52428.6, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 286359552. Throughput: 0: 44168.9. Samples: 287585100. Policy #0 lag: (min: 1.0, avg: 37.2, max: 72.0) [2024-03-21 00:27:10,522][03784] Avg episode reward: [(0, '0.952')] [2024-03-21 00:27:15,521][03784] Fps is (10 sec: 26214.1, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 286490624. Throughput: 0: 44211.1. Samples: 287730200. Policy #0 lag: (min: 1.0, avg: 37.2, max: 72.0) [2024-03-21 00:27:15,522][03784] Avg episode reward: [(0, '0.469')] [2024-03-21 00:27:16,681][04017] Updated weights for policy 0, policy_version 8744 (0.0010) [2024-03-21 00:27:20,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 286654464. Throughput: 0: 44613.3. Samples: 288030800. Policy #0 lag: (min: 0.0, avg: 43.1, max: 114.0) [2024-03-21 00:27:20,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 00:27:23,768][04017] Updated weights for policy 0, policy_version 8754 (0.0010) [2024-03-21 00:27:25,521][03784] Fps is (10 sec: 36045.1, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 286851072. Throughput: 0: 45335.6. Samples: 288308700. Policy #0 lag: (min: 0.0, avg: 43.1, max: 114.0) [2024-03-21 00:27:25,522][03784] Avg episode reward: [(0, '0.728')] [2024-03-21 00:27:30,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44783.0, 300 sec: 44764.4). 
Total num frames: 287014912. Throughput: 0: 45604.3. Samples: 288449300. Policy #0 lag: (min: 0.0, avg: 32.3, max: 69.0) [2024-03-21 00:27:30,522][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 00:27:34,026][04017] Updated weights for policy 0, policy_version 8764 (0.0011) [2024-03-21 00:27:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 43144.6, 300 sec: 44875.5). Total num frames: 287178752. Throughput: 0: 45631.2. Samples: 288728900. Policy #0 lag: (min: 1.0, avg: 32.5, max: 84.0) [2024-03-21 00:27:35,522][03784] Avg episode reward: [(0, '0.525')] [2024-03-21 00:27:37,223][03995] Signal inference workers to stop experience collection... (5800 times) [2024-03-21 00:27:37,267][04017] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-03-21 00:27:37,514][03995] Signal inference workers to resume experience collection... (5800 times) [2024-03-21 00:27:37,514][04017] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-03-21 00:27:40,521][03784] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 45097.6). Total num frames: 287408128. Throughput: 0: 44971.0. Samples: 288981300. Policy #0 lag: (min: 1.0, avg: 32.5, max: 84.0) [2024-03-21 00:27:40,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 00:27:41,444][04017] Updated weights for policy 0, policy_version 8774 (0.0020) [2024-03-21 00:27:45,521][03784] Fps is (10 sec: 52428.6, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 287703040. Throughput: 0: 44435.6. Samples: 289090200. Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 00:27:45,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 00:27:46,836][04017] Updated weights for policy 0, policy_version 8784 (0.0017) [2024-03-21 00:27:49,881][04017] Updated weights for policy 0, policy_version 8794 (0.0016) [2024-03-21 00:27:50,521][03784] Fps is (10 sec: 81921.3, 60 sec: 47513.7, 300 sec: 46319.5). Total num frames: 288227328. Throughput: 0: 44728.9. Samples: 289334400. 
Policy #0 lag: (min: 8.0, avg: 44.2, max: 123.0) [2024-03-21 00:27:50,521][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 00:27:53,886][04017] Updated weights for policy 0, policy_version 8804 (0.0018) [2024-03-21 00:27:55,521][03784] Fps is (10 sec: 88473.9, 60 sec: 50244.3, 300 sec: 46874.9). Total num frames: 288587776. Throughput: 0: 44135.6. Samples: 289571200. Policy #0 lag: (min: 8.0, avg: 44.2, max: 123.0) [2024-03-21 00:27:55,522][03784] Avg episode reward: [(0, '0.348')] [2024-03-21 00:28:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 288751616. Throughput: 0: 44340.1. Samples: 289725500. Policy #0 lag: (min: 0.0, avg: 46.9, max: 88.0) [2024-03-21 00:28:00,522][03784] Avg episode reward: [(0, '0.401')] [2024-03-21 00:28:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008812_288751616.pth... [2024-03-21 00:28:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008471_277577728.pth [2024-03-21 00:28:02,822][04017] Updated weights for policy 0, policy_version 8814 (0.0010) [2024-03-21 00:28:05,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 45653.0). Total num frames: 288849920. Throughput: 0: 43940.1. Samples: 290008100. Policy #0 lag: (min: 0.0, avg: 48.4, max: 95.0) [2024-03-21 00:28:05,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 00:28:10,521][03784] Fps is (10 sec: 22937.6, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 288980992. Throughput: 0: 44251.1. Samples: 290300000. Policy #0 lag: (min: 0.0, avg: 48.4, max: 95.0) [2024-03-21 00:28:10,522][03784] Avg episode reward: [(0, '1.072')] [2024-03-21 00:28:15,521][03784] Fps is (10 sec: 19660.7, 60 sec: 42598.5, 300 sec: 44542.3). Total num frames: 289046528. Throughput: 0: 44280.1. Samples: 290441900. 
Policy #0 lag: (min: 0.0, avg: 35.3, max: 75.0) [2024-03-21 00:28:15,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 00:28:16,901][04017] Updated weights for policy 0, policy_version 8824 (0.0012) [2024-03-21 00:28:20,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 289275904. Throughput: 0: 44360.0. Samples: 290725100. Policy #0 lag: (min: 0.0, avg: 35.3, max: 75.0) [2024-03-21 00:28:20,522][03784] Avg episode reward: [(0, '0.652')] [2024-03-21 00:28:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 44542.3). Total num frames: 289439744. Throughput: 0: 44871.2. Samples: 291000500. Policy #0 lag: (min: 0.0, avg: 31.6, max: 76.0) [2024-03-21 00:28:25,522][03784] Avg episode reward: [(0, '0.444')] [2024-03-21 00:28:25,578][04017] Updated weights for policy 0, policy_version 8834 (0.0011) [2024-03-21 00:28:25,599][03995] Signal inference workers to stop experience collection... (5850 times) [2024-03-21 00:28:25,600][03995] Signal inference workers to resume experience collection... (5850 times) [2024-03-21 00:28:25,657][04017] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-03-21 00:28:25,658][04017] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-03-21 00:28:30,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 289669120. Throughput: 0: 45420.0. Samples: 291134100. Policy #0 lag: (min: 0.0, avg: 31.6, max: 76.0) [2024-03-21 00:28:30,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 00:28:31,805][04017] Updated weights for policy 0, policy_version 8844 (0.0019) [2024-03-21 00:28:35,285][04017] Updated weights for policy 0, policy_version 8854 (0.0024) [2024-03-21 00:28:35,521][03784] Fps is (10 sec: 68812.1, 60 sec: 49151.9, 300 sec: 45319.8). Total num frames: 290127872. Throughput: 0: 45535.4. Samples: 291383500. 
Policy #0 lag: (min: 1.0, avg: 22.4, max: 55.0) [2024-03-21 00:28:35,522][03784] Avg episode reward: [(0, '0.480')] [2024-03-21 00:28:39,666][04017] Updated weights for policy 0, policy_version 8864 (0.0011) [2024-03-21 00:28:40,521][03784] Fps is (10 sec: 81919.7, 60 sec: 51336.6, 300 sec: 45986.3). Total num frames: 290488320. Throughput: 0: 45920.0. Samples: 291637600. Policy #0 lag: (min: 2.0, avg: 53.4, max: 112.0) [2024-03-21 00:28:40,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 00:28:45,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48059.8, 300 sec: 46097.4). Total num frames: 290586624. Throughput: 0: 45615.6. Samples: 291778200. Policy #0 lag: (min: 2.0, avg: 53.4, max: 112.0) [2024-03-21 00:28:45,522][03784] Avg episode reward: [(0, '0.637')] [2024-03-21 00:28:49,544][04017] Updated weights for policy 0, policy_version 8874 (0.0016) [2024-03-21 00:28:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 45764.1). Total num frames: 290848768. Throughput: 0: 46088.8. Samples: 292082100. Policy #0 lag: (min: 1.0, avg: 32.8, max: 63.0) [2024-03-21 00:28:50,522][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 00:28:53,204][04017] Updated weights for policy 0, policy_version 8884 (0.0013) [2024-03-21 00:28:55,521][03784] Fps is (10 sec: 52428.6, 60 sec: 42052.3, 300 sec: 45875.2). Total num frames: 291110912. Throughput: 0: 46073.3. Samples: 292373300. Policy #0 lag: (min: 1.0, avg: 32.8, max: 63.0) [2024-03-21 00:28:55,530][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 00:29:00,521][03784] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 291373056. Throughput: 0: 46057.8. Samples: 292514500. Policy #0 lag: (min: 0.0, avg: 43.5, max: 104.0) [2024-03-21 00:29:00,528][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 00:29:05,521][03784] Fps is (10 sec: 29491.3, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 291405824. Throughput: 0: 46122.2. Samples: 292800600. 
Policy #0 lag: (min: 0.0, avg: 37.5, max: 72.0) [2024-03-21 00:29:05,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 00:29:06,109][04017] Updated weights for policy 0, policy_version 8894 (0.0017) [2024-03-21 00:29:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 291667968. Throughput: 0: 46166.6. Samples: 293078000. Policy #0 lag: (min: 0.0, avg: 37.5, max: 72.0) [2024-03-21 00:29:10,522][03784] Avg episode reward: [(0, '0.943')] [2024-03-21 00:29:11,755][03995] Signal inference workers to stop experience collection... (5900 times) [2024-03-21 00:29:11,756][03995] Signal inference workers to resume experience collection... (5900 times) [2024-03-21 00:29:11,815][04017] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-03-21 00:29:11,816][04017] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-03-21 00:29:12,093][04017] Updated weights for policy 0, policy_version 8904 (0.0012) [2024-03-21 00:29:15,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48605.9, 300 sec: 45097.7). Total num frames: 291962880. Throughput: 0: 46584.5. Samples: 293230400. Policy #0 lag: (min: 0.0, avg: 41.3, max: 89.0) [2024-03-21 00:29:15,530][03784] Avg episode reward: [(0, '0.943')] [2024-03-21 00:29:16,971][04017] Updated weights for policy 0, policy_version 8914 (0.0012) [2024-03-21 00:29:20,521][03784] Fps is (10 sec: 72089.4, 60 sec: 51882.6, 300 sec: 45430.9). Total num frames: 292388864. Throughput: 0: 47024.4. Samples: 293499600. Policy #0 lag: (min: 0.0, avg: 41.3, max: 89.0) [2024-03-21 00:29:20,530][03784] Avg episode reward: [(0, '0.332')] [2024-03-21 00:29:21,148][04017] Updated weights for policy 0, policy_version 8924 (0.0012) [2024-03-21 00:29:25,521][03784] Fps is (10 sec: 65535.0, 60 sec: 52974.8, 300 sec: 45875.2). Total num frames: 292618240. Throughput: 0: 48042.1. Samples: 293799500. 
Policy #0 lag: (min: 0.0, avg: 36.4, max: 74.0) [2024-03-21 00:29:25,522][03784] Avg episode reward: [(0, '0.332')] [2024-03-21 00:29:29,381][04017] Updated weights for policy 0, policy_version 8934 (0.0015) [2024-03-21 00:29:30,521][03784] Fps is (10 sec: 36044.8, 60 sec: 51336.4, 300 sec: 45875.2). Total num frames: 292749312. Throughput: 0: 48246.6. Samples: 293949300. Policy #0 lag: (min: 0.0, avg: 37.4, max: 92.0) [2024-03-21 00:29:30,522][03784] Avg episode reward: [(0, '0.679')] [2024-03-21 00:29:35,521][03784] Fps is (10 sec: 39322.1, 60 sec: 48059.8, 300 sec: 46208.4). Total num frames: 293011456. Throughput: 0: 47975.6. Samples: 294241000. Policy #0 lag: (min: 0.0, avg: 37.4, max: 92.0) [2024-03-21 00:29:35,522][03784] Avg episode reward: [(0, '0.363')] [2024-03-21 00:29:37,739][04017] Updated weights for policy 0, policy_version 8944 (0.0015) [2024-03-21 00:29:40,521][03784] Fps is (10 sec: 39322.0, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 293142528. Throughput: 0: 48180.0. Samples: 294541400. Policy #0 lag: (min: 0.0, avg: 44.6, max: 91.0) [2024-03-21 00:29:40,522][03784] Avg episode reward: [(0, '0.363')] [2024-03-21 00:29:45,358][04017] Updated weights for policy 0, policy_version 8954 (0.0020) [2024-03-21 00:29:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 293404672. Throughput: 0: 48073.4. Samples: 294677800. Policy #0 lag: (min: 1.0, avg: 45.2, max: 87.0) [2024-03-21 00:29:45,522][03784] Avg episode reward: [(0, '1.137')] [2024-03-21 00:29:50,521][03784] Fps is (10 sec: 36044.2, 60 sec: 44236.7, 300 sec: 45541.9). Total num frames: 293502976. Throughput: 0: 48342.1. Samples: 294976000. Policy #0 lag: (min: 1.0, avg: 45.2, max: 87.0) [2024-03-21 00:29:50,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 00:29:53,210][04017] Updated weights for policy 0, policy_version 8964 (0.0012) [2024-03-21 00:29:55,521][03784] Fps is (10 sec: 36044.3, 60 sec: 44236.7, 300 sec: 45319.8). 
Total num frames: 293765120. Throughput: 0: 48344.4. Samples: 295253500. Policy #0 lag: (min: 1.0, avg: 41.4, max: 88.0) [2024-03-21 00:29:55,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 00:29:59,899][04017] Updated weights for policy 0, policy_version 8974 (0.0018) [2024-03-21 00:30:00,521][03784] Fps is (10 sec: 58981.6, 60 sec: 45328.9, 300 sec: 45653.0). Total num frames: 294092800. Throughput: 0: 47928.6. Samples: 295387200. Policy #0 lag: (min: 5.0, avg: 40.3, max: 93.0) [2024-03-21 00:30:00,523][03784] Avg episode reward: [(0, '0.895')] [2024-03-21 00:30:00,796][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008976_294125568.pth... [2024-03-21 00:30:00,910][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008637_283017216.pth [2024-03-21 00:30:03,790][03995] Signal inference workers to stop experience collection... (5950 times) [2024-03-21 00:30:03,790][03995] Signal inference workers to resume experience collection... (5950 times) [2024-03-21 00:30:03,832][04017] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-03-21 00:30:03,833][04017] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-03-21 00:30:04,140][04017] Updated weights for policy 0, policy_version 8984 (0.0018) [2024-03-21 00:30:05,521][03784] Fps is (10 sec: 65536.3, 60 sec: 50244.2, 300 sec: 45986.3). Total num frames: 294420480. Throughput: 0: 47635.6. Samples: 295643200. Policy #0 lag: (min: 5.0, avg: 40.3, max: 93.0) [2024-03-21 00:30:05,522][03784] Avg episode reward: [(0, '0.571')] [2024-03-21 00:30:10,521][03784] Fps is (10 sec: 58984.1, 60 sec: 50244.3, 300 sec: 46097.4). Total num frames: 294682624. Throughput: 0: 47557.9. Samples: 295939600. 
Policy #0 lag: (min: 2.0, avg: 57.3, max: 117.0) [2024-03-21 00:30:10,522][03784] Avg episode reward: [(0, '1.134')] [2024-03-21 00:30:11,062][04017] Updated weights for policy 0, policy_version 8994 (0.0011) [2024-03-21 00:30:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.0, 300 sec: 46541.7). Total num frames: 294944768. Throughput: 0: 47402.2. Samples: 296082400. Policy #0 lag: (min: 2.0, avg: 57.3, max: 117.0) [2024-03-21 00:30:15,522][03784] Avg episode reward: [(0, '1.134')] [2024-03-21 00:30:16,604][04017] Updated weights for policy 0, policy_version 9004 (0.0015) [2024-03-21 00:30:20,521][03784] Fps is (10 sec: 49151.5, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 295174144. Throughput: 0: 47051.0. Samples: 296358300. Policy #0 lag: (min: 0.0, avg: 44.1, max: 100.0) [2024-03-21 00:30:20,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 00:30:25,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 46319.5). Total num frames: 295305216. Throughput: 0: 46466.6. Samples: 296632400. Policy #0 lag: (min: 0.0, avg: 35.0, max: 73.0) [2024-03-21 00:30:25,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 00:30:27,751][04017] Updated weights for policy 0, policy_version 9014 (0.0014) [2024-03-21 00:30:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45875.2, 300 sec: 46097.3). Total num frames: 295501824. Throughput: 0: 46588.8. Samples: 296774300. Policy #0 lag: (min: 0.0, avg: 35.0, max: 73.0) [2024-03-21 00:30:30,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 00:30:34,791][04017] Updated weights for policy 0, policy_version 9024 (0.0016) [2024-03-21 00:30:35,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 295763968. Throughput: 0: 45964.5. Samples: 297044400. 
Policy #0 lag: (min: 0.0, avg: 26.1, max: 72.0) [2024-03-21 00:30:35,522][03784] Avg episode reward: [(0, '0.495')] [2024-03-21 00:30:39,665][04017] Updated weights for policy 0, policy_version 9034 (0.0010) [2024-03-21 00:30:40,521][03784] Fps is (10 sec: 52429.2, 60 sec: 48059.7, 300 sec: 45875.2). Total num frames: 296026112. Throughput: 0: 46033.4. Samples: 297325000. Policy #0 lag: (min: 0.0, avg: 26.1, max: 72.0) [2024-03-21 00:30:40,523][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 00:30:45,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 296222720. Throughput: 0: 46291.3. Samples: 297470300. Policy #0 lag: (min: 0.0, avg: 54.8, max: 119.0) [2024-03-21 00:30:45,522][03784] Avg episode reward: [(0, '0.653')] [2024-03-21 00:30:48,376][04017] Updated weights for policy 0, policy_version 9044 (0.0011) [2024-03-21 00:30:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48605.9, 300 sec: 46319.5). Total num frames: 296419328. Throughput: 0: 46817.8. Samples: 297750000. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 00:30:50,522][03784] Avg episode reward: [(0, '0.528')] [2024-03-21 00:30:54,915][04017] Updated weights for policy 0, policy_version 9054 (0.0020) [2024-03-21 00:30:55,521][03784] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 296714240. Throughput: 0: 46133.2. Samples: 298015600. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 00:30:55,523][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 00:31:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.5, 300 sec: 45430.9). Total num frames: 296878080. Throughput: 0: 45860.0. Samples: 298146100. Policy #0 lag: (min: 0.0, avg: 51.6, max: 108.0) [2024-03-21 00:31:00,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 00:31:02,547][04017] Updated weights for policy 0, policy_version 9064 (0.0012) [2024-03-21 00:31:05,521][03784] Fps is (10 sec: 32768.3, 60 sec: 43690.7, 300 sec: 45653.0). 
Total num frames: 297041920. Throughput: 0: 45100.1. Samples: 298387800. Policy #0 lag: (min: 0.0, avg: 51.6, max: 108.0) [2024-03-21 00:31:05,522][03784] Avg episode reward: [(0, '0.713')] [2024-03-21 00:31:10,521][03784] Fps is (10 sec: 26214.4, 60 sec: 40960.0, 300 sec: 45319.8). Total num frames: 297140224. Throughput: 0: 44975.6. Samples: 298656300. Policy #0 lag: (min: 0.0, avg: 24.2, max: 65.0) [2024-03-21 00:31:10,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 00:31:12,418][03995] Signal inference workers to stop experience collection... (6000 times) [2024-03-21 00:31:12,486][03995] Signal inference workers to resume experience collection... (6000 times) [2024-03-21 00:31:12,503][04017] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-03-21 00:31:12,549][04017] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-03-21 00:31:14,119][04017] Updated weights for policy 0, policy_version 9074 (0.0011) [2024-03-21 00:31:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 40960.0, 300 sec: 45653.1). Total num frames: 297402368. Throughput: 0: 44813.4. Samples: 298790900. Policy #0 lag: (min: 0.0, avg: 24.2, max: 65.0) [2024-03-21 00:31:15,522][03784] Avg episode reward: [(0, '0.490')] [2024-03-21 00:31:20,204][04017] Updated weights for policy 0, policy_version 9084 (0.0014) [2024-03-21 00:31:20,521][03784] Fps is (10 sec: 52428.4, 60 sec: 41506.1, 300 sec: 45653.0). Total num frames: 297664512. Throughput: 0: 44115.5. Samples: 299029600. Policy #0 lag: (min: 0.0, avg: 53.7, max: 110.0) [2024-03-21 00:31:20,522][03784] Avg episode reward: [(0, '0.293')] [2024-03-21 00:31:25,223][04017] Updated weights for policy 0, policy_version 9094 (0.0011) [2024-03-21 00:31:25,521][03784] Fps is (10 sec: 58982.3, 60 sec: 44783.0, 300 sec: 46319.5). Total num frames: 297992192. Throughput: 0: 43811.1. Samples: 299296500. 
Policy #0 lag: (min: 0.0, avg: 53.7, max: 110.0) [2024-03-21 00:31:25,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 00:31:30,521][03784] Fps is (10 sec: 62259.8, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 298287104. Throughput: 0: 43448.9. Samples: 299425500. Policy #0 lag: (min: 2.0, avg: 55.2, max: 107.0) [2024-03-21 00:31:30,522][03784] Avg episode reward: [(0, '0.511')] [2024-03-21 00:31:31,494][04017] Updated weights for policy 0, policy_version 9104 (0.0014) [2024-03-21 00:31:35,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 298483712. Throughput: 0: 43151.1. Samples: 299691800. Policy #0 lag: (min: 2.0, avg: 55.2, max: 107.0) [2024-03-21 00:31:35,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 00:31:39,680][04017] Updated weights for policy 0, policy_version 9114 (0.0026) [2024-03-21 00:31:40,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 298680320. Throughput: 0: 43453.5. Samples: 299971000. Policy #0 lag: (min: 0.0, avg: 55.1, max: 105.0) [2024-03-21 00:31:40,531][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 00:31:45,521][03784] Fps is (10 sec: 29491.3, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 298778624. Throughput: 0: 43520.0. Samples: 300104500. Policy #0 lag: (min: 0.0, avg: 55.1, max: 105.0) [2024-03-21 00:31:45,522][03784] Avg episode reward: [(0, '1.200')] [2024-03-21 00:31:47,609][04017] Updated weights for policy 0, policy_version 9124 (0.0011) [2024-03-21 00:31:50,521][03784] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 299139072. Throughput: 0: 44073.3. Samples: 300371100. Policy #0 lag: (min: 0.0, avg: 39.0, max: 73.0) [2024-03-21 00:31:50,522][03784] Avg episode reward: [(0, '1.144')] [2024-03-21 00:31:55,034][04017] Updated weights for policy 0, policy_version 9134 (0.0013) [2024-03-21 00:31:55,521][03784] Fps is (10 sec: 52429.1, 60 sec: 43144.6, 300 sec: 45653.1). 
Total num frames: 299302912. Throughput: 0: 44000.1. Samples: 300636300. Policy #0 lag: (min: 0.0, avg: 42.2, max: 104.0) [2024-03-21 00:31:55,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 00:32:00,521][03784] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 299499520. Throughput: 0: 43782.2. Samples: 300761100. Policy #0 lag: (min: 0.0, avg: 42.2, max: 104.0) [2024-03-21 00:32:00,522][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 00:32:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009140_299499520.pth... [2024-03-21 00:32:00,691][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008812_288751616.pth [2024-03-21 00:32:02,510][04017] Updated weights for policy 0, policy_version 9144 (0.0012) [2024-03-21 00:32:05,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 299663360. Throughput: 0: 44820.1. Samples: 301046500. Policy #0 lag: (min: 1.0, avg: 30.0, max: 69.0) [2024-03-21 00:32:05,522][03784] Avg episode reward: [(0, '0.443')] [2024-03-21 00:32:08,155][03995] Signal inference workers to stop experience collection... (6050 times) [2024-03-21 00:32:08,188][04017] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-03-21 00:32:08,465][03995] Signal inference workers to resume experience collection... (6050 times) [2024-03-21 00:32:08,466][04017] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-03-21 00:32:09,984][04017] Updated weights for policy 0, policy_version 9154 (0.0010) [2024-03-21 00:32:10,521][03784] Fps is (10 sec: 45874.5, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 299958272. Throughput: 0: 45077.6. Samples: 301325000. Policy #0 lag: (min: 1.0, avg: 30.0, max: 69.0) [2024-03-21 00:32:10,522][03784] Avg episode reward: [(0, '0.781')] [2024-03-21 00:32:15,521][03784] Fps is (10 sec: 55705.5, 60 sec: 46967.4, 300 sec: 45986.3). 
Total num frames: 300220416. Throughput: 0: 44891.1. Samples: 301445600. Policy #0 lag: (min: 0.0, avg: 32.6, max: 69.0) [2024-03-21 00:32:15,522][03784] Avg episode reward: [(0, '0.457')] [2024-03-21 00:32:17,172][04017] Updated weights for policy 0, policy_version 9164 (0.0014) [2024-03-21 00:32:20,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45875.3, 300 sec: 45986.3). Total num frames: 300417024. Throughput: 0: 45337.8. Samples: 301732000. Policy #0 lag: (min: 1.0, avg: 53.1, max: 106.0) [2024-03-21 00:32:20,522][03784] Avg episode reward: [(0, '0.473')] [2024-03-21 00:32:25,472][04017] Updated weights for policy 0, policy_version 9174 (0.0015) [2024-03-21 00:32:25,521][03784] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 46097.4). Total num frames: 300613632. Throughput: 0: 45357.7. Samples: 302012100. Policy #0 lag: (min: 1.0, avg: 53.1, max: 106.0) [2024-03-21 00:32:25,522][03784] Avg episode reward: [(0, '0.473')] [2024-03-21 00:32:30,521][03784] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 46319.5). Total num frames: 300843008. Throughput: 0: 45224.5. Samples: 302139600. Policy #0 lag: (min: 0.0, avg: 41.0, max: 85.0) [2024-03-21 00:32:30,522][03784] Avg episode reward: [(0, '0.594')] [2024-03-21 00:32:32,054][04017] Updated weights for policy 0, policy_version 9184 (0.0011) [2024-03-21 00:32:35,521][03784] Fps is (10 sec: 36044.6, 60 sec: 41506.1, 300 sec: 45986.3). Total num frames: 300974080. Throughput: 0: 45746.6. Samples: 302429700. Policy #0 lag: (min: 0.0, avg: 41.0, max: 85.0) [2024-03-21 00:32:35,522][03784] Avg episode reward: [(0, '1.140')] [2024-03-21 00:32:39,693][04017] Updated weights for policy 0, policy_version 9194 (0.0011) [2024-03-21 00:32:40,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 301301760. Throughput: 0: 45937.7. Samples: 302703500. 
Policy #0 lag: (min: 0.0, avg: 41.9, max: 82.0) [2024-03-21 00:32:40,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 00:32:45,521][03784] Fps is (10 sec: 49152.2, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 301465600. Throughput: 0: 46368.8. Samples: 302847700. Policy #0 lag: (min: 0.0, avg: 33.5, max: 72.0) [2024-03-21 00:32:45,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 00:32:47,001][04017] Updated weights for policy 0, policy_version 9204 (0.0011) [2024-03-21 00:32:50,521][03784] Fps is (10 sec: 58981.5, 60 sec: 45875.1, 300 sec: 45097.6). Total num frames: 301891584. Throughput: 0: 46266.5. Samples: 303128500. Policy #0 lag: (min: 0.0, avg: 33.5, max: 72.0) [2024-03-21 00:32:50,522][03784] Avg episode reward: [(0, '0.977')] [2024-03-21 00:32:50,776][04017] Updated weights for policy 0, policy_version 9214 (0.0018) [2024-03-21 00:32:54,565][03995] Signal inference workers to stop experience collection... (6100 times) [2024-03-21 00:32:54,687][04017] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-03-21 00:32:54,774][03995] Signal inference workers to resume experience collection... (6100 times) [2024-03-21 00:32:54,774][04017] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-03-21 00:32:55,521][03784] Fps is (10 sec: 75367.5, 60 sec: 48605.9, 300 sec: 45653.1). Total num frames: 302219264. Throughput: 0: 45786.9. Samples: 303385400. Policy #0 lag: (min: 0.0, avg: 47.3, max: 87.0) [2024-03-21 00:32:55,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 00:32:56,838][04017] Updated weights for policy 0, policy_version 9224 (0.0011) [2024-03-21 00:33:00,521][03784] Fps is (10 sec: 58983.1, 60 sec: 49698.1, 300 sec: 46208.4). Total num frames: 302481408. Throughput: 0: 46391.1. Samples: 303533200. 
Policy #0 lag: (min: 0.0, avg: 47.3, max: 87.0) [2024-03-21 00:33:00,522][03784] Avg episode reward: [(0, '1.170')] [2024-03-21 00:33:02,973][04017] Updated weights for policy 0, policy_version 9234 (0.0012) [2024-03-21 00:33:05,521][03784] Fps is (10 sec: 58981.7, 60 sec: 52428.8, 300 sec: 46874.9). Total num frames: 302809088. Throughput: 0: 46328.9. Samples: 303816800. Policy #0 lag: (min: 0.0, avg: 40.7, max: 74.0) [2024-03-21 00:33:05,522][03784] Avg episode reward: [(0, '0.909')] [2024-03-21 00:33:10,521][03784] Fps is (10 sec: 39321.7, 60 sec: 48606.0, 300 sec: 46874.9). Total num frames: 302874624. Throughput: 0: 46746.7. Samples: 304115700. Policy #0 lag: (min: 0.0, avg: 40.7, max: 74.0) [2024-03-21 00:33:10,522][03784] Avg episode reward: [(0, '0.909')] [2024-03-21 00:33:15,521][03784] Fps is (10 sec: 6553.6, 60 sec: 44236.8, 300 sec: 46097.3). Total num frames: 302874624. Throughput: 0: 47426.6. Samples: 304273800. Policy #0 lag: (min: 0.0, avg: 40.7, max: 74.0) [2024-03-21 00:33:15,522][03784] Avg episode reward: [(0, '1.426')] [2024-03-21 00:33:16,246][04017] Updated weights for policy 0, policy_version 9244 (0.0010) [2024-03-21 00:33:20,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44782.9, 300 sec: 46319.5). Total num frames: 303104000. Throughput: 0: 47551.2. Samples: 304569500. Policy #0 lag: (min: 0.0, avg: 40.5, max: 92.0) [2024-03-21 00:33:20,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 00:33:23,401][04017] Updated weights for policy 0, policy_version 9254 (0.0013) [2024-03-21 00:33:25,521][03784] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 303333376. Throughput: 0: 47513.2. Samples: 304841600. Policy #0 lag: (min: 0.0, avg: 28.7, max: 68.0) [2024-03-21 00:33:25,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 00:33:27,738][04017] Updated weights for policy 0, policy_version 9264 (0.0035) [2024-03-21 00:33:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46967.4, 300 sec: 45875.2). 
Total num frames: 303661056. Throughput: 0: 47208.9. Samples: 304972100. Policy #0 lag: (min: 0.0, avg: 28.7, max: 68.0) [2024-03-21 00:33:30,522][03784] Avg episode reward: [(0, '1.313')] [2024-03-21 00:33:35,436][04017] Updated weights for policy 0, policy_version 9274 (0.0019) [2024-03-21 00:33:35,521][03784] Fps is (10 sec: 55706.3, 60 sec: 48605.9, 300 sec: 45430.9). Total num frames: 303890432. Throughput: 0: 47820.1. Samples: 305280400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 87.0) [2024-03-21 00:33:35,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 00:33:39,427][04017] Updated weights for policy 0, policy_version 9284 (0.0016) [2024-03-21 00:33:40,521][03784] Fps is (10 sec: 62259.4, 60 sec: 49698.1, 300 sec: 46430.6). Total num frames: 304283648. Throughput: 0: 47957.7. Samples: 305543500. Policy #0 lag: (min: 0.0, avg: 34.9, max: 87.0) [2024-03-21 00:33:40,522][03784] Avg episode reward: [(0, '1.187')] [2024-03-21 00:33:42,195][03995] Signal inference workers to stop experience collection... (6150 times) [2024-03-21 00:33:42,196][03995] Signal inference workers to resume experience collection... (6150 times) [2024-03-21 00:33:42,308][04017] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-03-21 00:33:42,309][04017] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-03-21 00:33:44,233][04017] Updated weights for policy 0, policy_version 9294 (0.0019) [2024-03-21 00:33:45,521][03784] Fps is (10 sec: 75366.6, 60 sec: 52975.0, 300 sec: 46763.8). Total num frames: 304644096. Throughput: 0: 47508.9. Samples: 305671100. Policy #0 lag: (min: 0.0, avg: 38.1, max: 72.0) [2024-03-21 00:33:45,522][03784] Avg episode reward: [(0, '0.919')] [2024-03-21 00:33:49,296][04017] Updated weights for policy 0, policy_version 9304 (0.0022) [2024-03-21 00:33:50,521][03784] Fps is (10 sec: 58981.8, 60 sec: 49698.2, 300 sec: 46652.7). Total num frames: 304873472. Throughput: 0: 47471.1. Samples: 305953000. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 72.0) [2024-03-21 00:33:50,522][03784] Avg episode reward: [(0, '0.401')] [2024-03-21 00:33:55,521][03784] Fps is (10 sec: 42598.0, 60 sec: 47513.5, 300 sec: 46430.6). Total num frames: 305070080. Throughput: 0: 47335.5. Samples: 306245800. Policy #0 lag: (min: 0.0, avg: 49.6, max: 104.0) [2024-03-21 00:33:55,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 00:34:00,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.7, 300 sec: 46541.6). Total num frames: 305135616. Throughput: 0: 47271.0. Samples: 306401000. Policy #0 lag: (min: 0.0, avg: 49.6, max: 104.0) [2024-03-21 00:34:00,522][03784] Avg episode reward: [(0, '0.767')] [2024-03-21 00:34:00,688][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009313_305168384.pth... [2024-03-21 00:34:00,809][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000008976_294125568.pth [2024-03-21 00:34:01,164][04017] Updated weights for policy 0, policy_version 9314 (0.0016) [2024-03-21 00:34:05,521][03784] Fps is (10 sec: 22937.7, 60 sec: 41506.1, 300 sec: 46208.4). Total num frames: 305299456. Throughput: 0: 47111.1. Samples: 306689500. Policy #0 lag: (min: 0.0, avg: 47.6, max: 89.0) [2024-03-21 00:34:05,522][03784] Avg episode reward: [(0, '0.709')] [2024-03-21 00:34:10,521][03784] Fps is (10 sec: 36045.2, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 305496064. Throughput: 0: 47569.0. Samples: 306982200. Policy #0 lag: (min: 2.0, avg: 28.1, max: 74.0) [2024-03-21 00:34:10,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 00:34:11,463][04017] Updated weights for policy 0, policy_version 9324 (0.0010) [2024-03-21 00:34:15,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 45319.8). Total num frames: 305758208. Throughput: 0: 47606.7. Samples: 307114400. 
Policy #0 lag: (min: 2.0, avg: 28.1, max: 74.0) [2024-03-21 00:34:15,522][03784] Avg episode reward: [(0, '1.243')] [2024-03-21 00:34:17,747][04017] Updated weights for policy 0, policy_version 9334 (0.0011) [2024-03-21 00:34:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46421.4, 300 sec: 44986.6). Total num frames: 305889280. Throughput: 0: 46597.8. Samples: 307377300. Policy #0 lag: (min: 0.0, avg: 33.7, max: 78.0) [2024-03-21 00:34:20,522][03784] Avg episode reward: [(0, '0.475')] [2024-03-21 00:34:23,767][04017] Updated weights for policy 0, policy_version 9344 (0.0011) [2024-03-21 00:34:25,521][03784] Fps is (10 sec: 58982.5, 60 sec: 50244.4, 300 sec: 46097.4). Total num frames: 306348032. Throughput: 0: 46377.8. Samples: 307630500. Policy #0 lag: (min: 2.0, avg: 41.7, max: 89.0) [2024-03-21 00:34:25,522][03784] Avg episode reward: [(0, '0.539')] [2024-03-21 00:34:28,362][04017] Updated weights for policy 0, policy_version 9354 (0.0012) [2024-03-21 00:34:30,521][03784] Fps is (10 sec: 72089.9, 60 sec: 49152.0, 300 sec: 46097.4). Total num frames: 306610176. Throughput: 0: 46653.3. Samples: 307770500. Policy #0 lag: (min: 2.0, avg: 41.7, max: 89.0) [2024-03-21 00:34:30,522][03784] Avg episode reward: [(0, '0.756')] [2024-03-21 00:34:35,130][04017] Updated weights for policy 0, policy_version 9364 (0.0010) [2024-03-21 00:34:35,521][03784] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 46430.6). Total num frames: 306839552. Throughput: 0: 46657.9. Samples: 308052600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 92.0) [2024-03-21 00:34:35,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-21 00:34:40,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 306970624. Throughput: 0: 46397.8. Samples: 308333700. Policy #0 lag: (min: 0.0, avg: 38.7, max: 92.0) [2024-03-21 00:34:40,522][03784] Avg episode reward: [(0, '0.850')] [2024-03-21 00:34:42,915][03995] Signal inference workers to stop experience collection... 
(6200 times) [2024-03-21 00:34:42,916][03995] Signal inference workers to resume experience collection... (6200 times) [2024-03-21 00:34:42,987][04017] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-03-21 00:34:42,987][04017] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-03-21 00:34:43,835][04017] Updated weights for policy 0, policy_version 9374 (0.0017) [2024-03-21 00:34:45,521][03784] Fps is (10 sec: 45874.8, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 307298304. Throughput: 0: 45809.0. Samples: 308462400. Policy #0 lag: (min: 3.0, avg: 42.0, max: 85.0) [2024-03-21 00:34:45,522][03784] Avg episode reward: [(0, '0.851')] [2024-03-21 00:34:48,092][04017] Updated weights for policy 0, policy_version 9384 (0.0016) [2024-03-21 00:34:50,521][03784] Fps is (10 sec: 62259.5, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 307593216. Throughput: 0: 45749.0. Samples: 308748200. Policy #0 lag: (min: 0.0, avg: 53.2, max: 109.0) [2024-03-21 00:34:50,522][03784] Avg episode reward: [(0, '1.264')] [2024-03-21 00:34:55,098][04017] Updated weights for policy 0, policy_version 9394 (0.0018) [2024-03-21 00:34:55,521][03784] Fps is (10 sec: 55705.5, 60 sec: 46421.4, 300 sec: 46652.8). Total num frames: 307855360. Throughput: 0: 45428.9. Samples: 309026500. Policy #0 lag: (min: 0.0, avg: 53.2, max: 109.0) [2024-03-21 00:34:55,522][03784] Avg episode reward: [(0, '0.755')] [2024-03-21 00:35:00,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45875.3, 300 sec: 45653.0). Total num frames: 307888128. Throughput: 0: 45940.0. Samples: 309181700. Policy #0 lag: (min: 0.0, avg: 53.2, max: 109.0) [2024-03-21 00:35:00,522][03784] Avg episode reward: [(0, '0.343')] [2024-03-21 00:35:04,362][04017] Updated weights for policy 0, policy_version 9404 (0.0016) [2024-03-21 00:35:05,521][03784] Fps is (10 sec: 29491.3, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 308150272. Throughput: 0: 46624.5. Samples: 309475400. 
Policy #0 lag: (min: 0.0, avg: 37.3, max: 71.0) [2024-03-21 00:35:05,522][03784] Avg episode reward: [(0, '0.412')] [2024-03-21 00:35:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 308248576. Throughput: 0: 47164.5. Samples: 309752900. Policy #0 lag: (min: 0.0, avg: 27.6, max: 74.0) [2024-03-21 00:35:10,522][03784] Avg episode reward: [(0, '0.466')] [2024-03-21 00:35:15,422][04017] Updated weights for policy 0, policy_version 9414 (0.0011) [2024-03-21 00:35:15,521][03784] Fps is (10 sec: 32768.2, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 308477952. Throughput: 0: 46915.6. Samples: 309881700. Policy #0 lag: (min: 0.0, avg: 27.6, max: 74.0) [2024-03-21 00:35:15,522][03784] Avg episode reward: [(0, '0.805')] [2024-03-21 00:35:19,662][04017] Updated weights for policy 0, policy_version 9424 (0.0012) [2024-03-21 00:35:20,525][03784] Fps is (10 sec: 65512.9, 60 sec: 50241.4, 300 sec: 46096.8). Total num frames: 308903936. Throughput: 0: 46149.7. Samples: 310129500. Policy #0 lag: (min: 2.0, avg: 37.0, max: 81.0) [2024-03-21 00:35:20,525][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 00:35:25,521][03784] Fps is (10 sec: 55705.0, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 309035008. Throughput: 0: 46102.2. Samples: 310408300. Policy #0 lag: (min: 2.0, avg: 37.0, max: 81.0) [2024-03-21 00:35:25,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 00:35:27,622][04017] Updated weights for policy 0, policy_version 9434 (0.0012) [2024-03-21 00:35:30,521][03784] Fps is (10 sec: 39335.3, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 309297152. Throughput: 0: 46353.3. Samples: 310548300. Policy #0 lag: (min: 0.0, avg: 39.5, max: 111.0) [2024-03-21 00:35:30,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 00:35:34,468][03995] Signal inference workers to stop experience collection... 
(6250 times) [2024-03-21 00:35:34,513][04017] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-03-21 00:35:34,758][03995] Signal inference workers to resume experience collection... (6250 times) [2024-03-21 00:35:34,758][04017] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-03-21 00:35:35,521][03784] Fps is (10 sec: 39321.4, 60 sec: 43144.4, 300 sec: 45430.9). Total num frames: 309428224. Throughput: 0: 46559.9. Samples: 310843400. Policy #0 lag: (min: 1.0, avg: 33.7, max: 70.0) [2024-03-21 00:35:35,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 00:35:35,909][04017] Updated weights for policy 0, policy_version 9444 (0.0010) [2024-03-21 00:35:39,601][04017] Updated weights for policy 0, policy_version 9454 (0.0025) [2024-03-21 00:35:40,521][03784] Fps is (10 sec: 55706.1, 60 sec: 48059.8, 300 sec: 46208.5). Total num frames: 309854208. Throughput: 0: 45264.6. Samples: 311063400. Policy #0 lag: (min: 1.0, avg: 33.7, max: 70.0) [2024-03-21 00:35:40,521][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 00:35:45,521][03784] Fps is (10 sec: 55705.9, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 309985280. Throughput: 0: 44853.3. Samples: 311200100. Policy #0 lag: (min: 1.0, avg: 33.7, max: 70.0) [2024-03-21 00:35:45,522][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 00:35:48,568][04017] Updated weights for policy 0, policy_version 9464 (0.0011) [2024-03-21 00:35:50,521][03784] Fps is (10 sec: 29491.2, 60 sec: 42598.4, 300 sec: 45542.0). Total num frames: 310149120. Throughput: 0: 45113.4. Samples: 311505500. Policy #0 lag: (min: 0.0, avg: 42.7, max: 73.0) [2024-03-21 00:35:50,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 00:35:54,739][04017] Updated weights for policy 0, policy_version 9474 (0.0017) [2024-03-21 00:35:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 46208.4). Total num frames: 310509568. Throughput: 0: 45062.2. Samples: 311780700. 
Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 00:35:55,522][03784] Avg episode reward: [(0, '0.806')] [2024-03-21 00:36:00,521][03784] Fps is (10 sec: 52428.4, 60 sec: 46421.4, 300 sec: 46208.4). Total num frames: 310673408. Throughput: 0: 45366.6. Samples: 311923200. Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 00:36:00,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 00:36:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009481_310673408.pth... [2024-03-21 00:36:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009140_299499520.pth [2024-03-21 00:36:02,295][04017] Updated weights for policy 0, policy_version 9484 (0.0011) [2024-03-21 00:36:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 310804480. Throughput: 0: 46359.1. Samples: 312215500. Policy #0 lag: (min: 1.0, avg: 54.5, max: 99.0) [2024-03-21 00:36:05,522][03784] Avg episode reward: [(0, '0.659')] [2024-03-21 00:36:10,078][04017] Updated weights for policy 0, policy_version 9494 (0.0019) [2024-03-21 00:36:10,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48059.6, 300 sec: 46541.7). Total num frames: 311132160. Throughput: 0: 45513.3. Samples: 312456400. Policy #0 lag: (min: 3.0, avg: 37.8, max: 89.0) [2024-03-21 00:36:10,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 00:36:15,521][03784] Fps is (10 sec: 58982.8, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 311394304. Throughput: 0: 45351.2. Samples: 312589100. Policy #0 lag: (min: 3.0, avg: 37.8, max: 89.0) [2024-03-21 00:36:15,522][03784] Avg episode reward: [(0, '0.887')] [2024-03-21 00:36:16,776][04017] Updated weights for policy 0, policy_version 9504 (0.0021) [2024-03-21 00:36:18,312][03995] Signal inference workers to stop experience collection... (6300 times) [2024-03-21 00:36:18,313][03995] Signal inference workers to resume experience collection... 
(6300 times) [2024-03-21 00:36:18,366][04017] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-03-21 00:36:18,367][04017] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-03-21 00:36:20,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44785.5, 300 sec: 46097.4). Total num frames: 311590912. Throughput: 0: 45255.6. Samples: 312879900. Policy #0 lag: (min: 0.0, avg: 47.8, max: 104.0) [2024-03-21 00:36:20,522][03784] Avg episode reward: [(0, '0.746')] [2024-03-21 00:36:23,607][04017] Updated weights for policy 0, policy_version 9514 (0.0019) [2024-03-21 00:36:25,521][03784] Fps is (10 sec: 49151.9, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 311885824. Throughput: 0: 46211.1. Samples: 313142900. Policy #0 lag: (min: 0.0, avg: 47.8, max: 104.0) [2024-03-21 00:36:25,522][03784] Avg episode reward: [(0, '0.529')] [2024-03-21 00:36:29,160][04017] Updated weights for policy 0, policy_version 9524 (0.0010) [2024-03-21 00:36:30,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 312147968. Throughput: 0: 45915.5. Samples: 313266300. Policy #0 lag: (min: 1.0, avg: 45.8, max: 80.0) [2024-03-21 00:36:30,522][03784] Avg episode reward: [(0, '1.119')] [2024-03-21 00:36:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48059.8, 300 sec: 46208.4). Total num frames: 312311808. Throughput: 0: 45808.9. Samples: 313566900. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 00:36:35,522][03784] Avg episode reward: [(0, '1.119')] [2024-03-21 00:36:38,179][04017] Updated weights for policy 0, policy_version 9534 (0.0019) [2024-03-21 00:36:40,521][03784] Fps is (10 sec: 26214.3, 60 sec: 42598.3, 300 sec: 46208.4). Total num frames: 312410112. Throughput: 0: 46171.1. Samples: 313858400. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 00:36:40,522][03784] Avg episode reward: [(0, '0.860')] [2024-03-21 00:36:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44782.9, 300 sec: 45875.2). 
Total num frames: 312672256. Throughput: 0: 45824.4. Samples: 313985300. Policy #0 lag: (min: 2.0, avg: 52.3, max: 104.0) [2024-03-21 00:36:45,522][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 00:36:46,561][04017] Updated weights for policy 0, policy_version 9544 (0.0015) [2024-03-21 00:36:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.2, 300 sec: 46208.4). Total num frames: 312934400. Throughput: 0: 45560.0. Samples: 314265700. Policy #0 lag: (min: 2.0, avg: 52.3, max: 104.0) [2024-03-21 00:36:50,522][03784] Avg episode reward: [(0, '0.526')] [2024-03-21 00:36:53,142][04017] Updated weights for policy 0, policy_version 9554 (0.0016) [2024-03-21 00:36:55,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 313163776. Throughput: 0: 46002.3. Samples: 314526500. Policy #0 lag: (min: 0.0, avg: 32.6, max: 66.0) [2024-03-21 00:36:55,522][03784] Avg episode reward: [(0, '1.339')] [2024-03-21 00:36:59,318][04017] Updated weights for policy 0, policy_version 9564 (0.0009) [2024-03-21 00:37:00,521][03784] Fps is (10 sec: 55705.7, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 313491456. Throughput: 0: 46437.7. Samples: 314678800. Policy #0 lag: (min: 1.0, avg: 33.2, max: 72.0) [2024-03-21 00:37:00,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 00:37:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48059.8, 300 sec: 46541.7). Total num frames: 313688064. Throughput: 0: 46122.3. Samples: 314955400. Policy #0 lag: (min: 1.0, avg: 33.2, max: 72.0) [2024-03-21 00:37:05,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 00:37:05,743][04017] Updated weights for policy 0, policy_version 9574 (0.0011) [2024-03-21 00:37:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 313950208. Throughput: 0: 46031.1. Samples: 315214300. 
Policy #0 lag: (min: 2.0, avg: 50.7, max: 110.0) [2024-03-21 00:37:10,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 00:37:13,979][04017] Updated weights for policy 0, policy_version 9584 (0.0011) [2024-03-21 00:37:14,351][03995] Signal inference workers to stop experience collection... (6350 times) [2024-03-21 00:37:14,351][03995] Signal inference workers to resume experience collection... (6350 times) [2024-03-21 00:37:14,442][04017] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-03-21 00:37:14,442][04017] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-03-21 00:37:15,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 314146816. Throughput: 0: 46453.3. Samples: 315356700. Policy #0 lag: (min: 1.0, avg: 56.5, max: 114.0) [2024-03-21 00:37:15,522][03784] Avg episode reward: [(0, '1.091')] [2024-03-21 00:37:20,523][03784] Fps is (10 sec: 36037.0, 60 sec: 45327.4, 300 sec: 46430.3). Total num frames: 314310656. Throughput: 0: 45517.7. Samples: 315615300. Policy #0 lag: (min: 1.0, avg: 56.5, max: 114.0) [2024-03-21 00:37:20,524][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 00:37:22,304][04017] Updated weights for policy 0, policy_version 9594 (0.0011) [2024-03-21 00:37:25,521][03784] Fps is (10 sec: 36045.0, 60 sec: 43690.7, 300 sec: 46319.5). Total num frames: 314507264. Throughput: 0: 45169.0. Samples: 315891000. Policy #0 lag: (min: 1.0, avg: 33.4, max: 66.0) [2024-03-21 00:37:25,522][03784] Avg episode reward: [(0, '0.432')] [2024-03-21 00:37:28,539][04017] Updated weights for policy 0, policy_version 9604 (0.0011) [2024-03-21 00:37:30,521][03784] Fps is (10 sec: 52439.5, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 314834944. Throughput: 0: 45291.0. Samples: 316023400. 
Policy #0 lag: (min: 1.0, avg: 33.4, max: 66.0) [2024-03-21 00:37:30,522][03784] Avg episode reward: [(0, '0.570')] [2024-03-21 00:37:35,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 314966016. Throughput: 0: 44942.3. Samples: 316288100. Policy #0 lag: (min: 1.0, avg: 35.2, max: 65.0) [2024-03-21 00:37:35,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 00:37:36,609][04017] Updated weights for policy 0, policy_version 9614 (0.0011) [2024-03-21 00:37:40,521][03784] Fps is (10 sec: 29491.6, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 315129856. Throughput: 0: 45204.4. Samples: 316560700. Policy #0 lag: (min: 0.0, avg: 35.5, max: 74.0) [2024-03-21 00:37:40,522][03784] Avg episode reward: [(0, '0.588')] [2024-03-21 00:37:44,240][04017] Updated weights for policy 0, policy_version 9624 (0.0011) [2024-03-21 00:37:45,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 315392000. Throughput: 0: 44782.3. Samples: 316694000. Policy #0 lag: (min: 0.0, avg: 35.5, max: 74.0) [2024-03-21 00:37:45,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 00:37:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 315621376. Throughput: 0: 44415.5. Samples: 316954100. Policy #0 lag: (min: 1.0, avg: 43.8, max: 111.0) [2024-03-21 00:37:50,522][03784] Avg episode reward: [(0, '0.854')] [2024-03-21 00:37:51,597][04017] Updated weights for policy 0, policy_version 9634 (0.0015) [2024-03-21 00:37:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 315916288. Throughput: 0: 45084.5. Samples: 317243100. Policy #0 lag: (min: 0.0, avg: 38.0, max: 79.0) [2024-03-21 00:37:55,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 00:37:56,256][04017] Updated weights for policy 0, policy_version 9644 (0.0020) [2024-03-21 00:38:00,521][03784] Fps is (10 sec: 62259.5, 60 sec: 45875.2, 300 sec: 45542.0). 
Total num frames: 316243968. Throughput: 0: 45133.4. Samples: 317387700. Policy #0 lag: (min: 0.0, avg: 38.0, max: 79.0) [2024-03-21 00:38:00,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 00:38:00,657][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009652_316276736.pth... [2024-03-21 00:38:00,775][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009313_305168384.pth [2024-03-21 00:38:02,906][04017] Updated weights for policy 0, policy_version 9654 (0.0015) [2024-03-21 00:38:05,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 316407808. Throughput: 0: 45677.7. Samples: 317670700. Policy #0 lag: (min: 0.0, avg: 62.3, max: 119.0) [2024-03-21 00:38:05,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 00:38:10,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42598.4, 300 sec: 46208.4). Total num frames: 316506112. Throughput: 0: 45835.6. Samples: 317953600. Policy #0 lag: (min: 0.0, avg: 62.3, max: 119.0) [2024-03-21 00:38:10,522][03784] Avg episode reward: [(0, '0.934')] [2024-03-21 00:38:13,862][04017] Updated weights for policy 0, policy_version 9664 (0.0010) [2024-03-21 00:38:15,224][03995] Signal inference workers to stop experience collection... (6400 times) [2024-03-21 00:38:15,331][04017] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-03-21 00:38:15,431][03995] Signal inference workers to resume experience collection... (6400 times) [2024-03-21 00:38:15,432][04017] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-03-21 00:38:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 316833792. Throughput: 0: 46142.2. Samples: 318099800. 
Policy #0 lag: (min: 0.0, avg: 40.8, max: 86.0) [2024-03-21 00:38:15,522][03784] Avg episode reward: [(0, '0.910')] [2024-03-21 00:38:18,002][04017] Updated weights for policy 0, policy_version 9674 (0.0011) [2024-03-21 00:38:20,521][03784] Fps is (10 sec: 72089.5, 60 sec: 48607.6, 300 sec: 47097.1). Total num frames: 317227008. Throughput: 0: 46391.1. Samples: 318375700. Policy #0 lag: (min: 2.0, avg: 33.5, max: 71.0) [2024-03-21 00:38:20,522][03784] Avg episode reward: [(0, '0.878')] [2024-03-21 00:38:23,409][04017] Updated weights for policy 0, policy_version 9684 (0.0012) [2024-03-21 00:38:25,521][03784] Fps is (10 sec: 58983.6, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 317423616. Throughput: 0: 46286.8. Samples: 318643600. Policy #0 lag: (min: 2.0, avg: 33.5, max: 71.0) [2024-03-21 00:38:25,522][03784] Avg episode reward: [(0, '1.007')] [2024-03-21 00:38:30,521][03784] Fps is (10 sec: 29490.9, 60 sec: 44783.0, 300 sec: 46208.4). Total num frames: 317521920. Throughput: 0: 46871.0. Samples: 318803200. Policy #0 lag: (min: 0.0, avg: 42.6, max: 86.0) [2024-03-21 00:38:30,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 00:38:31,720][04017] Updated weights for policy 0, policy_version 9694 (0.0009) [2024-03-21 00:38:35,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 317718528. Throughput: 0: 47231.2. Samples: 319079500. Policy #0 lag: (min: 0.0, avg: 42.6, max: 86.0) [2024-03-21 00:38:35,522][03784] Avg episode reward: [(0, '1.027')] [2024-03-21 00:38:39,402][04017] Updated weights for policy 0, policy_version 9704 (0.0016) [2024-03-21 00:38:40,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48605.9, 300 sec: 45430.9). Total num frames: 318046208. Throughput: 0: 46611.1. Samples: 319340600. Policy #0 lag: (min: 2.0, avg: 28.3, max: 61.0) [2024-03-21 00:38:40,531][03784] Avg episode reward: [(0, '1.042')] [2024-03-21 00:38:45,521][03784] Fps is (10 sec: 45874.6, 60 sec: 46421.3, 300 sec: 45097.7). 
Total num frames: 318177280. Throughput: 0: 46555.5. Samples: 319482700. Policy #0 lag: (min: 2.0, avg: 28.3, max: 61.0) [2024-03-21 00:38:45,522][03784] Avg episode reward: [(0, '0.941')] [2024-03-21 00:38:46,732][04017] Updated weights for policy 0, policy_version 9714 (0.0012) [2024-03-21 00:38:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 45653.1). Total num frames: 318537728. Throughput: 0: 46068.9. Samples: 319743800. Policy #0 lag: (min: 3.0, avg: 44.2, max: 82.0) [2024-03-21 00:38:50,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 00:38:54,340][04017] Updated weights for policy 0, policy_version 9724 (0.0011) [2024-03-21 00:38:55,521][03784] Fps is (10 sec: 52429.5, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 318701568. Throughput: 0: 46648.9. Samples: 320052800. Policy #0 lag: (min: 1.0, avg: 28.5, max: 64.0) [2024-03-21 00:38:55,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 00:39:00,521][03784] Fps is (10 sec: 32768.2, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 318865408. Throughput: 0: 46895.7. Samples: 320210100. Policy #0 lag: (min: 1.0, avg: 28.5, max: 64.0) [2024-03-21 00:39:00,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 00:39:02,347][04017] Updated weights for policy 0, policy_version 9734 (0.0010) [2024-03-21 00:39:05,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 319193088. Throughput: 0: 47315.6. Samples: 320504900. Policy #0 lag: (min: 2.0, avg: 34.2, max: 87.0) [2024-03-21 00:39:05,522][03784] Avg episode reward: [(0, '0.813')] [2024-03-21 00:39:07,253][03995] Signal inference workers to stop experience collection... (6450 times) [2024-03-21 00:39:07,297][04017] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-03-21 00:39:07,518][03995] Signal inference workers to resume experience collection... 
(6450 times) [2024-03-21 00:39:07,519][04017] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-03-21 00:39:07,520][04017] Updated weights for policy 0, policy_version 9744 (0.0011) [2024-03-21 00:39:10,521][03784] Fps is (10 sec: 72088.5, 60 sec: 51336.4, 300 sec: 46874.9). Total num frames: 319586304. Throughput: 0: 46933.1. Samples: 320755600. Policy #0 lag: (min: 2.0, avg: 34.2, max: 87.0) [2024-03-21 00:39:10,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 00:39:10,702][04017] Updated weights for policy 0, policy_version 9754 (0.0017) [2024-03-21 00:39:15,521][03784] Fps is (10 sec: 62259.2, 60 sec: 49698.2, 300 sec: 47208.1). Total num frames: 319815680. Throughput: 0: 46800.1. Samples: 320909200. Policy #0 lag: (min: 6.0, avg: 58.2, max: 107.0) [2024-03-21 00:39:15,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 00:39:17,667][04017] Updated weights for policy 0, policy_version 9764 (0.0018) [2024-03-21 00:39:20,521][03784] Fps is (10 sec: 36045.4, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 319946752. Throughput: 0: 47268.9. Samples: 321206600. Policy #0 lag: (min: 6.0, avg: 58.2, max: 107.0) [2024-03-21 00:39:20,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 00:39:25,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 320143360. Throughput: 0: 48048.9. Samples: 321502800. Policy #0 lag: (min: 0.0, avg: 35.6, max: 82.0) [2024-03-21 00:39:25,522][03784] Avg episode reward: [(0, '0.982')] [2024-03-21 00:39:29,036][04017] Updated weights for policy 0, policy_version 9774 (0.0015) [2024-03-21 00:39:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.6, 300 sec: 45764.1). Total num frames: 320339968. Throughput: 0: 48029.0. Samples: 321644000. Policy #0 lag: (min: 0.0, avg: 41.8, max: 92.0) [2024-03-21 00:39:30,522][03784] Avg episode reward: [(0, '0.649')] [2024-03-21 00:39:35,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46967.4, 300 sec: 45986.3). 
Total num frames: 320536576. Throughput: 0: 48580.1. Samples: 321929900. Policy #0 lag: (min: 0.0, avg: 41.8, max: 92.0) [2024-03-21 00:39:35,522][03784] Avg episode reward: [(0, '1.199')] [2024-03-21 00:39:36,972][04017] Updated weights for policy 0, policy_version 9784 (0.0016) [2024-03-21 00:39:40,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 320765952. Throughput: 0: 47682.1. Samples: 322198500. Policy #0 lag: (min: 1.0, avg: 31.8, max: 63.0) [2024-03-21 00:39:40,522][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 00:39:42,116][04017] Updated weights for policy 0, policy_version 9794 (0.0021) [2024-03-21 00:39:45,521][03784] Fps is (10 sec: 62258.6, 60 sec: 49698.1, 300 sec: 45986.3). Total num frames: 321159168. Throughput: 0: 47106.6. Samples: 322329900. Policy #0 lag: (min: 2.0, avg: 38.7, max: 77.0) [2024-03-21 00:39:45,522][03784] Avg episode reward: [(0, '0.923')] [2024-03-21 00:39:46,515][04017] Updated weights for policy 0, policy_version 9804 (0.0011) [2024-03-21 00:39:50,521][03784] Fps is (10 sec: 68813.4, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 321454080. Throughput: 0: 46753.3. Samples: 322608800. Policy #0 lag: (min: 2.0, avg: 38.7, max: 77.0) [2024-03-21 00:39:50,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 00:39:52,723][04017] Updated weights for policy 0, policy_version 9814 (0.0021) [2024-03-21 00:39:55,521][03784] Fps is (10 sec: 52429.3, 60 sec: 49698.1, 300 sec: 46763.8). Total num frames: 321683456. Throughput: 0: 47764.6. Samples: 322905000. Policy #0 lag: (min: 2.0, avg: 41.3, max: 83.0) [2024-03-21 00:39:55,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 00:39:58,127][03995] Signal inference workers to stop experience collection... (6500 times) [2024-03-21 00:39:58,128][03995] Signal inference workers to resume experience collection... 
(6500 times) [2024-03-21 00:39:58,193][04017] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-03-21 00:39:58,193][04017] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-03-21 00:40:00,521][03784] Fps is (10 sec: 42598.2, 60 sec: 50244.2, 300 sec: 46541.7). Total num frames: 321880064. Throughput: 0: 47571.1. Samples: 323049900. Policy #0 lag: (min: 2.0, avg: 41.3, max: 83.0) [2024-03-21 00:40:00,522][03784] Avg episode reward: [(0, '0.564')] [2024-03-21 00:40:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009823_321880064.pth... [2024-03-21 00:40:00,653][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009481_310673408.pth [2024-03-21 00:40:01,978][04017] Updated weights for policy 0, policy_version 9824 (0.0015) [2024-03-21 00:40:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 322076672. Throughput: 0: 47231.1. Samples: 323332000. Policy #0 lag: (min: 0.0, avg: 42.6, max: 97.0) [2024-03-21 00:40:05,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 00:40:09,732][04017] Updated weights for policy 0, policy_version 9834 (0.0017) [2024-03-21 00:40:10,521][03784] Fps is (10 sec: 39321.2, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 322273280. Throughput: 0: 46522.1. Samples: 323596300. Policy #0 lag: (min: 0.0, avg: 39.9, max: 95.0) [2024-03-21 00:40:10,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 00:40:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 46097.9). Total num frames: 322502656. Throughput: 0: 46533.3. Samples: 323738000. Policy #0 lag: (min: 0.0, avg: 39.9, max: 95.0) [2024-03-21 00:40:15,522][03784] Avg episode reward: [(0, '0.546')] [2024-03-21 00:40:20,521][03784] Fps is (10 sec: 29491.6, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 322568192. Throughput: 0: 46446.7. Samples: 324020000. 
Policy #0 lag: (min: 0.0, avg: 28.2, max: 71.0) [2024-03-21 00:40:20,522][03784] Avg episode reward: [(0, '0.737')] [2024-03-21 00:40:20,555][04017] Updated weights for policy 0, policy_version 9844 (0.0015) [2024-03-21 00:40:25,521][03784] Fps is (10 sec: 13107.0, 60 sec: 41506.0, 300 sec: 45208.7). Total num frames: 322633728. Throughput: 0: 46726.5. Samples: 324301200. Policy #0 lag: (min: 0.0, avg: 28.2, max: 71.0) [2024-03-21 00:40:25,522][03784] Avg episode reward: [(0, '0.491')] [2024-03-21 00:40:28,393][04017] Updated weights for policy 0, policy_version 9854 (0.0015) [2024-03-21 00:40:30,521][03784] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 323092480. Throughput: 0: 46004.6. Samples: 324400100. Policy #0 lag: (min: 2.0, avg: 39.6, max: 102.0) [2024-03-21 00:40:30,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 00:40:31,479][04017] Updated weights for policy 0, policy_version 9864 (0.0017) [2024-03-21 00:40:35,521][03784] Fps is (10 sec: 72090.6, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 323354624. Throughput: 0: 45304.4. Samples: 324647500. Policy #0 lag: (min: 2.0, avg: 39.6, max: 102.0) [2024-03-21 00:40:35,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 00:40:38,629][04017] Updated weights for policy 0, policy_version 9874 (0.0014) [2024-03-21 00:40:40,521][03784] Fps is (10 sec: 58982.2, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 323682304. Throughput: 0: 44817.7. Samples: 324921800. Policy #0 lag: (min: 0.0, avg: 38.1, max: 73.0) [2024-03-21 00:40:40,522][03784] Avg episode reward: [(0, '0.813')] [2024-03-21 00:40:43,362][04017] Updated weights for policy 0, policy_version 9884 (0.0013) [2024-03-21 00:40:45,521][03784] Fps is (10 sec: 52429.4, 60 sec: 45329.2, 300 sec: 46541.7). Total num frames: 323878912. Throughput: 0: 44673.4. Samples: 325060200. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 73.0) [2024-03-21 00:40:45,522][03784] Avg episode reward: [(0, '0.717')] [2024-03-21 00:40:50,521][03784] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 45986.3). Total num frames: 324075520. Throughput: 0: 44684.4. Samples: 325342800. Policy #0 lag: (min: 0.0, avg: 42.7, max: 72.0) [2024-03-21 00:40:50,522][03784] Avg episode reward: [(0, '1.057')] [2024-03-21 00:40:51,691][04017] Updated weights for policy 0, policy_version 9894 (0.0020) [2024-03-21 00:40:53,860][03995] Signal inference workers to stop experience collection... (6550 times) [2024-03-21 00:40:53,861][03995] Signal inference workers to resume experience collection... (6550 times) [2024-03-21 00:40:53,964][04017] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-03-21 00:40:53,964][04017] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-03-21 00:40:55,521][03784] Fps is (10 sec: 52428.6, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 324403200. Throughput: 0: 44033.5. Samples: 325577800. Policy #0 lag: (min: 0.0, avg: 42.7, max: 72.0) [2024-03-21 00:40:55,522][03784] Avg episode reward: [(0, '1.036')] [2024-03-21 00:40:58,875][04017] Updated weights for policy 0, policy_version 9904 (0.0012) [2024-03-21 00:41:00,521][03784] Fps is (10 sec: 52429.4, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 324599808. Throughput: 0: 43675.6. Samples: 325703400. Policy #0 lag: (min: 0.0, avg: 46.3, max: 80.0) [2024-03-21 00:41:00,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 00:41:05,521][03784] Fps is (10 sec: 22937.6, 60 sec: 42598.4, 300 sec: 45764.1). Total num frames: 324632576. Throughput: 0: 44073.3. Samples: 326003300. 
Policy #0 lag: (min: 0.0, avg: 46.3, max: 80.0) [2024-03-21 00:41:05,522][03784] Avg episode reward: [(0, '0.823')] [2024-03-21 00:41:09,032][04017] Updated weights for policy 0, policy_version 9914 (0.0011) [2024-03-21 00:41:10,521][03784] Fps is (10 sec: 26214.1, 60 sec: 43144.6, 300 sec: 45653.0). Total num frames: 324861952. Throughput: 0: 43929.0. Samples: 326278000. Policy #0 lag: (min: 0.0, avg: 36.6, max: 70.0) [2024-03-21 00:41:10,522][03784] Avg episode reward: [(0, '0.597')] [2024-03-21 00:41:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 41506.2, 300 sec: 45430.9). Total num frames: 324993024. Throughput: 0: 44711.1. Samples: 326412100. Policy #0 lag: (min: 0.0, avg: 27.2, max: 71.0) [2024-03-21 00:41:15,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 00:41:19,632][04017] Updated weights for policy 0, policy_version 9924 (0.0013) [2024-03-21 00:41:20,521][03784] Fps is (10 sec: 39322.0, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 325255168. Throughput: 0: 45182.3. Samples: 326680700. Policy #0 lag: (min: 0.0, avg: 27.2, max: 71.0) [2024-03-21 00:41:20,522][03784] Avg episode reward: [(0, '0.777')] [2024-03-21 00:41:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46967.6, 300 sec: 45097.7). Total num frames: 325451776. Throughput: 0: 45100.0. Samples: 326951300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 77.0) [2024-03-21 00:41:25,522][03784] Avg episode reward: [(0, '0.921')] [2024-03-21 00:41:26,422][04017] Updated weights for policy 0, policy_version 9934 (0.0011) [2024-03-21 00:41:29,997][04017] Updated weights for policy 0, policy_version 9944 (0.0015) [2024-03-21 00:41:30,521][03784] Fps is (10 sec: 62258.9, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 325877760. Throughput: 0: 45124.4. Samples: 327090800. Policy #0 lag: (min: 0.0, avg: 35.9, max: 77.0) [2024-03-21 00:41:30,522][03784] Avg episode reward: [(0, '0.777')] [2024-03-21 00:41:35,521][03784] Fps is (10 sec: 68812.2, 60 sec: 46421.3, 300 sec: 46541.7). 
Total num frames: 326139904. Throughput: 0: 44979.9. Samples: 327366900. Policy #0 lag: (min: 1.0, avg: 34.9, max: 67.0) [2024-03-21 00:41:35,522][03784] Avg episode reward: [(0, '1.013')] [2024-03-21 00:41:35,728][04017] Updated weights for policy 0, policy_version 9954 (0.0012) [2024-03-21 00:41:40,521][03784] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 46208.4). Total num frames: 326303744. Throughput: 0: 45868.8. Samples: 327641900. Policy #0 lag: (min: 0.0, avg: 39.5, max: 114.0) [2024-03-21 00:41:40,522][03784] Avg episode reward: [(0, '1.354')] [2024-03-21 00:41:43,172][04017] Updated weights for policy 0, policy_version 9964 (0.0016) [2024-03-21 00:41:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45328.9, 300 sec: 46319.5). Total num frames: 326598656. Throughput: 0: 45768.7. Samples: 327763000. Policy #0 lag: (min: 0.0, avg: 39.5, max: 114.0) [2024-03-21 00:41:45,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 00:41:47,071][03995] Signal inference workers to stop experience collection... (6600 times) [2024-03-21 00:41:47,072][03995] Signal inference workers to resume experience collection... (6600 times) [2024-03-21 00:41:47,244][04017] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-03-21 00:41:47,331][04017] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-03-21 00:41:50,521][03784] Fps is (10 sec: 49152.1, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 326795264. Throughput: 0: 44988.8. Samples: 328027800. Policy #0 lag: (min: 0.0, avg: 40.6, max: 81.0) [2024-03-21 00:41:50,522][03784] Avg episode reward: [(0, '0.704')] [2024-03-21 00:41:50,737][04017] Updated weights for policy 0, policy_version 9974 (0.0011) [2024-03-21 00:41:55,521][03784] Fps is (10 sec: 29491.8, 60 sec: 41506.1, 300 sec: 45430.9). Total num frames: 326893568. Throughput: 0: 44942.3. Samples: 328300400. 
Policy #0 lag: (min: 0.0, avg: 40.6, max: 81.0) [2024-03-21 00:41:55,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 00:41:59,786][04017] Updated weights for policy 0, policy_version 9984 (0.0011) [2024-03-21 00:42:00,521][03784] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 45653.0). Total num frames: 327155712. Throughput: 0: 45286.6. Samples: 328450000. Policy #0 lag: (min: 0.0, avg: 40.5, max: 108.0) [2024-03-21 00:42:00,522][03784] Avg episode reward: [(0, '1.261')] [2024-03-21 00:42:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009984_327155712.pth... [2024-03-21 00:42:00,695][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009652_316276736.pth [2024-03-21 00:42:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 327319552. Throughput: 0: 45486.7. Samples: 328727600. Policy #0 lag: (min: 0.0, avg: 40.5, max: 108.0) [2024-03-21 00:42:05,522][03784] Avg episode reward: [(0, '1.106')] [2024-03-21 00:42:08,205][04017] Updated weights for policy 0, policy_version 9994 (0.0019) [2024-03-21 00:42:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 45653.1). Total num frames: 327614464. Throughput: 0: 45526.8. Samples: 329000000. Policy #0 lag: (min: 0.0, avg: 29.2, max: 68.0) [2024-03-21 00:42:10,522][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 00:42:14,117][04017] Updated weights for policy 0, policy_version 10004 (0.0011) [2024-03-21 00:42:15,521][03784] Fps is (10 sec: 55705.1, 60 sec: 48059.7, 300 sec: 45986.6). Total num frames: 327876608. Throughput: 0: 45366.6. Samples: 329132300. Policy #0 lag: (min: 1.0, avg: 39.4, max: 77.0) [2024-03-21 00:42:15,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 00:42:19,078][04017] Updated weights for policy 0, policy_version 10014 (0.0033) [2024-03-21 00:42:20,521][03784] Fps is (10 sec: 65536.0, 60 sec: 50244.3, 300 sec: 46652.8). 
Total num frames: 328269824. Throughput: 0: 45489.1. Samples: 329413900. Policy #0 lag: (min: 1.0, avg: 39.4, max: 77.0) [2024-03-21 00:42:20,522][03784] Avg episode reward: [(0, '0.847')] [2024-03-21 00:42:25,521][03784] Fps is (10 sec: 49152.4, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 328368128. Throughput: 0: 45937.9. Samples: 329709100. Policy #0 lag: (min: 0.0, avg: 39.8, max: 83.0) [2024-03-21 00:42:25,522][03784] Avg episode reward: [(0, '0.714')] [2024-03-21 00:42:26,525][04017] Updated weights for policy 0, policy_version 10024 (0.0013) [2024-03-21 00:42:30,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 328597504. Throughput: 0: 46397.9. Samples: 329850900. Policy #0 lag: (min: 0.0, avg: 39.8, max: 83.0) [2024-03-21 00:42:30,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 00:42:33,629][04017] Updated weights for policy 0, policy_version 10034 (0.0010) [2024-03-21 00:42:35,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 46319.5). Total num frames: 328794112. Throughput: 0: 46953.4. Samples: 330140700. Policy #0 lag: (min: 1.0, avg: 47.9, max: 98.0) [2024-03-21 00:42:35,522][03784] Avg episode reward: [(0, '1.143')] [2024-03-21 00:42:39,530][03995] Signal inference workers to stop experience collection... (6650 times) [2024-03-21 00:42:39,602][04017] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-03-21 00:42:39,758][03995] Signal inference workers to resume experience collection... (6650 times) [2024-03-21 00:42:39,758][04017] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-03-21 00:42:39,760][04017] Updated weights for policy 0, policy_version 10044 (0.0017) [2024-03-21 00:42:40,521][03784] Fps is (10 sec: 58982.7, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 329187328. Throughput: 0: 47286.6. Samples: 330428300. 
Policy #0 lag: (min: 1.0, avg: 47.9, max: 98.0) [2024-03-21 00:42:40,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 00:42:45,521][03784] Fps is (10 sec: 52428.8, 60 sec: 45329.2, 300 sec: 46430.6). Total num frames: 329318400. Throughput: 0: 47111.1. Samples: 330570000. Policy #0 lag: (min: 0.0, avg: 45.3, max: 87.0) [2024-03-21 00:42:45,522][03784] Avg episode reward: [(0, '1.101')] [2024-03-21 00:42:46,880][04017] Updated weights for policy 0, policy_version 10054 (0.0011) [2024-03-21 00:42:50,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 329613312. Throughput: 0: 47253.3. Samples: 330854000. Policy #0 lag: (min: 2.0, avg: 46.5, max: 93.0) [2024-03-21 00:42:50,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 00:42:54,739][04017] Updated weights for policy 0, policy_version 10064 (0.0018) [2024-03-21 00:42:55,521][03784] Fps is (10 sec: 49151.1, 60 sec: 48605.7, 300 sec: 45986.2). Total num frames: 329809920. Throughput: 0: 47495.3. Samples: 331137300. Policy #0 lag: (min: 2.0, avg: 46.5, max: 93.0) [2024-03-21 00:42:55,522][03784] Avg episode reward: [(0, '0.530')] [2024-03-21 00:43:00,522][03784] Fps is (10 sec: 36043.2, 60 sec: 46967.2, 300 sec: 45986.2). Total num frames: 329973760. Throughput: 0: 47739.6. Samples: 331280600. Policy #0 lag: (min: 0.0, avg: 40.0, max: 88.0) [2024-03-21 00:43:00,522][03784] Avg episode reward: [(0, '0.556')] [2024-03-21 00:43:02,470][04017] Updated weights for policy 0, policy_version 10074 (0.0012) [2024-03-21 00:43:05,521][03784] Fps is (10 sec: 52429.9, 60 sec: 50244.3, 300 sec: 46874.9). Total num frames: 330334208. Throughput: 0: 47317.8. Samples: 331543200. Policy #0 lag: (min: 0.0, avg: 40.0, max: 88.0) [2024-03-21 00:43:05,522][03784] Avg episode reward: [(0, '0.573')] [2024-03-21 00:43:06,513][04017] Updated weights for policy 0, policy_version 10084 (0.0012) [2024-03-21 00:43:10,521][03784] Fps is (10 sec: 62261.5, 60 sec: 49698.1, 300 sec: 46652.8). 
Total num frames: 330596352. Throughput: 0: 46871.1. Samples: 331818300. Policy #0 lag: (min: 0.0, avg: 44.2, max: 93.0) [2024-03-21 00:43:10,522][03784] Avg episode reward: [(0, '0.430')] [2024-03-21 00:43:14,725][04017] Updated weights for policy 0, policy_version 10094 (0.0010) [2024-03-21 00:43:15,521][03784] Fps is (10 sec: 45874.6, 60 sec: 48605.8, 300 sec: 45986.3). Total num frames: 330792960. Throughput: 0: 46960.0. Samples: 331964100. Policy #0 lag: (min: 0.0, avg: 44.2, max: 93.0) [2024-03-21 00:43:15,522][03784] Avg episode reward: [(0, '1.242')] [2024-03-21 00:43:20,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46421.2, 300 sec: 46208.4). Total num frames: 331055104. Throughput: 0: 46617.7. Samples: 332238500. Policy #0 lag: (min: 2.0, avg: 32.1, max: 65.0) [2024-03-21 00:43:20,522][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 00:43:20,754][04017] Updated weights for policy 0, policy_version 10104 (0.0010) [2024-03-21 00:43:25,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48605.8, 300 sec: 46652.8). Total num frames: 331284480. Throughput: 0: 46697.7. Samples: 332529700. Policy #0 lag: (min: 1.0, avg: 58.2, max: 114.0) [2024-03-21 00:43:25,522][03784] Avg episode reward: [(0, '1.061')] [2024-03-21 00:43:28,507][04017] Updated weights for policy 0, policy_version 10114 (0.0015) [2024-03-21 00:43:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 331448320. Throughput: 0: 46657.7. Samples: 332669600. Policy #0 lag: (min: 1.0, avg: 58.2, max: 114.0) [2024-03-21 00:43:30,522][03784] Avg episode reward: [(0, '0.572')] [2024-03-21 00:43:35,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 331546624. Throughput: 0: 47120.0. Samples: 332974400. 
Policy #0 lag: (min: 1.0, avg: 24.3, max: 54.0) [2024-03-21 00:43:35,522][03784] Avg episode reward: [(0, '0.572')] [2024-03-21 00:43:38,201][04017] Updated weights for policy 0, policy_version 10124 (0.0015) [2024-03-21 00:43:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 46541.7). Total num frames: 331907072. Throughput: 0: 46689.0. Samples: 333238300. Policy #0 lag: (min: 1.0, avg: 24.3, max: 54.0) [2024-03-21 00:43:40,522][03784] Avg episode reward: [(0, '0.896')] [2024-03-21 00:43:42,527][03995] Signal inference workers to stop experience collection... (6700 times) [2024-03-21 00:43:42,624][04017] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-03-21 00:43:42,705][03995] Signal inference workers to resume experience collection... (6700 times) [2024-03-21 00:43:42,706][04017] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-03-21 00:43:43,391][04017] Updated weights for policy 0, policy_version 10134 (0.0019) [2024-03-21 00:43:45,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 332070912. Throughput: 0: 46378.2. Samples: 333367600. Policy #0 lag: (min: 0.0, avg: 45.2, max: 94.0) [2024-03-21 00:43:45,522][03784] Avg episode reward: [(0, '0.759')] [2024-03-21 00:43:50,521][03784] Fps is (10 sec: 22937.5, 60 sec: 42052.2, 300 sec: 45541.9). Total num frames: 332136448. Throughput: 0: 47635.4. Samples: 333686800. Policy #0 lag: (min: 0.0, avg: 45.2, max: 94.0) [2024-03-21 00:43:50,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 00:43:53,160][04017] Updated weights for policy 0, policy_version 10144 (0.0015) [2024-03-21 00:43:55,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.6, 300 sec: 46652.7). Total num frames: 332627968. Throughput: 0: 46940.0. Samples: 333930600. 
Policy #0 lag: (min: 1.0, avg: 38.7, max: 87.0) [2024-03-21 00:43:55,522][03784] Avg episode reward: [(0, '0.444')] [2024-03-21 00:43:56,348][04017] Updated weights for policy 0, policy_version 10154 (0.0012) [2024-03-21 00:44:00,521][03784] Fps is (10 sec: 78644.5, 60 sec: 49152.4, 300 sec: 46541.7). Total num frames: 332922880. Throughput: 0: 46706.8. Samples: 334065900. Policy #0 lag: (min: 1.0, avg: 39.5, max: 73.0) [2024-03-21 00:44:00,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 00:44:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010160_332922880.pth... [2024-03-21 00:44:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009823_321880064.pth [2024-03-21 00:44:01,996][04017] Updated weights for policy 0, policy_version 10164 (0.0019) [2024-03-21 00:44:05,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48059.7, 300 sec: 46208.4). Total num frames: 333217792. Throughput: 0: 46906.7. Samples: 334349300. Policy #0 lag: (min: 1.0, avg: 39.5, max: 73.0) [2024-03-21 00:44:05,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 00:44:09,457][04017] Updated weights for policy 0, policy_version 10174 (0.0016) [2024-03-21 00:44:10,521][03784] Fps is (10 sec: 52428.2, 60 sec: 47513.6, 300 sec: 46208.4). Total num frames: 333447168. Throughput: 0: 46686.6. Samples: 334630600. Policy #0 lag: (min: 0.0, avg: 36.0, max: 75.0) [2024-03-21 00:44:10,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 00:44:15,521][03784] Fps is (10 sec: 39322.0, 60 sec: 46967.6, 300 sec: 46319.5). Total num frames: 333611008. Throughput: 0: 46815.6. Samples: 334776300. Policy #0 lag: (min: 0.0, avg: 36.0, max: 75.0) [2024-03-21 00:44:15,522][03784] Avg episode reward: [(0, '0.728')] [2024-03-21 00:44:19,147][04017] Updated weights for policy 0, policy_version 10184 (0.0011) [2024-03-21 00:44:20,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44782.9, 300 sec: 46097.3). 
Total num frames: 333742080. Throughput: 0: 46362.1. Samples: 335060700. Policy #0 lag: (min: 0.0, avg: 51.7, max: 113.0) [2024-03-21 00:44:20,522][03784] Avg episode reward: [(0, '0.888')] [2024-03-21 00:44:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43690.8, 300 sec: 45986.3). Total num frames: 333905920. Throughput: 0: 46844.6. Samples: 335346300. Policy #0 lag: (min: 0.0, avg: 42.4, max: 88.0) [2024-03-21 00:44:25,521][03784] Avg episode reward: [(0, '0.888')] [2024-03-21 00:44:27,403][04017] Updated weights for policy 0, policy_version 10194 (0.0015) [2024-03-21 00:44:30,521][03784] Fps is (10 sec: 55706.0, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 334299136. Throughput: 0: 46771.1. Samples: 335472300. Policy #0 lag: (min: 0.0, avg: 42.4, max: 88.0) [2024-03-21 00:44:30,522][03784] Avg episode reward: [(0, '0.838')] [2024-03-21 00:44:32,140][04017] Updated weights for policy 0, policy_version 10204 (0.0015) [2024-03-21 00:44:35,527][03784] Fps is (10 sec: 45849.4, 60 sec: 46963.1, 300 sec: 46096.5). Total num frames: 334364672. Throughput: 0: 45519.0. Samples: 335735400. Policy #0 lag: (min: 0.0, avg: 42.4, max: 88.0) [2024-03-21 00:44:35,527][03784] Avg episode reward: [(0, '0.882')] [2024-03-21 00:44:39,104][03995] Signal inference workers to stop experience collection... (6750 times) [2024-03-21 00:44:39,183][04017] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-03-21 00:44:39,336][03995] Signal inference workers to resume experience collection... (6750 times) [2024-03-21 00:44:39,336][04017] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-03-21 00:44:40,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 334626816. Throughput: 0: 46148.8. Samples: 336007300. 
Policy #0 lag: (min: 0.0, avg: 36.0, max: 82.0) [2024-03-21 00:44:40,522][03784] Avg episode reward: [(0, '0.480')] [2024-03-21 00:44:42,532][04017] Updated weights for policy 0, policy_version 10214 (0.0014) [2024-03-21 00:44:45,521][03784] Fps is (10 sec: 55736.1, 60 sec: 47513.5, 300 sec: 45653.0). Total num frames: 334921728. Throughput: 0: 46126.5. Samples: 336141600. Policy #0 lag: (min: 3.0, avg: 34.0, max: 81.0) [2024-03-21 00:44:45,522][03784] Avg episode reward: [(0, '0.956')] [2024-03-21 00:44:47,275][04017] Updated weights for policy 0, policy_version 10224 (0.0015) [2024-03-21 00:44:50,521][03784] Fps is (10 sec: 55706.1, 60 sec: 50790.5, 300 sec: 45764.1). Total num frames: 335183872. Throughput: 0: 45513.4. Samples: 336397400. Policy #0 lag: (min: 3.0, avg: 34.0, max: 81.0) [2024-03-21 00:44:50,522][03784] Avg episode reward: [(0, '0.808')] [2024-03-21 00:44:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 335282176. Throughput: 0: 45628.9. Samples: 336683900. Policy #0 lag: (min: 3.0, avg: 34.0, max: 81.0) [2024-03-21 00:44:55,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-21 00:44:56,097][04017] Updated weights for policy 0, policy_version 10234 (0.0011) [2024-03-21 00:45:00,521][03784] Fps is (10 sec: 39321.0, 60 sec: 44236.6, 300 sec: 45764.1). Total num frames: 335577088. Throughput: 0: 45422.0. Samples: 336820300. Policy #0 lag: (min: 0.0, avg: 37.3, max: 93.0) [2024-03-21 00:45:00,523][03784] Avg episode reward: [(0, '0.925')] [2024-03-21 00:45:01,476][04017] Updated weights for policy 0, policy_version 10244 (0.0011) [2024-03-21 00:45:05,521][03784] Fps is (10 sec: 52429.2, 60 sec: 43144.6, 300 sec: 45875.2). Total num frames: 335806464. Throughput: 0: 44846.7. Samples: 337078800. Policy #0 lag: (min: 0.0, avg: 37.3, max: 93.0) [2024-03-21 00:45:05,522][03784] Avg episode reward: [(0, '0.861')] [2024-03-21 00:45:10,521][03784] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 45653.0). 
Total num frames: 335970304. Throughput: 0: 44668.7. Samples: 337356400. Policy #0 lag: (min: 0.0, avg: 37.6, max: 72.0) [2024-03-21 00:45:10,522][03784] Avg episode reward: [(0, '1.140')] [2024-03-21 00:45:10,945][04017] Updated weights for policy 0, policy_version 10254 (0.0011) [2024-03-21 00:45:15,521][03784] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 46208.4). Total num frames: 336199680. Throughput: 0: 45004.5. Samples: 337497500. Policy #0 lag: (min: 0.0, avg: 37.6, max: 72.0) [2024-03-21 00:45:15,522][03784] Avg episode reward: [(0, '1.074')] [2024-03-21 00:45:16,652][04017] Updated weights for policy 0, policy_version 10264 (0.0012) [2024-03-21 00:45:20,521][03784] Fps is (10 sec: 55706.2, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 336527360. Throughput: 0: 44967.8. Samples: 337758700. Policy #0 lag: (min: 3.0, avg: 52.9, max: 118.0) [2024-03-21 00:45:20,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-21 00:45:24,120][04017] Updated weights for policy 0, policy_version 10274 (0.0019) [2024-03-21 00:45:25,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 336723968. Throughput: 0: 44986.8. Samples: 338031700. Policy #0 lag: (min: 3.0, avg: 52.9, max: 118.0) [2024-03-21 00:45:25,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 00:45:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 336920576. Throughput: 0: 45131.2. Samples: 338172500. Policy #0 lag: (min: 0.0, avg: 29.5, max: 69.0) [2024-03-21 00:45:30,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 00:45:32,321][04017] Updated weights for policy 0, policy_version 10284 (0.0012) [2024-03-21 00:45:33,807][03995] Signal inference workers to stop experience collection... (6800 times) [2024-03-21 00:45:33,881][03995] Signal inference workers to resume experience collection... 
(6800 times) [2024-03-21 00:45:33,893][04017] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-03-21 00:45:33,936][04017] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-03-21 00:45:35,521][03784] Fps is (10 sec: 49151.3, 60 sec: 47517.9, 300 sec: 45875.2). Total num frames: 337215488. Throughput: 0: 45437.7. Samples: 338442100. Policy #0 lag: (min: 0.0, avg: 43.5, max: 109.0) [2024-03-21 00:45:35,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 00:45:38,104][04017] Updated weights for policy 0, policy_version 10294 (0.0016) [2024-03-21 00:45:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46967.6, 300 sec: 45986.3). Total num frames: 337444864. Throughput: 0: 45340.1. Samples: 338724200. Policy #0 lag: (min: 0.0, avg: 43.5, max: 109.0) [2024-03-21 00:45:40,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 00:45:44,949][04017] Updated weights for policy 0, policy_version 10304 (0.0011) [2024-03-21 00:45:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 337674240. Throughput: 0: 45469.0. Samples: 338866400. Policy #0 lag: (min: 1.0, avg: 60.7, max: 117.0) [2024-03-21 00:45:45,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 00:45:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 45653.0). Total num frames: 337870848. Throughput: 0: 45984.5. Samples: 339148100. Policy #0 lag: (min: 1.0, avg: 60.7, max: 117.0) [2024-03-21 00:45:50,522][03784] Avg episode reward: [(0, '0.885')] [2024-03-21 00:45:51,686][04017] Updated weights for policy 0, policy_version 10314 (0.0017) [2024-03-21 00:45:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 338100224. Throughput: 0: 45424.5. Samples: 339400500. 
Policy #0 lag: (min: 1.0, avg: 37.4, max: 73.0) [2024-03-21 00:45:55,522][03784] Avg episode reward: [(0, '1.170')] [2024-03-21 00:45:59,563][04017] Updated weights for policy 0, policy_version 10324 (0.0013) [2024-03-21 00:46:00,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.3, 300 sec: 46430.6). Total num frames: 338329600. Throughput: 0: 45277.7. Samples: 339535000. Policy #0 lag: (min: 1.0, avg: 37.4, max: 73.0) [2024-03-21 00:46:00,522][03784] Avg episode reward: [(0, '1.126')] [2024-03-21 00:46:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010325_338329600.pth... [2024-03-21 00:46:00,693][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000009984_327155712.pth [2024-03-21 00:46:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 338526208. Throughput: 0: 45726.6. Samples: 339816400. Policy #0 lag: (min: 0.0, avg: 44.0, max: 91.0) [2024-03-21 00:46:05,522][03784] Avg episode reward: [(0, '0.563')] [2024-03-21 00:46:06,459][04017] Updated weights for policy 0, policy_version 10334 (0.0011) [2024-03-21 00:46:10,521][03784] Fps is (10 sec: 49151.7, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 338821120. Throughput: 0: 45904.3. Samples: 340097400. Policy #0 lag: (min: 0.0, avg: 44.0, max: 91.0) [2024-03-21 00:46:10,522][03784] Avg episode reward: [(0, '0.563')] [2024-03-21 00:46:15,521][03784] Fps is (10 sec: 36045.1, 60 sec: 44782.9, 300 sec: 46208.4). Total num frames: 338886656. Throughput: 0: 46166.7. Samples: 340250000. Policy #0 lag: (min: 0.0, avg: 41.9, max: 108.0) [2024-03-21 00:46:15,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 00:46:17,162][04017] Updated weights for policy 0, policy_version 10344 (0.0020) [2024-03-21 00:46:20,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 339247104. Throughput: 0: 46346.6. Samples: 340527700. 
Policy #0 lag: (min: 0.0, avg: 41.9, max: 108.0) [2024-03-21 00:46:20,522][03784] Avg episode reward: [(0, '1.353')] [2024-03-21 00:46:20,693][04017] Updated weights for policy 0, policy_version 10354 (0.0015) [2024-03-21 00:46:25,521][03784] Fps is (10 sec: 65536.2, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 339542016. Throughput: 0: 45860.1. Samples: 340787900. Policy #0 lag: (min: 2.0, avg: 37.0, max: 75.0) [2024-03-21 00:46:25,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 00:46:26,011][04017] Updated weights for policy 0, policy_version 10364 (0.0016) [2024-03-21 00:46:30,521][03784] Fps is (10 sec: 49152.6, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 339738624. Throughput: 0: 45700.0. Samples: 340922900. Policy #0 lag: (min: 0.0, avg: 38.0, max: 114.0) [2024-03-21 00:46:30,522][03784] Avg episode reward: [(0, '0.581')] [2024-03-21 00:46:33,662][03995] Signal inference workers to stop experience collection... (6850 times) [2024-03-21 00:46:33,663][03995] Signal inference workers to resume experience collection... (6850 times) [2024-03-21 00:46:33,723][04017] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-03-21 00:46:33,724][04017] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-03-21 00:46:34,683][04017] Updated weights for policy 0, policy_version 10374 (0.0011) [2024-03-21 00:46:35,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.3, 300 sec: 46319.5). Total num frames: 339968000. Throughput: 0: 46002.2. Samples: 341218200. Policy #0 lag: (min: 0.0, avg: 38.0, max: 114.0) [2024-03-21 00:46:35,522][03784] Avg episode reward: [(0, '0.815')] [2024-03-21 00:46:40,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 340197376. Throughput: 0: 46577.8. Samples: 341496500. 
Policy #0 lag: (min: 3.0, avg: 36.4, max: 70.0) [2024-03-21 00:46:40,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 00:46:40,926][04017] Updated weights for policy 0, policy_version 10384 (0.0015) [2024-03-21 00:46:45,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 340459520. Throughput: 0: 46480.0. Samples: 341626600. Policy #0 lag: (min: 3.0, avg: 36.4, max: 70.0) [2024-03-21 00:46:45,522][03784] Avg episode reward: [(0, '0.693')] [2024-03-21 00:46:50,127][04017] Updated weights for policy 0, policy_version 10394 (0.0011) [2024-03-21 00:46:50,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 340623360. Throughput: 0: 46431.1. Samples: 341905800. Policy #0 lag: (min: 0.0, avg: 30.1, max: 72.0) [2024-03-21 00:46:50,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 00:46:53,941][04017] Updated weights for policy 0, policy_version 10404 (0.0021) [2024-03-21 00:46:55,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 341016576. Throughput: 0: 46415.7. Samples: 342186100. Policy #0 lag: (min: 0.0, avg: 30.1, max: 72.0) [2024-03-21 00:46:55,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 00:47:00,521][03784] Fps is (10 sec: 39321.3, 60 sec: 44782.8, 300 sec: 46430.6). Total num frames: 341016576. Throughput: 0: 46679.8. Samples: 342350600. Policy #0 lag: (min: 0.0, avg: 30.1, max: 72.0) [2024-03-21 00:47:00,528][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 00:47:04,754][04017] Updated weights for policy 0, policy_version 10414 (0.0012) [2024-03-21 00:47:05,521][03784] Fps is (10 sec: 26214.0, 60 sec: 45875.1, 300 sec: 46319.5). Total num frames: 341278720. Throughput: 0: 47262.2. Samples: 342654500. Policy #0 lag: (min: 0.0, avg: 27.3, max: 59.0) [2024-03-21 00:47:05,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 00:47:10,521][03784] Fps is (10 sec: 52429.0, 60 sec: 45329.1, 300 sec: 46319.5). 
Total num frames: 341540864. Throughput: 0: 47457.6. Samples: 342923500. Policy #0 lag: (min: 0.0, avg: 33.8, max: 85.0) [2024-03-21 00:47:10,522][03784] Avg episode reward: [(0, '0.810')] [2024-03-21 00:47:10,829][04017] Updated weights for policy 0, policy_version 10424 (0.0011) [2024-03-21 00:47:14,835][04017] Updated weights for policy 0, policy_version 10434 (0.0015) [2024-03-21 00:47:15,521][03784] Fps is (10 sec: 62259.9, 60 sec: 50244.2, 300 sec: 46208.4). Total num frames: 341901312. Throughput: 0: 47397.8. Samples: 343055800. Policy #0 lag: (min: 0.0, avg: 33.8, max: 85.0) [2024-03-21 00:47:15,522][03784] Avg episode reward: [(0, '0.806')] [2024-03-21 00:47:20,521][03784] Fps is (10 sec: 62259.9, 60 sec: 48606.0, 300 sec: 46763.8). Total num frames: 342163456. Throughput: 0: 46873.3. Samples: 343327500. Policy #0 lag: (min: 0.0, avg: 42.6, max: 89.0) [2024-03-21 00:47:20,522][03784] Avg episode reward: [(0, '1.059')] [2024-03-21 00:47:23,766][04017] Updated weights for policy 0, policy_version 10444 (0.0013) [2024-03-21 00:47:25,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 342261760. Throughput: 0: 47351.2. Samples: 343627300. Policy #0 lag: (min: 0.0, avg: 42.6, max: 89.0) [2024-03-21 00:47:25,530][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 00:47:28,788][03995] Signal inference workers to stop experience collection... (6900 times) [2024-03-21 00:47:28,789][03995] Signal inference workers to resume experience collection... (6900 times) [2024-03-21 00:47:28,856][04017] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-03-21 00:47:28,856][04017] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-03-21 00:47:30,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45875.1, 300 sec: 46430.6). Total num frames: 342491136. Throughput: 0: 47926.6. Samples: 343783300. 
Policy #0 lag: (min: 0.0, avg: 35.2, max: 91.0) [2024-03-21 00:47:30,531][03784] Avg episode reward: [(0, '0.599')] [2024-03-21 00:47:31,749][04017] Updated weights for policy 0, policy_version 10454 (0.0011) [2024-03-21 00:47:35,521][03784] Fps is (10 sec: 55705.0, 60 sec: 47513.5, 300 sec: 46208.4). Total num frames: 342818816. Throughput: 0: 47555.6. Samples: 344045800. Policy #0 lag: (min: 0.0, avg: 35.2, max: 91.0) [2024-03-21 00:47:35,522][03784] Avg episode reward: [(0, '0.811')] [2024-03-21 00:47:37,083][04017] Updated weights for policy 0, policy_version 10464 (0.0029) [2024-03-21 00:47:40,521][03784] Fps is (10 sec: 58983.2, 60 sec: 48059.8, 300 sec: 46652.8). Total num frames: 343080960. Throughput: 0: 47780.0. Samples: 344336200. Policy #0 lag: (min: 0.0, avg: 37.1, max: 78.0) [2024-03-21 00:47:40,522][03784] Avg episode reward: [(0, '0.811')] [2024-03-21 00:47:42,321][04017] Updated weights for policy 0, policy_version 10474 (0.0013) [2024-03-21 00:47:45,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48605.9, 300 sec: 46652.7). Total num frames: 343375872. Throughput: 0: 47171.2. Samples: 344473300. Policy #0 lag: (min: 2.0, avg: 44.2, max: 80.0) [2024-03-21 00:47:45,522][03784] Avg episode reward: [(0, '1.505')] [2024-03-21 00:47:50,248][04017] Updated weights for policy 0, policy_version 10484 (0.0015) [2024-03-21 00:47:50,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 343539712. Throughput: 0: 46562.3. Samples: 344749800. Policy #0 lag: (min: 2.0, avg: 44.2, max: 80.0) [2024-03-21 00:47:50,522][03784] Avg episode reward: [(0, '0.410')] [2024-03-21 00:47:55,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 343703552. Throughput: 0: 46817.9. Samples: 345030300. 
Policy #0 lag: (min: 1.0, avg: 25.6, max: 67.0) [2024-03-21 00:47:55,522][03784] Avg episode reward: [(0, '0.958')] [2024-03-21 00:47:57,974][04017] Updated weights for policy 0, policy_version 10494 (0.0015) [2024-03-21 00:48:00,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49698.2, 300 sec: 46319.5). Total num frames: 343998464. Throughput: 0: 47042.2. Samples: 345172700. Policy #0 lag: (min: 1.0, avg: 25.6, max: 67.0) [2024-03-21 00:48:00,522][03784] Avg episode reward: [(0, '1.036')] [2024-03-21 00:48:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010498_343998464.pth... [2024-03-21 00:48:00,652][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010160_332922880.pth [2024-03-21 00:48:05,253][04017] Updated weights for policy 0, policy_version 10504 (0.0011) [2024-03-21 00:48:05,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48606.0, 300 sec: 46097.4). Total num frames: 344195072. Throughput: 0: 47120.0. Samples: 345447900. Policy #0 lag: (min: 1.0, avg: 46.0, max: 90.0) [2024-03-21 00:48:05,522][03784] Avg episode reward: [(0, '0.776')] [2024-03-21 00:48:10,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 344358912. Throughput: 0: 46597.7. Samples: 345724200. Policy #0 lag: (min: 1.0, avg: 46.0, max: 90.0) [2024-03-21 00:48:10,522][03784] Avg episode reward: [(0, '0.626')] [2024-03-21 00:48:13,692][04017] Updated weights for policy 0, policy_version 10514 (0.0010) [2024-03-21 00:48:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 43690.7, 300 sec: 45653.1). Total num frames: 344522752. Throughput: 0: 46151.2. Samples: 345860100. Policy #0 lag: (min: 0.0, avg: 35.4, max: 91.0) [2024-03-21 00:48:15,522][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 00:48:20,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 45542.0). Total num frames: 344719360. Throughput: 0: 47055.6. Samples: 346163300. 
Policy #0 lag: (min: 0.0, avg: 35.4, max: 91.0) [2024-03-21 00:48:20,522][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 00:48:23,808][04017] Updated weights for policy 0, policy_version 10524 (0.0010) [2024-03-21 00:48:23,817][03995] Signal inference workers to stop experience collection... (6950 times) [2024-03-21 00:48:23,819][03995] Signal inference workers to resume experience collection... (6950 times) [2024-03-21 00:48:23,884][04017] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-03-21 00:48:23,885][04017] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-03-21 00:48:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 344981504. Throughput: 0: 47239.9. Samples: 346462000. Policy #0 lag: (min: 0.0, avg: 30.7, max: 73.0) [2024-03-21 00:48:25,522][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 00:48:27,382][04017] Updated weights for policy 0, policy_version 10534 (0.0013) [2024-03-21 00:48:30,521][03784] Fps is (10 sec: 62258.9, 60 sec: 47513.7, 300 sec: 46763.8). Total num frames: 345341952. Throughput: 0: 47288.9. Samples: 346601300. Policy #0 lag: (min: 0.0, avg: 30.7, max: 73.0) [2024-03-21 00:48:30,522][03784] Avg episode reward: [(0, '0.729')] [2024-03-21 00:48:32,323][04017] Updated weights for policy 0, policy_version 10544 (0.0010) [2024-03-21 00:48:35,521][03784] Fps is (10 sec: 81920.3, 60 sec: 49698.2, 300 sec: 47097.1). Total num frames: 345800704. Throughput: 0: 46433.3. Samples: 346839300. Policy #0 lag: (min: 1.0, avg: 31.2, max: 58.0) [2024-03-21 00:48:35,522][03784] Avg episode reward: [(0, '1.153')] [2024-03-21 00:48:37,183][04017] Updated weights for policy 0, policy_version 10554 (0.0017) [2024-03-21 00:48:40,521][03784] Fps is (10 sec: 68813.7, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 346030080. Throughput: 0: 46433.4. Samples: 347119800. 
Policy #0 lag: (min: 0.0, avg: 44.0, max: 90.0) [2024-03-21 00:48:40,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 00:48:41,760][04017] Updated weights for policy 0, policy_version 10564 (0.0013) [2024-03-21 00:48:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 346259456. Throughput: 0: 46066.7. Samples: 347245700. Policy #0 lag: (min: 0.0, avg: 44.0, max: 90.0) [2024-03-21 00:48:45,522][03784] Avg episode reward: [(0, '0.462')] [2024-03-21 00:48:50,521][03784] Fps is (10 sec: 36044.3, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 346390528. Throughput: 0: 46333.2. Samples: 347532900. Policy #0 lag: (min: 0.0, avg: 41.3, max: 86.0) [2024-03-21 00:48:50,530][03784] Avg episode reward: [(0, '0.462')] [2024-03-21 00:48:55,521][03784] Fps is (10 sec: 19660.8, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 346456064. Throughput: 0: 46740.1. Samples: 347827500. Policy #0 lag: (min: 0.0, avg: 41.3, max: 86.0) [2024-03-21 00:48:55,522][03784] Avg episode reward: [(0, '1.120')] [2024-03-21 00:48:56,715][04017] Updated weights for policy 0, policy_version 10574 (0.0019) [2024-03-21 00:49:00,521][03784] Fps is (10 sec: 22937.6, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 346619904. Throughput: 0: 47006.6. Samples: 347975400. Policy #0 lag: (min: 1.0, avg: 29.9, max: 65.0) [2024-03-21 00:49:00,522][03784] Avg episode reward: [(0, '1.354')] [2024-03-21 00:49:03,110][04017] Updated weights for policy 0, policy_version 10584 (0.0010) [2024-03-21 00:49:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 346882048. Throughput: 0: 46448.9. Samples: 348253500. Policy #0 lag: (min: 1.0, avg: 29.9, max: 65.0) [2024-03-21 00:49:05,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 00:49:10,521][03784] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 346980352. Throughput: 0: 45802.3. Samples: 348523100. 
Policy #0 lag: (min: 1.0, avg: 44.1, max: 90.0) [2024-03-21 00:49:10,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 00:49:13,482][04017] Updated weights for policy 0, policy_version 10594 (0.0011) [2024-03-21 00:49:13,836][03995] Signal inference workers to stop experience collection... (7000 times) [2024-03-21 00:49:13,910][04017] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-03-21 00:49:14,132][03995] Signal inference workers to resume experience collection... (7000 times) [2024-03-21 00:49:14,132][04017] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-03-21 00:49:15,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46967.4, 300 sec: 46097.4). Total num frames: 347340800. Throughput: 0: 45826.6. Samples: 348663500. Policy #0 lag: (min: 2.0, avg: 27.4, max: 69.0) [2024-03-21 00:49:15,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 00:49:16,716][04017] Updated weights for policy 0, policy_version 10604 (0.0026) [2024-03-21 00:49:20,521][03784] Fps is (10 sec: 75365.3, 60 sec: 50244.2, 300 sec: 46874.9). Total num frames: 347734016. Throughput: 0: 45742.1. Samples: 348897700. Policy #0 lag: (min: 2.0, avg: 27.4, max: 69.0) [2024-03-21 00:49:20,522][03784] Avg episode reward: [(0, '1.042')] [2024-03-21 00:49:21,713][04017] Updated weights for policy 0, policy_version 10614 (0.0012) [2024-03-21 00:49:25,521][03784] Fps is (10 sec: 68813.5, 60 sec: 50790.4, 300 sec: 46541.7). Total num frames: 348028928. Throughput: 0: 45908.8. Samples: 349185700. Policy #0 lag: (min: 4.0, avg: 50.8, max: 102.0) [2024-03-21 00:49:25,522][03784] Avg episode reward: [(0, '0.537')] [2024-03-21 00:49:26,608][04017] Updated weights for policy 0, policy_version 10624 (0.0016) [2024-03-21 00:49:30,521][03784] Fps is (10 sec: 58983.2, 60 sec: 49698.2, 300 sec: 47320.1). Total num frames: 348323840. Throughput: 0: 46268.9. Samples: 349327800. 
Policy #0 lag: (min: 4.0, avg: 50.8, max: 102.0) [2024-03-21 00:49:30,522][03784] Avg episode reward: [(0, '0.537')] [2024-03-21 00:49:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 46652.8). Total num frames: 348389376. Throughput: 0: 46682.3. Samples: 349633600. Policy #0 lag: (min: 4.0, avg: 50.8, max: 102.0) [2024-03-21 00:49:35,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 00:49:37,773][04017] Updated weights for policy 0, policy_version 10634 (0.0011) [2024-03-21 00:49:40,521][03784] Fps is (10 sec: 16383.8, 60 sec: 40959.8, 300 sec: 45986.3). Total num frames: 348487680. Throughput: 0: 46946.5. Samples: 349940100. Policy #0 lag: (min: 0.0, avg: 43.8, max: 76.0) [2024-03-21 00:49:40,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 00:49:45,521][03784] Fps is (10 sec: 26214.1, 60 sec: 39867.7, 300 sec: 45653.0). Total num frames: 348651520. Throughput: 0: 46417.7. Samples: 350064200. Policy #0 lag: (min: 0.0, avg: 43.8, max: 76.0) [2024-03-21 00:49:45,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 00:49:46,678][04017] Updated weights for policy 0, policy_version 10644 (0.0012) [2024-03-21 00:49:50,521][03784] Fps is (10 sec: 58983.5, 60 sec: 44783.0, 300 sec: 46763.8). Total num frames: 349077504. Throughput: 0: 45775.6. Samples: 350313400. Policy #0 lag: (min: 1.0, avg: 46.1, max: 101.0) [2024-03-21 00:49:50,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 00:49:51,963][04017] Updated weights for policy 0, policy_version 10654 (0.0015) [2024-03-21 00:49:55,521][03784] Fps is (10 sec: 65536.6, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 349306880. Throughput: 0: 45719.9. Samples: 350580500. Policy #0 lag: (min: 1.0, avg: 31.6, max: 100.0) [2024-03-21 00:49:55,522][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 00:49:57,026][04017] Updated weights for policy 0, policy_version 10664 (0.0017) [2024-03-21 00:49:57,393][03995] Signal inference workers to stop experience collection... 
(7050 times) [2024-03-21 00:49:57,467][04017] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-03-21 00:49:57,666][03995] Signal inference workers to resume experience collection... (7050 times) [2024-03-21 00:49:57,667][04017] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-03-21 00:50:00,521][03784] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 46652.7). Total num frames: 349569024. Throughput: 0: 44951.1. Samples: 350686300. Policy #0 lag: (min: 1.0, avg: 31.6, max: 100.0) [2024-03-21 00:50:00,522][03784] Avg episode reward: [(0, '1.307')] [2024-03-21 00:50:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010668_349569024.pth... [2024-03-21 00:50:00,669][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010325_338329600.pth [2024-03-21 00:50:05,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 349667328. Throughput: 0: 46497.9. Samples: 350990100. Policy #0 lag: (min: 1.0, avg: 32.2, max: 76.0) [2024-03-21 00:50:05,522][03784] Avg episode reward: [(0, '0.582')] [2024-03-21 00:50:06,583][04017] Updated weights for policy 0, policy_version 10674 (0.0013) [2024-03-21 00:50:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 50790.3, 300 sec: 46874.9). Total num frames: 350027776. Throughput: 0: 46268.8. Samples: 351267800. Policy #0 lag: (min: 1.0, avg: 32.2, max: 76.0) [2024-03-21 00:50:10,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 00:50:11,065][04017] Updated weights for policy 0, policy_version 10684 (0.0016) [2024-03-21 00:50:15,521][03784] Fps is (10 sec: 62258.8, 60 sec: 49152.0, 300 sec: 46652.7). Total num frames: 350289920. Throughput: 0: 46277.7. Samples: 351410300. 
Policy #0 lag: (min: 0.0, avg: 39.5, max: 73.0) [2024-03-21 00:50:15,522][03784] Avg episode reward: [(0, '0.409')] [2024-03-21 00:50:18,367][04017] Updated weights for policy 0, policy_version 10694 (0.0011) [2024-03-21 00:50:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 46652.7). Total num frames: 350486528. Throughput: 0: 45560.0. Samples: 351683800. Policy #0 lag: (min: 0.0, avg: 39.5, max: 73.0) [2024-03-21 00:50:20,522][03784] Avg episode reward: [(0, '0.409')] [2024-03-21 00:50:25,521][03784] Fps is (10 sec: 32768.0, 60 sec: 43144.5, 300 sec: 46430.6). Total num frames: 350617600. Throughput: 0: 45086.8. Samples: 351969000. Policy #0 lag: (min: 0.0, avg: 46.0, max: 120.0) [2024-03-21 00:50:25,522][03784] Avg episode reward: [(0, '0.670')] [2024-03-21 00:50:29,024][04017] Updated weights for policy 0, policy_version 10704 (0.0015) [2024-03-21 00:50:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 41506.1, 300 sec: 46097.4). Total num frames: 350814208. Throughput: 0: 45360.1. Samples: 352105400. Policy #0 lag: (min: 0.0, avg: 46.0, max: 120.0) [2024-03-21 00:50:30,522][03784] Avg episode reward: [(0, '0.492')] [2024-03-21 00:50:35,302][04017] Updated weights for policy 0, policy_version 10714 (0.0031) [2024-03-21 00:50:35,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 46208.4). Total num frames: 351076352. Throughput: 0: 45566.6. Samples: 352363900. Policy #0 lag: (min: 2.0, avg: 37.1, max: 98.0) [2024-03-21 00:50:35,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 00:50:40,314][04017] Updated weights for policy 0, policy_version 10724 (0.0012) [2024-03-21 00:50:40,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 351404032. Throughput: 0: 45413.3. Samples: 352624100. Policy #0 lag: (min: 2.0, avg: 37.1, max: 98.0) [2024-03-21 00:50:40,522][03784] Avg episode reward: [(0, '0.854')] [2024-03-21 00:50:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 46319.5). 
Total num frames: 351535104. Throughput: 0: 46451.2. Samples: 352776600. Policy #0 lag: (min: 0.0, avg: 31.1, max: 63.0) [2024-03-21 00:50:45,522][03784] Avg episode reward: [(0, '0.680')] [2024-03-21 00:50:50,521][03784] Fps is (10 sec: 29491.0, 60 sec: 43690.5, 300 sec: 46097.3). Total num frames: 351698944. Throughput: 0: 46302.1. Samples: 353073700. Policy #0 lag: (min: 0.0, avg: 31.1, max: 63.0) [2024-03-21 00:50:50,522][03784] Avg episode reward: [(0, '0.570')] [2024-03-21 00:50:50,698][04017] Updated weights for policy 0, policy_version 10734 (0.0020) [2024-03-21 00:50:55,521][03784] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 46097.3). Total num frames: 351928320. Throughput: 0: 46457.7. Samples: 353358400. Policy #0 lag: (min: 0.0, avg: 29.8, max: 88.0) [2024-03-21 00:50:55,522][03784] Avg episode reward: [(0, '0.573')] [2024-03-21 00:50:56,793][03995] Signal inference workers to stop experience collection... (7100 times) [2024-03-21 00:50:56,794][03995] Signal inference workers to resume experience collection... (7100 times) [2024-03-21 00:50:56,889][04017] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-03-21 00:50:56,890][04017] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-03-21 00:50:57,160][04017] Updated weights for policy 0, policy_version 10744 (0.0016) [2024-03-21 00:51:00,521][03784] Fps is (10 sec: 55706.3, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 352256000. Throughput: 0: 45926.7. Samples: 353477000. Policy #0 lag: (min: 0.0, avg: 29.8, max: 88.0) [2024-03-21 00:51:00,522][03784] Avg episode reward: [(0, '0.917')] [2024-03-21 00:51:02,061][04017] Updated weights for policy 0, policy_version 10754 (0.0011) [2024-03-21 00:51:05,521][03784] Fps is (10 sec: 72090.3, 60 sec: 49698.1, 300 sec: 46874.9). Total num frames: 352649216. Throughput: 0: 45353.3. Samples: 353724700. 
Policy #0 lag: (min: 0.0, avg: 43.8, max: 92.0) [2024-03-21 00:51:05,522][03784] Avg episode reward: [(0, '0.493')] [2024-03-21 00:51:07,314][04017] Updated weights for policy 0, policy_version 10764 (0.0011) [2024-03-21 00:51:10,521][03784] Fps is (10 sec: 68812.9, 60 sec: 48605.9, 300 sec: 47652.4). Total num frames: 352944128. Throughput: 0: 44875.6. Samples: 353988400. Policy #0 lag: (min: 1.0, avg: 48.3, max: 128.0) [2024-03-21 00:51:10,522][03784] Avg episode reward: [(0, '0.961')] [2024-03-21 00:51:13,410][04017] Updated weights for policy 0, policy_version 10774 (0.0013) [2024-03-21 00:51:15,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 353075200. Throughput: 0: 44955.6. Samples: 354128400. Policy #0 lag: (min: 1.0, avg: 48.3, max: 128.0) [2024-03-21 00:51:15,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 00:51:20,521][03784] Fps is (10 sec: 13107.1, 60 sec: 43144.5, 300 sec: 45875.2). Total num frames: 353075200. Throughput: 0: 46040.0. Samples: 354435700. Policy #0 lag: (min: 1.0, avg: 48.3, max: 128.0) [2024-03-21 00:51:20,522][03784] Avg episode reward: [(0, '1.236')] [2024-03-21 00:51:24,824][04017] Updated weights for policy 0, policy_version 10784 (0.0017) [2024-03-21 00:51:25,521][03784] Fps is (10 sec: 32767.7, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 353402880. Throughput: 0: 46373.4. Samples: 354710900. Policy #0 lag: (min: 0.0, avg: 26.4, max: 55.0) [2024-03-21 00:51:25,522][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 00:51:30,521][03784] Fps is (10 sec: 55705.2, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 353632256. Throughput: 0: 46037.7. Samples: 354848300. Policy #0 lag: (min: 0.0, avg: 39.0, max: 83.0) [2024-03-21 00:51:30,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 00:51:31,733][04017] Updated weights for policy 0, policy_version 10794 (0.0014) [2024-03-21 00:51:35,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 46208.4). 
Total num frames: 353828864. Throughput: 0: 45520.1. Samples: 355122100. Policy #0 lag: (min: 0.0, avg: 39.0, max: 83.0) [2024-03-21 00:51:35,522][03784] Avg episode reward: [(0, '0.958')] [2024-03-21 00:51:40,521][03784] Fps is (10 sec: 36045.2, 60 sec: 43144.6, 300 sec: 45875.2). Total num frames: 353992704. Throughput: 0: 45349.0. Samples: 355399100. Policy #0 lag: (min: 0.0, avg: 28.0, max: 66.0) [2024-03-21 00:51:40,522][03784] Avg episode reward: [(0, '0.364')] [2024-03-21 00:51:42,601][04017] Updated weights for policy 0, policy_version 10804 (0.0021) [2024-03-21 00:51:45,521][03784] Fps is (10 sec: 39321.2, 60 sec: 44782.9, 300 sec: 46097.4). Total num frames: 354222080. Throughput: 0: 45995.5. Samples: 355546800. Policy #0 lag: (min: 0.0, avg: 28.0, max: 66.0) [2024-03-21 00:51:45,522][03784] Avg episode reward: [(0, '0.613')] [2024-03-21 00:51:45,933][03995] Signal inference workers to stop experience collection... (7150 times) [2024-03-21 00:51:45,972][04017] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-03-21 00:51:46,150][03995] Signal inference workers to resume experience collection... (7150 times) [2024-03-21 00:51:46,150][04017] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-03-21 00:51:46,767][04017] Updated weights for policy 0, policy_version 10814 (0.0019) [2024-03-21 00:51:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.6, 300 sec: 45764.1). Total num frames: 354516992. Throughput: 0: 46562.2. Samples: 355820000. Policy #0 lag: (min: 0.0, avg: 40.0, max: 86.0) [2024-03-21 00:51:50,522][03784] Avg episode reward: [(0, '0.432')] [2024-03-21 00:51:53,175][04017] Updated weights for policy 0, policy_version 10824 (0.0017) [2024-03-21 00:51:55,521][03784] Fps is (10 sec: 62259.8, 60 sec: 48606.0, 300 sec: 46874.9). Total num frames: 354844672. Throughput: 0: 46335.6. Samples: 356073500. 
Policy #0 lag: (min: 0.0, avg: 40.0, max: 86.0) [2024-03-21 00:51:55,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 00:51:58,671][04017] Updated weights for policy 0, policy_version 10834 (0.0016) [2024-03-21 00:52:00,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48605.8, 300 sec: 47097.1). Total num frames: 355172352. Throughput: 0: 46346.5. Samples: 356214000. Policy #0 lag: (min: 0.0, avg: 37.9, max: 72.0) [2024-03-21 00:52:00,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 00:52:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010839_355172352.pth... [2024-03-21 00:52:00,646][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010498_343998464.pth [2024-03-21 00:52:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 355270656. Throughput: 0: 45760.0. Samples: 356494900. Policy #0 lag: (min: 0.0, avg: 37.9, max: 72.0) [2024-03-21 00:52:05,522][03784] Avg episode reward: [(0, '0.467')] [2024-03-21 00:52:07,443][04017] Updated weights for policy 0, policy_version 10844 (0.0018) [2024-03-21 00:52:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 42052.2, 300 sec: 45986.3). Total num frames: 355467264. Throughput: 0: 46015.5. Samples: 356781600. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 00:52:10,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 00:52:14,471][04017] Updated weights for policy 0, policy_version 10854 (0.0020) [2024-03-21 00:52:15,521][03784] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 45764.1). Total num frames: 355663872. Throughput: 0: 46184.6. Samples: 356926600. Policy #0 lag: (min: 2.0, avg: 38.2, max: 78.0) [2024-03-21 00:52:15,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 00:52:20,521][03784] Fps is (10 sec: 49151.5, 60 sec: 48059.6, 300 sec: 46430.5). Total num frames: 355958784. Throughput: 0: 46479.8. Samples: 357213700. 
Policy #0 lag: (min: 2.0, avg: 38.2, max: 78.0) [2024-03-21 00:52:20,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 00:52:20,841][04017] Updated weights for policy 0, policy_version 10864 (0.0016) [2024-03-21 00:52:25,521][03784] Fps is (10 sec: 52427.6, 60 sec: 46421.2, 300 sec: 46430.6). Total num frames: 356188160. Throughput: 0: 46146.5. Samples: 357475700. Policy #0 lag: (min: 1.0, avg: 34.5, max: 79.0) [2024-03-21 00:52:25,522][03784] Avg episode reward: [(0, '0.689')] [2024-03-21 00:52:27,441][04017] Updated weights for policy 0, policy_version 10874 (0.0013) [2024-03-21 00:52:30,521][03784] Fps is (10 sec: 49152.6, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 356450304. Throughput: 0: 46077.8. Samples: 357620300. Policy #0 lag: (min: 1.0, avg: 34.5, max: 79.0) [2024-03-21 00:52:30,522][03784] Avg episode reward: [(0, '0.551')] [2024-03-21 00:52:35,077][04017] Updated weights for policy 0, policy_version 10884 (0.0019) [2024-03-21 00:52:35,521][03784] Fps is (10 sec: 45876.1, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 356646912. Throughput: 0: 46480.0. Samples: 357911600. Policy #0 lag: (min: 0.0, avg: 43.7, max: 87.0) [2024-03-21 00:52:35,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 00:52:40,097][03995] Signal inference workers to stop experience collection... (7200 times) [2024-03-21 00:52:40,098][03995] Signal inference workers to resume experience collection... (7200 times) [2024-03-21 00:52:40,192][04017] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-03-21 00:52:40,193][04017] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-03-21 00:52:40,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 356909056. Throughput: 0: 47055.5. Samples: 358191000. 
Policy #0 lag: (min: 0.0, avg: 43.7, max: 87.0) [2024-03-21 00:52:40,522][03784] Avg episode reward: [(0, '0.576')] [2024-03-21 00:52:42,552][04017] Updated weights for policy 0, policy_version 10894 (0.0011) [2024-03-21 00:52:45,521][03784] Fps is (10 sec: 52427.3, 60 sec: 49151.8, 300 sec: 46208.4). Total num frames: 357171200. Throughput: 0: 47157.6. Samples: 358336100. Policy #0 lag: (min: 0.0, avg: 34.8, max: 89.0) [2024-03-21 00:52:45,523][03784] Avg episode reward: [(0, '0.712')] [2024-03-21 00:52:46,703][04017] Updated weights for policy 0, policy_version 10904 (0.0011) [2024-03-21 00:52:50,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 357400576. Throughput: 0: 47291.1. Samples: 358623000. Policy #0 lag: (min: 0.0, avg: 34.8, max: 89.0) [2024-03-21 00:52:50,522][03784] Avg episode reward: [(0, '0.781')] [2024-03-21 00:52:53,664][04017] Updated weights for policy 0, policy_version 10914 (0.0011) [2024-03-21 00:52:55,521][03784] Fps is (10 sec: 49153.5, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 357662720. Throughput: 0: 47369.0. Samples: 358913200. Policy #0 lag: (min: 0.0, avg: 46.9, max: 104.0) [2024-03-21 00:52:55,522][03784] Avg episode reward: [(0, '1.247')] [2024-03-21 00:53:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43144.5, 300 sec: 45986.3). Total num frames: 357761024. Throughput: 0: 47453.2. Samples: 359062000. Policy #0 lag: (min: 0.0, avg: 46.9, max: 104.0) [2024-03-21 00:53:00,522][03784] Avg episode reward: [(0, '1.283')] [2024-03-21 00:53:03,518][04017] Updated weights for policy 0, policy_version 10924 (0.0015) [2024-03-21 00:53:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 358121472. Throughput: 0: 47686.9. Samples: 359359600. 
Policy #0 lag: (min: 0.0, avg: 33.7, max: 79.0) [2024-03-21 00:53:05,522][03784] Avg episode reward: [(0, '0.703')] [2024-03-21 00:53:10,098][04017] Updated weights for policy 0, policy_version 10934 (0.0014) [2024-03-21 00:53:10,521][03784] Fps is (10 sec: 55705.1, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 358318080. Throughput: 0: 47886.7. Samples: 359630600. Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 00:53:10,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 00:53:15,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 358547456. Throughput: 0: 47895.6. Samples: 359775600. Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 00:53:15,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 00:53:16,427][04017] Updated weights for policy 0, policy_version 10944 (0.0016) [2024-03-21 00:53:20,521][03784] Fps is (10 sec: 49152.8, 60 sec: 47513.8, 300 sec: 46874.9). Total num frames: 358809600. Throughput: 0: 47968.9. Samples: 360070200. Policy #0 lag: (min: 1.0, avg: 39.9, max: 92.0) [2024-03-21 00:53:20,530][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 00:53:23,835][04017] Updated weights for policy 0, policy_version 10954 (0.0011) [2024-03-21 00:53:25,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48606.0, 300 sec: 46652.8). Total num frames: 359104512. Throughput: 0: 48048.9. Samples: 360353200. Policy #0 lag: (min: 1.0, avg: 39.9, max: 92.0) [2024-03-21 00:53:25,522][03784] Avg episode reward: [(0, '1.073')] [2024-03-21 00:53:27,555][04017] Updated weights for policy 0, policy_version 10964 (0.0023) [2024-03-21 00:53:30,373][03995] Signal inference workers to stop experience collection... (7250 times) [2024-03-21 00:53:30,373][03995] Signal inference workers to resume experience collection... 
(7250 times) [2024-03-21 00:53:30,422][04017] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-03-21 00:53:30,423][04017] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-03-21 00:53:30,521][03784] Fps is (10 sec: 62258.7, 60 sec: 49698.2, 300 sec: 46208.4). Total num frames: 359432192. Throughput: 0: 48031.3. Samples: 360497500. Policy #0 lag: (min: 2.0, avg: 39.4, max: 77.0) [2024-03-21 00:53:30,523][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 00:53:32,868][04017] Updated weights for policy 0, policy_version 10974 (0.0010) [2024-03-21 00:53:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 46097.3). Total num frames: 359628800. Throughput: 0: 48066.7. Samples: 360786000. Policy #0 lag: (min: 2.0, avg: 39.4, max: 77.0) [2024-03-21 00:53:35,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 00:53:40,521][03784] Fps is (10 sec: 32768.0, 60 sec: 47513.5, 300 sec: 45764.1). Total num frames: 359759872. Throughput: 0: 48059.9. Samples: 361075900. Policy #0 lag: (min: 0.0, avg: 41.4, max: 83.0) [2024-03-21 00:53:40,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 00:53:45,521][03784] Fps is (10 sec: 26214.0, 60 sec: 45329.2, 300 sec: 45764.1). Total num frames: 359890944. Throughput: 0: 47786.6. Samples: 361212400. Policy #0 lag: (min: 0.0, avg: 41.4, max: 83.0) [2024-03-21 00:53:45,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 00:53:45,674][04017] Updated weights for policy 0, policy_version 10984 (0.0019) [2024-03-21 00:53:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 360218624. Throughput: 0: 46831.2. Samples: 361467000. 
Policy #0 lag: (min: 3.0, avg: 28.2, max: 73.0) [2024-03-21 00:53:50,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 00:53:51,551][04017] Updated weights for policy 0, policy_version 10994 (0.0016) [2024-03-21 00:53:55,521][03784] Fps is (10 sec: 65537.3, 60 sec: 48059.8, 300 sec: 47208.2). Total num frames: 360546304. Throughput: 0: 46724.7. Samples: 361733200. Policy #0 lag: (min: 3.0, avg: 28.2, max: 73.0) [2024-03-21 00:53:55,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 00:53:57,442][04017] Updated weights for policy 0, policy_version 11004 (0.0016) [2024-03-21 00:54:00,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 360677376. Throughput: 0: 47015.6. Samples: 361891300. Policy #0 lag: (min: 0.0, avg: 32.6, max: 89.0) [2024-03-21 00:54:00,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 00:54:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011007_360677376.pth... [2024-03-21 00:54:00,682][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010668_349569024.pth [2024-03-21 00:54:03,323][04017] Updated weights for policy 0, policy_version 11014 (0.0012) [2024-03-21 00:54:05,521][03784] Fps is (10 sec: 45874.5, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 361005056. Throughput: 0: 46771.0. Samples: 362174900. Policy #0 lag: (min: 0.0, avg: 32.6, max: 89.0) [2024-03-21 00:54:05,522][03784] Avg episode reward: [(0, '0.414')] [2024-03-21 00:54:10,521][03784] Fps is (10 sec: 42598.0, 60 sec: 46421.4, 300 sec: 46652.7). Total num frames: 361103360. Throughput: 0: 47095.4. Samples: 362472500. Policy #0 lag: (min: 0.0, avg: 31.7, max: 96.0) [2024-03-21 00:54:10,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 00:54:13,518][04017] Updated weights for policy 0, policy_version 11024 (0.0012) [2024-03-21 00:54:15,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46967.5, 300 sec: 46208.5). 
Total num frames: 361365504. Throughput: 0: 47009.0. Samples: 362612900. Policy #0 lag: (min: 0.0, avg: 31.7, max: 96.0) [2024-03-21 00:54:15,522][03784] Avg episode reward: [(0, '0.403')] [2024-03-21 00:54:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 361496576. Throughput: 0: 46384.3. Samples: 362873300. Policy #0 lag: (min: 2.0, avg: 33.4, max: 82.0) [2024-03-21 00:54:20,522][03784] Avg episode reward: [(0, '0.912')] [2024-03-21 00:54:21,103][04017] Updated weights for policy 0, policy_version 11034 (0.0013) [2024-03-21 00:54:25,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 361758720. Throughput: 0: 45124.5. Samples: 363106500. Policy #0 lag: (min: 2.0, avg: 33.4, max: 82.0) [2024-03-21 00:54:25,522][03784] Avg episode reward: [(0, '0.963')] [2024-03-21 00:54:27,473][04017] Updated weights for policy 0, policy_version 11044 (0.0011) [2024-03-21 00:54:29,072][03995] Signal inference workers to stop experience collection... (7300 times) [2024-03-21 00:54:29,143][03995] Signal inference workers to resume experience collection... (7300 times) [2024-03-21 00:54:29,177][04017] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-03-21 00:54:29,227][04017] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-03-21 00:54:30,521][03784] Fps is (10 sec: 62260.1, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 362119168. Throughput: 0: 45126.8. Samples: 363243100. Policy #0 lag: (min: 0.0, avg: 53.0, max: 117.0) [2024-03-21 00:54:30,522][03784] Avg episode reward: [(0, '1.010')] [2024-03-21 00:54:32,577][04017] Updated weights for policy 0, policy_version 11054 (0.0024) [2024-03-21 00:54:35,521][03784] Fps is (10 sec: 65535.1, 60 sec: 46421.2, 300 sec: 47208.1). Total num frames: 362414080. Throughput: 0: 45171.0. Samples: 363499700. 
Policy #0 lag: (min: 1.0, avg: 39.8, max: 91.0) [2024-03-21 00:54:35,522][03784] Avg episode reward: [(0, '1.128')] [2024-03-21 00:54:39,849][04017] Updated weights for policy 0, policy_version 11064 (0.0012) [2024-03-21 00:54:40,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46421.3, 300 sec: 47097.1). Total num frames: 362545152. Throughput: 0: 45715.4. Samples: 363790400. Policy #0 lag: (min: 1.0, avg: 39.8, max: 91.0) [2024-03-21 00:54:40,522][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 00:54:45,521][03784] Fps is (10 sec: 32768.3, 60 sec: 47513.7, 300 sec: 46319.5). Total num frames: 362741760. Throughput: 0: 45593.3. Samples: 363943000. Policy #0 lag: (min: 1.0, avg: 26.0, max: 62.0) [2024-03-21 00:54:45,522][03784] Avg episode reward: [(0, '0.662')] [2024-03-21 00:54:48,176][04017] Updated weights for policy 0, policy_version 11074 (0.0010) [2024-03-21 00:54:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.1, 300 sec: 46319.5). Total num frames: 362971136. Throughput: 0: 45111.1. Samples: 364204900. Policy #0 lag: (min: 1.0, avg: 26.0, max: 62.0) [2024-03-21 00:54:50,522][03784] Avg episode reward: [(0, '0.949')] [2024-03-21 00:54:54,472][04017] Updated weights for policy 0, policy_version 11084 (0.0015) [2024-03-21 00:54:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 44236.7, 300 sec: 46208.4). Total num frames: 363200512. Throughput: 0: 44220.0. Samples: 364462400. Policy #0 lag: (min: 0.0, avg: 41.5, max: 87.0) [2024-03-21 00:54:55,522][03784] Avg episode reward: [(0, '0.795')] [2024-03-21 00:55:00,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 363429888. Throughput: 0: 43715.5. Samples: 364580100. Policy #0 lag: (min: 0.0, avg: 41.5, max: 87.0) [2024-03-21 00:55:00,522][03784] Avg episode reward: [(0, '0.791')] [2024-03-21 00:55:05,521][03784] Fps is (10 sec: 29491.4, 60 sec: 41506.2, 300 sec: 45653.1). Total num frames: 363495424. Throughput: 0: 43644.5. Samples: 364837300. 
Policy #0 lag: (min: 0.0, avg: 41.5, max: 87.0) [2024-03-21 00:55:05,522][03784] Avg episode reward: [(0, '1.239')] [2024-03-21 00:55:05,725][04017] Updated weights for policy 0, policy_version 11094 (0.0012) [2024-03-21 00:55:10,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 363823104. Throughput: 0: 43593.4. Samples: 365068200. Policy #0 lag: (min: 0.0, avg: 48.5, max: 106.0) [2024-03-21 00:55:10,522][03784] Avg episode reward: [(0, '0.970')] [2024-03-21 00:55:11,779][04017] Updated weights for policy 0, policy_version 11104 (0.0012) [2024-03-21 00:55:15,521][03784] Fps is (10 sec: 49152.6, 60 sec: 43690.7, 300 sec: 45764.1). Total num frames: 363986944. Throughput: 0: 43668.9. Samples: 365208200. Policy #0 lag: (min: 0.0, avg: 38.8, max: 81.0) [2024-03-21 00:55:15,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 00:55:17,958][04017] Updated weights for policy 0, policy_version 11114 (0.0014) [2024-03-21 00:55:20,521][03784] Fps is (10 sec: 55705.4, 60 sec: 48059.8, 300 sec: 46652.8). Total num frames: 364380160. Throughput: 0: 44155.7. Samples: 365486700. Policy #0 lag: (min: 0.0, avg: 38.8, max: 81.0) [2024-03-21 00:55:20,522][03784] Avg episode reward: [(0, '1.282')] [2024-03-21 00:55:25,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 364478464. Throughput: 0: 44155.7. Samples: 365777400. Policy #0 lag: (min: 0.0, avg: 31.7, max: 72.0) [2024-03-21 00:55:25,522][03784] Avg episode reward: [(0, '1.418')] [2024-03-21 00:55:27,025][03995] Signal inference workers to stop experience collection... (7350 times) [2024-03-21 00:55:27,020][04017] Updated weights for policy 0, policy_version 11124 (0.0011) [2024-03-21 00:55:27,027][03995] Signal inference workers to resume experience collection... 
(7350 times) [2024-03-21 00:55:27,087][04017] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-03-21 00:55:27,088][04017] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-03-21 00:55:30,521][03784] Fps is (10 sec: 29491.1, 60 sec: 42598.3, 300 sec: 46097.4). Total num frames: 364675072. Throughput: 0: 44075.5. Samples: 365926400. Policy #0 lag: (min: 0.0, avg: 31.7, max: 72.0) [2024-03-21 00:55:30,522][03784] Avg episode reward: [(0, '1.418')] [2024-03-21 00:55:33,806][04017] Updated weights for policy 0, policy_version 11134 (0.0011) [2024-03-21 00:55:35,521][03784] Fps is (10 sec: 42598.8, 60 sec: 41506.3, 300 sec: 45764.2). Total num frames: 364904448. Throughput: 0: 44420.2. Samples: 366203800. Policy #0 lag: (min: 1.0, avg: 25.6, max: 60.0) [2024-03-21 00:55:35,521][03784] Avg episode reward: [(0, '0.559')] [2024-03-21 00:55:40,521][03784] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 46097.3). Total num frames: 365133824. Throughput: 0: 44644.3. Samples: 366471400. Policy #0 lag: (min: 1.0, avg: 25.6, max: 60.0) [2024-03-21 00:55:40,523][03784] Avg episode reward: [(0, '0.428')] [2024-03-21 00:55:40,816][04017] Updated weights for policy 0, policy_version 11144 (0.0018) [2024-03-21 00:55:45,521][03784] Fps is (10 sec: 52428.6, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 365428736. Throughput: 0: 44684.5. Samples: 366590900. Policy #0 lag: (min: 0.0, avg: 39.0, max: 80.0) [2024-03-21 00:55:45,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 00:55:45,950][04017] Updated weights for policy 0, policy_version 11154 (0.0010) [2024-03-21 00:55:50,072][04017] Updated weights for policy 0, policy_version 11164 (0.0012) [2024-03-21 00:55:50,521][03784] Fps is (10 sec: 72091.0, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 365854720. Throughput: 0: 44737.8. Samples: 366850500. 
Policy #0 lag: (min: 0.0, avg: 39.0, max: 80.0) [2024-03-21 00:55:50,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 00:55:55,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 365985792. Throughput: 0: 45626.7. Samples: 367121400. Policy #0 lag: (min: 0.0, avg: 57.0, max: 125.0) [2024-03-21 00:55:55,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 00:56:00,521][03784] Fps is (10 sec: 19660.6, 60 sec: 43690.6, 300 sec: 45430.9). Total num frames: 366051328. Throughput: 0: 45742.0. Samples: 367266600. Policy #0 lag: (min: 0.0, avg: 57.0, max: 125.0) [2024-03-21 00:56:00,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 00:56:00,841][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011172_366084096.pth... [2024-03-21 00:56:00,895][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000010839_355172352.pth [2024-03-21 00:56:01,621][04017] Updated weights for policy 0, policy_version 11174 (0.0011) [2024-03-21 00:56:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 366280704. Throughput: 0: 46115.6. Samples: 367561900. Policy #0 lag: (min: 1.0, avg: 32.0, max: 65.0) [2024-03-21 00:56:05,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 00:56:10,521][03784] Fps is (10 sec: 32768.5, 60 sec: 42598.4, 300 sec: 45097.6). Total num frames: 366379008. Throughput: 0: 46564.4. Samples: 367872800. Policy #0 lag: (min: 1.0, avg: 32.0, max: 65.0) [2024-03-21 00:56:10,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 00:56:12,013][04017] Updated weights for policy 0, policy_version 11184 (0.0010) [2024-03-21 00:56:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 46097.4). Total num frames: 366673920. Throughput: 0: 46540.0. Samples: 368020700. 
Policy #0 lag: (min: 0.0, avg: 32.4, max: 89.0) [2024-03-21 00:56:15,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 00:56:16,767][04017] Updated weights for policy 0, policy_version 11194 (0.0014) [2024-03-21 00:56:20,235][03995] Signal inference workers to stop experience collection... (7400 times) [2024-03-21 00:56:20,237][03995] Signal inference workers to resume experience collection... (7400 times) [2024-03-21 00:56:20,299][04017] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-03-21 00:56:20,300][04017] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-03-21 00:56:20,521][03784] Fps is (10 sec: 62259.3, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 367001600. Throughput: 0: 46579.9. Samples: 368299900. Policy #0 lag: (min: 1.0, avg: 33.6, max: 75.0) [2024-03-21 00:56:20,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 00:56:21,511][04017] Updated weights for policy 0, policy_version 11204 (0.0011) [2024-03-21 00:56:25,521][03784] Fps is (10 sec: 62259.2, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 367296512. Throughput: 0: 47006.9. Samples: 368586700. Policy #0 lag: (min: 1.0, avg: 33.6, max: 75.0) [2024-03-21 00:56:25,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 00:56:29,158][04017] Updated weights for policy 0, policy_version 11214 (0.0011) [2024-03-21 00:56:30,521][03784] Fps is (10 sec: 58981.6, 60 sec: 48605.8, 300 sec: 46652.7). Total num frames: 367591424. Throughput: 0: 47719.8. Samples: 368738300. Policy #0 lag: (min: 1.0, avg: 37.5, max: 83.0) [2024-03-21 00:56:30,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 00:56:32,749][04017] Updated weights for policy 0, policy_version 11224 (0.0011) [2024-03-21 00:56:35,521][03784] Fps is (10 sec: 65536.3, 60 sec: 50790.3, 300 sec: 47319.2). Total num frames: 367951872. Throughput: 0: 47611.2. Samples: 368993000. 
Policy #0 lag: (min: 1.0, avg: 37.5, max: 83.0) [2024-03-21 00:56:35,522][03784] Avg episode reward: [(0, '1.113')] [2024-03-21 00:56:40,331][04017] Updated weights for policy 0, policy_version 11234 (0.0014) [2024-03-21 00:56:40,521][03784] Fps is (10 sec: 52429.7, 60 sec: 49698.3, 300 sec: 47097.1). Total num frames: 368115712. Throughput: 0: 48444.5. Samples: 369301400. Policy #0 lag: (min: 0.0, avg: 54.8, max: 115.0) [2024-03-21 00:56:40,522][03784] Avg episode reward: [(0, '0.842')] [2024-03-21 00:56:45,521][03784] Fps is (10 sec: 19660.7, 60 sec: 45329.0, 300 sec: 46208.4). Total num frames: 368148480. Throughput: 0: 48780.2. Samples: 369461700. Policy #0 lag: (min: 0.0, avg: 54.8, max: 115.0) [2024-03-21 00:56:45,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 00:56:49,918][04017] Updated weights for policy 0, policy_version 11244 (0.0012) [2024-03-21 00:56:50,521][03784] Fps is (10 sec: 36044.3, 60 sec: 43690.6, 300 sec: 46208.4). Total num frames: 368476160. Throughput: 0: 48826.5. Samples: 369759100. Policy #0 lag: (min: 0.0, avg: 36.8, max: 89.0) [2024-03-21 00:56:50,522][03784] Avg episode reward: [(0, '1.218')] [2024-03-21 00:56:55,521][03784] Fps is (10 sec: 58982.3, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 368738304. Throughput: 0: 47955.6. Samples: 370030800. Policy #0 lag: (min: 0.0, avg: 36.8, max: 89.0) [2024-03-21 00:56:55,522][03784] Avg episode reward: [(0, '0.535')] [2024-03-21 00:56:56,913][04017] Updated weights for policy 0, policy_version 11254 (0.0011) [2024-03-21 00:57:00,521][03784] Fps is (10 sec: 45875.9, 60 sec: 48059.9, 300 sec: 46319.5). Total num frames: 368934912. Throughput: 0: 47988.9. Samples: 370180200. Policy #0 lag: (min: 0.0, avg: 47.1, max: 94.0) [2024-03-21 00:57:00,523][03784] Avg episode reward: [(0, '0.375')] [2024-03-21 00:57:04,135][04017] Updated weights for policy 0, policy_version 11264 (0.0018) [2024-03-21 00:57:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.5, 300 sec: 46208.4). 
Total num frames: 369098752. Throughput: 0: 48213.3. Samples: 370469500. Policy #0 lag: (min: 0.0, avg: 47.1, max: 94.0) [2024-03-21 00:57:05,522][03784] Avg episode reward: [(0, '1.032')] [2024-03-21 00:57:09,944][03995] Signal inference workers to stop experience collection... (7450 times) [2024-03-21 00:57:09,994][04017] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-03-21 00:57:10,170][03995] Signal inference workers to resume experience collection... (7450 times) [2024-03-21 00:57:10,170][04017] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-03-21 00:57:10,461][04017] Updated weights for policy 0, policy_version 11274 (0.0010) [2024-03-21 00:57:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 46652.7). Total num frames: 369426432. Throughput: 0: 47631.1. Samples: 370730100. Policy #0 lag: (min: 0.0, avg: 49.4, max: 106.0) [2024-03-21 00:57:10,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 00:57:15,443][04017] Updated weights for policy 0, policy_version 11284 (0.0022) [2024-03-21 00:57:15,521][03784] Fps is (10 sec: 65536.1, 60 sec: 51336.5, 300 sec: 46763.9). Total num frames: 369754112. Throughput: 0: 47089.1. Samples: 370857300. Policy #0 lag: (min: 0.0, avg: 49.4, max: 106.0) [2024-03-21 00:57:15,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 00:57:20,521][03784] Fps is (10 sec: 55705.4, 60 sec: 49698.1, 300 sec: 46763.8). Total num frames: 369983488. Throughput: 0: 47671.0. Samples: 371138200. Policy #0 lag: (min: 0.0, avg: 39.7, max: 102.0) [2024-03-21 00:57:20,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 00:57:23,101][04017] Updated weights for policy 0, policy_version 11294 (0.0012) [2024-03-21 00:57:25,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 370212864. Throughput: 0: 47486.7. Samples: 371438300. 
Policy #0 lag: (min: 0.0, avg: 39.7, max: 102.0) [2024-03-21 00:57:25,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 00:57:28,212][04017] Updated weights for policy 0, policy_version 11304 (0.0015) [2024-03-21 00:57:30,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.6, 300 sec: 46652.7). Total num frames: 370409472. Throughput: 0: 47197.8. Samples: 371585600. Policy #0 lag: (min: 2.0, avg: 43.6, max: 87.0) [2024-03-21 00:57:30,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 00:57:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 46430.6). Total num frames: 370606080. Throughput: 0: 47275.7. Samples: 371886500. Policy #0 lag: (min: 0.0, avg: 39.7, max: 85.0) [2024-03-21 00:57:35,522][03784] Avg episode reward: [(0, '0.485')] [2024-03-21 00:57:37,664][04017] Updated weights for policy 0, policy_version 11314 (0.0011) [2024-03-21 00:57:40,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 370835456. Throughput: 0: 46884.4. Samples: 372140600. Policy #0 lag: (min: 0.0, avg: 39.7, max: 85.0) [2024-03-21 00:57:40,522][03784] Avg episode reward: [(0, '1.071')] [2024-03-21 00:57:44,792][04017] Updated weights for policy 0, policy_version 11324 (0.0018) [2024-03-21 00:57:45,521][03784] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 46541.7). Total num frames: 371130368. Throughput: 0: 46426.6. Samples: 372269400. Policy #0 lag: (min: 3.0, avg: 28.1, max: 61.0) [2024-03-21 00:57:45,522][03784] Avg episode reward: [(0, '1.086')] [2024-03-21 00:57:50,522][03784] Fps is (10 sec: 39319.7, 60 sec: 45874.9, 300 sec: 45986.2). Total num frames: 371228672. Throughput: 0: 45915.0. Samples: 372535700. Policy #0 lag: (min: 3.0, avg: 28.1, max: 61.0) [2024-03-21 00:57:50,522][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 00:57:53,932][04017] Updated weights for policy 0, policy_version 11334 (0.0015) [2024-03-21 00:57:55,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.5, 300 sec: 46763.8). 
Total num frames: 371556352. Throughput: 0: 45942.2. Samples: 372797500. Policy #0 lag: (min: 0.0, avg: 36.6, max: 95.0) [2024-03-21 00:57:55,522][03784] Avg episode reward: [(0, '0.995')] [2024-03-21 00:57:59,250][04017] Updated weights for policy 0, policy_version 11344 (0.0011) [2024-03-21 00:58:00,521][03784] Fps is (10 sec: 55708.4, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 371785728. Throughput: 0: 46282.2. Samples: 372940000. Policy #0 lag: (min: 0.0, avg: 36.6, max: 95.0) [2024-03-21 00:58:00,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 00:58:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011346_371785728.pth... [2024-03-21 00:58:00,652][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011007_360677376.pth [2024-03-21 00:58:05,521][03784] Fps is (10 sec: 49152.8, 60 sec: 49152.1, 300 sec: 46541.7). Total num frames: 372047872. Throughput: 0: 46669.1. Samples: 373238300. Policy #0 lag: (min: 0.0, avg: 45.6, max: 103.0) [2024-03-21 00:58:05,521][03784] Avg episode reward: [(0, '0.958')] [2024-03-21 00:58:05,555][04017] Updated weights for policy 0, policy_version 11354 (0.0014) [2024-03-21 00:58:10,419][03995] Signal inference workers to stop experience collection... (7500 times) [2024-03-21 00:58:10,427][03995] Signal inference workers to resume experience collection... (7500 times) [2024-03-21 00:58:10,454][04017] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-03-21 00:58:10,497][04017] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-03-21 00:58:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 372146176. Throughput: 0: 46479.9. Samples: 373529900. 
Policy #0 lag: (min: 0.0, avg: 45.6, max: 103.0) [2024-03-21 00:58:10,522][03784] Avg episode reward: [(0, '0.792')] [2024-03-21 00:58:13,619][04017] Updated weights for policy 0, policy_version 11364 (0.0015) [2024-03-21 00:58:15,521][03784] Fps is (10 sec: 32767.3, 60 sec: 43690.6, 300 sec: 45986.3). Total num frames: 372375552. Throughput: 0: 45824.4. Samples: 373647700. Policy #0 lag: (min: 1.0, avg: 31.6, max: 57.0) [2024-03-21 00:58:15,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 00:58:20,174][04017] Updated weights for policy 0, policy_version 11374 (0.0022) [2024-03-21 00:58:20,521][03784] Fps is (10 sec: 58982.5, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 372736000. Throughput: 0: 44946.6. Samples: 373909100. Policy #0 lag: (min: 1.0, avg: 31.6, max: 57.0) [2024-03-21 00:58:20,522][03784] Avg episode reward: [(0, '0.614')] [2024-03-21 00:58:25,521][03784] Fps is (10 sec: 55705.6, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 372932608. Throughput: 0: 45484.5. Samples: 374187400. Policy #0 lag: (min: 0.0, avg: 42.1, max: 82.0) [2024-03-21 00:58:25,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-21 00:58:29,508][04017] Updated weights for policy 0, policy_version 11384 (0.0011) [2024-03-21 00:58:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 373096448. Throughput: 0: 45640.0. Samples: 374323200. Policy #0 lag: (min: 0.0, avg: 42.1, max: 82.0) [2024-03-21 00:58:30,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-21 00:58:32,880][04017] Updated weights for policy 0, policy_version 11394 (0.0015) [2024-03-21 00:58:35,521][03784] Fps is (10 sec: 62259.2, 60 sec: 49151.9, 300 sec: 46763.8). Total num frames: 373555200. Throughput: 0: 44644.9. Samples: 374544700. Policy #0 lag: (min: 3.0, avg: 29.9, max: 70.0) [2024-03-21 00:58:35,522][03784] Avg episode reward: [(0, '0.643')] [2024-03-21 00:58:40,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46967.5, 300 sec: 46652.8). 
Total num frames: 373653504. Throughput: 0: 44919.9. Samples: 374818900. Policy #0 lag: (min: 3.0, avg: 29.9, max: 70.0) [2024-03-21 00:58:40,522][03784] Avg episode reward: [(0, '0.980')] [2024-03-21 00:58:43,416][04017] Updated weights for policy 0, policy_version 11404 (0.0011) [2024-03-21 00:58:45,521][03784] Fps is (10 sec: 19660.8, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 373751808. Throughput: 0: 44811.1. Samples: 374956500. Policy #0 lag: (min: 0.0, avg: 43.2, max: 90.0) [2024-03-21 00:58:45,522][03784] Avg episode reward: [(0, '0.477')] [2024-03-21 00:58:50,014][04017] Updated weights for policy 0, policy_version 11414 (0.0012) [2024-03-21 00:58:50,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46421.7, 300 sec: 45653.0). Total num frames: 374013952. Throughput: 0: 44473.2. Samples: 375239600. Policy #0 lag: (min: 0.0, avg: 43.2, max: 90.0) [2024-03-21 00:58:50,522][03784] Avg episode reward: [(0, '1.048')] [2024-03-21 00:58:55,521][03784] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 45653.0). Total num frames: 374145024. Throughput: 0: 44086.7. Samples: 375513800. Policy #0 lag: (min: 3.0, avg: 32.5, max: 69.0) [2024-03-21 00:58:55,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 00:58:59,516][04017] Updated weights for policy 0, policy_version 11424 (0.0012) [2024-03-21 00:59:00,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 45542.0). Total num frames: 374439936. Throughput: 0: 44531.2. Samples: 375651600. Policy #0 lag: (min: 3.0, avg: 32.5, max: 69.0) [2024-03-21 00:59:00,522][03784] Avg episode reward: [(0, '1.147')] [2024-03-21 00:59:00,726][03995] Signal inference workers to stop experience collection... (7550 times) [2024-03-21 00:59:00,795][04017] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-03-21 00:59:00,855][03995] Signal inference workers to resume experience collection... 
(7550 times) [2024-03-21 00:59:00,856][04017] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-03-21 00:59:05,521][03784] Fps is (10 sec: 49152.1, 60 sec: 43144.4, 300 sec: 45875.2). Total num frames: 374636544. Throughput: 0: 44400.0. Samples: 375907100. Policy #0 lag: (min: 3.0, avg: 36.2, max: 103.0) [2024-03-21 00:59:05,522][03784] Avg episode reward: [(0, '0.589')] [2024-03-21 00:59:05,638][04017] Updated weights for policy 0, policy_version 11434 (0.0017) [2024-03-21 00:59:10,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 374931456. Throughput: 0: 44073.3. Samples: 376170700. Policy #0 lag: (min: 3.0, avg: 36.2, max: 103.0) [2024-03-21 00:59:10,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 00:59:14,370][04017] Updated weights for policy 0, policy_version 11444 (0.0011) [2024-03-21 00:59:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 45986.3). Total num frames: 375062528. Throughput: 0: 44586.8. Samples: 376329600. Policy #0 lag: (min: 0.0, avg: 30.9, max: 111.0) [2024-03-21 00:59:15,522][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 00:59:18,335][04017] Updated weights for policy 0, policy_version 11454 (0.0020) [2024-03-21 00:59:20,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 375455744. Throughput: 0: 45795.6. Samples: 376605500. Policy #0 lag: (min: 0.0, avg: 30.9, max: 111.0) [2024-03-21 00:59:20,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 00:59:23,467][04017] Updated weights for policy 0, policy_version 11464 (0.0017) [2024-03-21 00:59:25,521][03784] Fps is (10 sec: 72089.1, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 375783424. Throughput: 0: 45708.9. Samples: 376875800. Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 00:59:25,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 00:59:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.4, 300 sec: 45653.1). 
Total num frames: 375881728. Throughput: 0: 45980.0. Samples: 377025600. Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 00:59:30,522][03784] Avg episode reward: [(0, '0.789')] [2024-03-21 00:59:33,603][04017] Updated weights for policy 0, policy_version 11474 (0.0019) [2024-03-21 00:59:35,521][03784] Fps is (10 sec: 22937.7, 60 sec: 40960.0, 300 sec: 45653.1). Total num frames: 376012800. Throughput: 0: 46508.9. Samples: 377332500. Policy #0 lag: (min: 0.0, avg: 30.0, max: 67.0) [2024-03-21 00:59:35,522][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 00:59:40,521][03784] Fps is (10 sec: 29491.0, 60 sec: 42052.2, 300 sec: 45542.0). Total num frames: 376176640. Throughput: 0: 46753.2. Samples: 377617700. Policy #0 lag: (min: 0.0, avg: 30.0, max: 67.0) [2024-03-21 00:59:40,522][03784] Avg episode reward: [(0, '1.079')] [2024-03-21 00:59:42,888][04017] Updated weights for policy 0, policy_version 11484 (0.0018) [2024-03-21 00:59:45,521][03784] Fps is (10 sec: 32767.3, 60 sec: 43144.4, 300 sec: 45319.8). Total num frames: 376340480. Throughput: 0: 46535.3. Samples: 377745700. Policy #0 lag: (min: 0.0, avg: 34.8, max: 78.0) [2024-03-21 00:59:45,522][03784] Avg episode reward: [(0, '0.453')] [2024-03-21 00:59:49,912][04017] Updated weights for policy 0, policy_version 11494 (0.0017) [2024-03-21 00:59:50,521][03784] Fps is (10 sec: 49152.4, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 376668160. Throughput: 0: 47077.7. Samples: 378025600. Policy #0 lag: (min: 0.0, avg: 34.8, max: 78.0) [2024-03-21 00:59:50,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 00:59:53,648][04017] Updated weights for policy 0, policy_version 11504 (0.0024) [2024-03-21 00:59:53,832][03995] Signal inference workers to stop experience collection... (7600 times) [2024-03-21 00:59:53,914][04017] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-03-21 00:59:53,956][03995] Signal inference workers to resume experience collection... 
(7600 times) [2024-03-21 00:59:53,961][04017] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-03-21 00:59:55,521][03784] Fps is (10 sec: 78644.9, 60 sec: 49698.1, 300 sec: 46430.6). Total num frames: 377126912. Throughput: 0: 47244.5. Samples: 378296700. Policy #0 lag: (min: 0.0, avg: 35.9, max: 101.0) [2024-03-21 00:59:55,522][03784] Avg episode reward: [(0, '0.892')] [2024-03-21 00:59:58,689][04017] Updated weights for policy 0, policy_version 11514 (0.0016) [2024-03-21 01:00:00,521][03784] Fps is (10 sec: 72089.9, 60 sec: 49152.0, 300 sec: 47097.1). Total num frames: 377389056. Throughput: 0: 46948.9. Samples: 378442300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 101.0) [2024-03-21 01:00:00,522][03784] Avg episode reward: [(0, '0.401')] [2024-03-21 01:00:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011517_377389056.pth... [2024-03-21 01:00:00,674][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011172_366084096.pth [2024-03-21 01:00:04,795][04017] Updated weights for policy 0, policy_version 11524 (0.0015) [2024-03-21 01:00:05,521][03784] Fps is (10 sec: 52428.3, 60 sec: 50244.2, 300 sec: 46874.9). Total num frames: 377651200. Throughput: 0: 47033.3. Samples: 378722000. Policy #0 lag: (min: 0.0, avg: 40.9, max: 82.0) [2024-03-21 01:00:05,522][03784] Avg episode reward: [(0, '0.677')] [2024-03-21 01:00:10,521][03784] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 47097.0). Total num frames: 377880576. Throughput: 0: 47384.4. Samples: 379008100. Policy #0 lag: (min: 0.0, avg: 40.9, max: 82.0) [2024-03-21 01:00:10,522][03784] Avg episode reward: [(0, '1.132')] [2024-03-21 01:00:15,521][03784] Fps is (10 sec: 26214.6, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 377913344. Throughput: 0: 47669.0. Samples: 379170700. 
Policy #0 lag: (min: 0.0, avg: 36.8, max: 70.0) [2024-03-21 01:00:15,522][03784] Avg episode reward: [(0, '0.728')] [2024-03-21 01:00:15,610][04017] Updated weights for policy 0, policy_version 11534 (0.0015) [2024-03-21 01:00:20,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 378175488. Throughput: 0: 47037.7. Samples: 379449200. Policy #0 lag: (min: 0.0, avg: 36.8, max: 70.0) [2024-03-21 01:00:20,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 01:00:24,112][04017] Updated weights for policy 0, policy_version 11544 (0.0010) [2024-03-21 01:00:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 46430.6). Total num frames: 378372096. Throughput: 0: 46922.3. Samples: 379729200. Policy #0 lag: (min: 1.0, avg: 41.2, max: 84.0) [2024-03-21 01:00:25,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 01:00:30,391][04017] Updated weights for policy 0, policy_version 11554 (0.0013) [2024-03-21 01:00:30,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 378601472. Throughput: 0: 47260.1. Samples: 379872400. Policy #0 lag: (min: 1.0, avg: 41.2, max: 84.0) [2024-03-21 01:00:30,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 01:00:35,521][03784] Fps is (10 sec: 52428.5, 60 sec: 48059.7, 300 sec: 46652.8). Total num frames: 378896384. Throughput: 0: 47133.3. Samples: 380146600. Policy #0 lag: (min: 0.0, avg: 44.4, max: 111.0) [2024-03-21 01:00:35,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 01:00:35,608][04017] Updated weights for policy 0, policy_version 11564 (0.0011) [2024-03-21 01:00:40,521][03784] Fps is (10 sec: 52428.5, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 379125760. Throughput: 0: 47537.6. Samples: 380435900. 
Policy #0 lag: (min: 0.0, avg: 44.4, max: 111.0) [2024-03-21 01:00:40,522][03784] Avg episode reward: [(0, '0.497')] [2024-03-21 01:00:42,348][04017] Updated weights for policy 0, policy_version 11574 (0.0011) [2024-03-21 01:00:45,521][03784] Fps is (10 sec: 58982.7, 60 sec: 52429.0, 300 sec: 46208.4). Total num frames: 379486208. Throughput: 0: 47500.0. Samples: 380579800. Policy #0 lag: (min: 2.0, avg: 56.9, max: 119.0) [2024-03-21 01:00:45,522][03784] Avg episode reward: [(0, '1.078')] [2024-03-21 01:00:48,323][04017] Updated weights for policy 0, policy_version 11584 (0.0016) [2024-03-21 01:00:50,521][03784] Fps is (10 sec: 45876.0, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 379584512. Throughput: 0: 47686.8. Samples: 380867900. Policy #0 lag: (min: 2.0, avg: 56.9, max: 119.0) [2024-03-21 01:00:50,522][03784] Avg episode reward: [(0, '0.605')] [2024-03-21 01:00:54,194][03995] Signal inference workers to stop experience collection... (7650 times) [2024-03-21 01:00:54,282][04017] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-03-21 01:00:54,416][03995] Signal inference workers to resume experience collection... (7650 times) [2024-03-21 01:00:54,417][04017] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-03-21 01:00:55,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44236.7, 300 sec: 46541.7). Total num frames: 379781120. Throughput: 0: 47557.8. Samples: 381148200. Policy #0 lag: (min: 0.0, avg: 30.5, max: 94.0) [2024-03-21 01:00:55,522][03784] Avg episode reward: [(0, '1.222')] [2024-03-21 01:00:57,246][04017] Updated weights for policy 0, policy_version 11594 (0.0015) [2024-03-21 01:01:00,521][03784] Fps is (10 sec: 49151.5, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 380076032. Throughput: 0: 46537.7. Samples: 381264900. 
Policy #0 lag: (min: 0.0, avg: 30.5, max: 94.0) [2024-03-21 01:01:00,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 01:01:03,427][04017] Updated weights for policy 0, policy_version 11604 (0.0011) [2024-03-21 01:01:05,521][03784] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 46986.0). Total num frames: 380239872. Throughput: 0: 46746.8. Samples: 381552800. Policy #0 lag: (min: 0.0, avg: 51.8, max: 118.0) [2024-03-21 01:01:05,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 01:01:10,521][03784] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 46763.8). Total num frames: 380469248. Throughput: 0: 46571.0. Samples: 381824900. Policy #0 lag: (min: 0.0, avg: 51.8, max: 118.0) [2024-03-21 01:01:10,522][03784] Avg episode reward: [(0, '0.506')] [2024-03-21 01:01:11,499][04017] Updated weights for policy 0, policy_version 11614 (0.0014) [2024-03-21 01:01:15,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48059.7, 300 sec: 46763.8). Total num frames: 380796928. Throughput: 0: 46035.6. Samples: 381944000. Policy #0 lag: (min: 4.0, avg: 54.8, max: 122.0) [2024-03-21 01:01:15,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 01:01:18,824][04017] Updated weights for policy 0, policy_version 11624 (0.0018) [2024-03-21 01:01:20,521][03784] Fps is (10 sec: 45875.9, 60 sec: 45875.3, 300 sec: 46208.4). Total num frames: 380928000. Throughput: 0: 46424.5. Samples: 382235700. Policy #0 lag: (min: 4.0, avg: 54.8, max: 122.0) [2024-03-21 01:01:20,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-21 01:01:24,174][04017] Updated weights for policy 0, policy_version 11634 (0.0039) [2024-03-21 01:01:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 381255680. Throughput: 0: 45969.0. Samples: 382504500. 
Policy #0 lag: (min: 1.0, avg: 38.3, max: 87.0) [2024-03-21 01:01:25,522][03784] Avg episode reward: [(0, '0.797')] [2024-03-21 01:01:30,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 381517824. Throughput: 0: 45802.2. Samples: 382640900. Policy #0 lag: (min: 1.0, avg: 38.3, max: 87.0) [2024-03-21 01:01:30,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 01:01:31,787][04017] Updated weights for policy 0, policy_version 11644 (0.0014) [2024-03-21 01:01:35,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 381616128. Throughput: 0: 45648.8. Samples: 382922100. Policy #0 lag: (min: 0.0, avg: 43.2, max: 83.0) [2024-03-21 01:01:35,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 01:01:39,487][04017] Updated weights for policy 0, policy_version 11654 (0.0011) [2024-03-21 01:01:40,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46421.4, 300 sec: 46652.7). Total num frames: 381911040. Throughput: 0: 45613.3. Samples: 383200800. Policy #0 lag: (min: 0.0, avg: 43.2, max: 83.0) [2024-03-21 01:01:40,522][03784] Avg episode reward: [(0, '0.758')] [2024-03-21 01:01:43,191][03995] Signal inference workers to stop experience collection... (7700 times) [2024-03-21 01:01:43,245][04017] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-03-21 01:01:43,307][03995] Signal inference workers to resume experience collection... (7700 times) [2024-03-21 01:01:43,308][04017] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-03-21 01:01:43,629][04017] Updated weights for policy 0, policy_version 11664 (0.0012) [2024-03-21 01:01:45,521][03784] Fps is (10 sec: 58982.9, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 382205952. Throughput: 0: 45808.9. Samples: 383326300. 
Policy #0 lag: (min: 2.0, avg: 43.3, max: 92.0) [2024-03-21 01:01:45,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 01:01:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.1, 300 sec: 46097.3). Total num frames: 382337024. Throughput: 0: 45675.5. Samples: 383608200. Policy #0 lag: (min: 2.0, avg: 43.3, max: 92.0) [2024-03-21 01:01:50,522][03784] Avg episode reward: [(0, '0.822')] [2024-03-21 01:01:55,521][03784] Fps is (10 sec: 29490.8, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 382500864. Throughput: 0: 45977.8. Samples: 383893900. Policy #0 lag: (min: 0.0, avg: 35.4, max: 76.0) [2024-03-21 01:01:55,522][03784] Avg episode reward: [(0, '0.899')] [2024-03-21 01:01:55,865][04017] Updated weights for policy 0, policy_version 11674 (0.0019) [2024-03-21 01:02:00,521][03784] Fps is (10 sec: 42598.7, 60 sec: 44783.0, 300 sec: 46319.5). Total num frames: 382763008. Throughput: 0: 46246.7. Samples: 384025100. Policy #0 lag: (min: 0.0, avg: 35.4, max: 76.0) [2024-03-21 01:02:00,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 01:02:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011681_382763008.pth... [2024-03-21 01:02:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011346_371785728.pth [2024-03-21 01:02:03,982][04017] Updated weights for policy 0, policy_version 11684 (0.0012) [2024-03-21 01:02:05,521][03784] Fps is (10 sec: 49152.4, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 382992384. Throughput: 0: 46011.0. Samples: 384306200. Policy #0 lag: (min: 1.0, avg: 32.9, max: 88.0) [2024-03-21 01:02:05,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 01:02:07,257][04017] Updated weights for policy 0, policy_version 11694 (0.0011) [2024-03-21 01:02:10,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.7, 300 sec: 45986.3). Total num frames: 383320064. Throughput: 0: 45913.4. Samples: 384570600. 
Policy #0 lag: (min: 1.0, avg: 32.9, max: 88.0) [2024-03-21 01:02:10,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 01:02:14,845][04017] Updated weights for policy 0, policy_version 11704 (0.0015) [2024-03-21 01:02:15,521][03784] Fps is (10 sec: 58982.9, 60 sec: 46421.4, 300 sec: 46097.4). Total num frames: 383582208. Throughput: 0: 46180.0. Samples: 384719000. Policy #0 lag: (min: 0.0, avg: 36.1, max: 70.0) [2024-03-21 01:02:15,522][03784] Avg episode reward: [(0, '0.437')] [2024-03-21 01:02:20,521][03784] Fps is (10 sec: 49151.4, 60 sec: 48059.6, 300 sec: 46097.3). Total num frames: 383811584. Throughput: 0: 45715.5. Samples: 384979300. Policy #0 lag: (min: 1.0, avg: 40.2, max: 83.0) [2024-03-21 01:02:20,522][03784] Avg episode reward: [(0, '0.673')] [2024-03-21 01:02:20,800][04017] Updated weights for policy 0, policy_version 11714 (0.0012) [2024-03-21 01:02:25,521][03784] Fps is (10 sec: 49152.2, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 384073728. Throughput: 0: 45935.7. Samples: 385267900. Policy #0 lag: (min: 1.0, avg: 40.2, max: 83.0) [2024-03-21 01:02:25,522][03784] Avg episode reward: [(0, '0.673')] [2024-03-21 01:02:28,714][04017] Updated weights for policy 0, policy_version 11724 (0.0024) [2024-03-21 01:02:30,521][03784] Fps is (10 sec: 42599.0, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 384237568. Throughput: 0: 46402.3. Samples: 385414400. Policy #0 lag: (min: 1.0, avg: 40.1, max: 81.0) [2024-03-21 01:02:30,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 01:02:35,521][03784] Fps is (10 sec: 32767.6, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 384401408. Throughput: 0: 46400.0. Samples: 385696200. Policy #0 lag: (min: 1.0, avg: 40.1, max: 81.0) [2024-03-21 01:02:35,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-21 01:02:38,036][04017] Updated weights for policy 0, policy_version 11734 (0.0015) [2024-03-21 01:02:40,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45875.2, 300 sec: 45875.2). 
Total num frames: 384663552. Throughput: 0: 46384.6. Samples: 385981200. Policy #0 lag: (min: 0.0, avg: 35.2, max: 86.0) [2024-03-21 01:02:40,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 01:02:42,669][03995] Signal inference workers to stop experience collection... (7750 times) [2024-03-21 01:02:42,747][03995] Signal inference workers to resume experience collection... (7750 times) [2024-03-21 01:02:42,751][04017] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-03-21 01:02:42,753][04017] Updated weights for policy 0, policy_version 11744 (0.0020) [2024-03-21 01:02:42,794][04017] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-03-21 01:02:45,521][03784] Fps is (10 sec: 62259.6, 60 sec: 46967.5, 300 sec: 46763.9). Total num frames: 385024000. Throughput: 0: 46311.1. Samples: 386109100. Policy #0 lag: (min: 0.0, avg: 35.2, max: 86.0) [2024-03-21 01:02:45,522][03784] Avg episode reward: [(0, '1.182')] [2024-03-21 01:02:47,743][04017] Updated weights for policy 0, policy_version 11754 (0.0016) [2024-03-21 01:02:50,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 385220608. Throughput: 0: 46282.2. Samples: 386388900. Policy #0 lag: (min: 4.0, avg: 41.1, max: 86.0) [2024-03-21 01:02:50,522][03784] Avg episode reward: [(0, '1.421')] [2024-03-21 01:02:55,312][04017] Updated weights for policy 0, policy_version 11764 (0.0021) [2024-03-21 01:02:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49698.2, 300 sec: 46430.6). Total num frames: 385482752. Throughput: 0: 46986.6. Samples: 386685000. Policy #0 lag: (min: 4.0, avg: 41.1, max: 86.0) [2024-03-21 01:02:55,522][03784] Avg episode reward: [(0, '1.421')] [2024-03-21 01:03:00,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 385679360. Throughput: 0: 46864.4. Samples: 386827900. 
Policy #0 lag: (min: 0.0, avg: 36.4, max: 78.0) [2024-03-21 01:03:00,522][03784] Avg episode reward: [(0, '0.884')] [2024-03-21 01:03:05,409][04017] Updated weights for policy 0, policy_version 11774 (0.0015) [2024-03-21 01:03:05,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 385810432. Throughput: 0: 47257.8. Samples: 387105900. Policy #0 lag: (min: 0.0, avg: 36.4, max: 78.0) [2024-03-21 01:03:05,522][03784] Avg episode reward: [(0, '1.255')] [2024-03-21 01:03:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 386039808. Throughput: 0: 46526.6. Samples: 387361600. Policy #0 lag: (min: 0.0, avg: 34.5, max: 72.0) [2024-03-21 01:03:10,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 01:03:11,835][04017] Updated weights for policy 0, policy_version 11784 (0.0014) [2024-03-21 01:03:15,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 386301952. Throughput: 0: 46573.3. Samples: 387510200. Policy #0 lag: (min: 0.0, avg: 34.5, max: 72.0) [2024-03-21 01:03:15,522][03784] Avg episode reward: [(0, '0.499')] [2024-03-21 01:03:18,848][04017] Updated weights for policy 0, policy_version 11794 (0.0012) [2024-03-21 01:03:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 45986.3). Total num frames: 386498560. Throughput: 0: 46775.6. Samples: 387801100. Policy #0 lag: (min: 0.0, avg: 47.9, max: 115.0) [2024-03-21 01:03:20,522][03784] Avg episode reward: [(0, '0.499')] [2024-03-21 01:03:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 386695168. Throughput: 0: 47166.7. Samples: 388103700. Policy #0 lag: (min: 0.0, avg: 47.9, max: 115.0) [2024-03-21 01:03:25,522][03784] Avg episode reward: [(0, '0.717')] [2024-03-21 01:03:26,265][04017] Updated weights for policy 0, policy_version 11804 (0.0015) [2024-03-21 01:03:30,521][03784] Fps is (10 sec: 55705.4, 60 sec: 46967.4, 300 sec: 45764.1). 
Total num frames: 387055616. Throughput: 0: 47306.6. Samples: 388237900. Policy #0 lag: (min: 1.0, avg: 34.3, max: 71.0) [2024-03-21 01:03:30,522][03784] Avg episode reward: [(0, '0.717')] [2024-03-21 01:03:30,902][04017] Updated weights for policy 0, policy_version 11814 (0.0016) [2024-03-21 01:03:33,204][03995] Signal inference workers to stop experience collection... (7800 times) [2024-03-21 01:03:33,204][03995] Signal inference workers to resume experience collection... (7800 times) [2024-03-21 01:03:33,265][04017] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-03-21 01:03:33,265][04017] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-03-21 01:03:35,521][03784] Fps is (10 sec: 65535.5, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 387350528. Throughput: 0: 46553.4. Samples: 388483800. Policy #0 lag: (min: 1.0, avg: 34.3, max: 71.0) [2024-03-21 01:03:35,530][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 01:03:37,989][04017] Updated weights for policy 0, policy_version 11824 (0.0019) [2024-03-21 01:03:40,521][03784] Fps is (10 sec: 55706.0, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 387612672. Throughput: 0: 45711.2. Samples: 388742000. Policy #0 lag: (min: 1.0, avg: 50.5, max: 96.0) [2024-03-21 01:03:40,530][03784] Avg episode reward: [(0, '0.632')] [2024-03-21 01:03:44,986][04017] Updated weights for policy 0, policy_version 11834 (0.0011) [2024-03-21 01:03:45,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 387809280. Throughput: 0: 45773.3. Samples: 388887700. Policy #0 lag: (min: 1.0, avg: 50.5, max: 96.0) [2024-03-21 01:03:45,530][03784] Avg episode reward: [(0, '0.659')] [2024-03-21 01:03:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45875.3, 300 sec: 46874.9). Total num frames: 387973120. Throughput: 0: 45995.6. Samples: 389175700. 
Policy #0 lag: (min: 3.0, avg: 40.9, max: 72.0) [2024-03-21 01:03:50,522][03784] Avg episode reward: [(0, '0.493')] [2024-03-21 01:03:55,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43144.6, 300 sec: 46208.4). Total num frames: 388071424. Throughput: 0: 46640.0. Samples: 389460400. Policy #0 lag: (min: 3.0, avg: 40.9, max: 72.0) [2024-03-21 01:03:55,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 01:03:56,407][04017] Updated weights for policy 0, policy_version 11844 (0.0016) [2024-03-21 01:04:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44236.7, 300 sec: 46430.6). Total num frames: 388333568. Throughput: 0: 46324.4. Samples: 389594800. Policy #0 lag: (min: 0.0, avg: 29.6, max: 65.0) [2024-03-21 01:04:00,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 01:04:00,902][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011852_388366336.pth... [2024-03-21 01:04:01,031][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011517_377389056.pth [2024-03-21 01:04:01,642][04017] Updated weights for policy 0, policy_version 11854 (0.0016) [2024-03-21 01:04:05,521][03784] Fps is (10 sec: 55705.5, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 388628480. Throughput: 0: 46015.6. Samples: 389871800. Policy #0 lag: (min: 0.0, avg: 29.6, max: 65.0) [2024-03-21 01:04:05,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 01:04:08,805][04017] Updated weights for policy 0, policy_version 11864 (0.0010) [2024-03-21 01:04:10,521][03784] Fps is (10 sec: 55705.9, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 388890624. Throughput: 0: 45697.7. Samples: 390160100. Policy #0 lag: (min: 0.0, avg: 31.7, max: 76.0) [2024-03-21 01:04:10,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 01:04:14,276][04017] Updated weights for policy 0, policy_version 11874 (0.0014) [2024-03-21 01:04:15,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46967.4, 300 sec: 46319.5). 
Total num frames: 389120000. Throughput: 0: 46011.1. Samples: 390308400. Policy #0 lag: (min: 0.0, avg: 31.7, max: 76.0) [2024-03-21 01:04:15,522][03784] Avg episode reward: [(0, '0.775')] [2024-03-21 01:04:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 389349376. Throughput: 0: 46724.4. Samples: 390586400. Policy #0 lag: (min: 0.0, avg: 31.1, max: 58.0) [2024-03-21 01:04:20,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 01:04:22,865][04017] Updated weights for policy 0, policy_version 11884 (0.0012) [2024-03-21 01:04:25,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45875.0, 300 sec: 45986.3). Total num frames: 389447680. Throughput: 0: 47077.6. Samples: 390860500. Policy #0 lag: (min: 0.0, avg: 37.8, max: 111.0) [2024-03-21 01:04:25,522][03784] Avg episode reward: [(0, '1.038')] [2024-03-21 01:04:27,815][03995] Signal inference workers to stop experience collection... (7850 times) [2024-03-21 01:04:27,889][03995] Signal inference workers to resume experience collection... (7850 times) [2024-03-21 01:04:27,915][04017] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-03-21 01:04:27,955][04017] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-03-21 01:04:28,609][04017] Updated weights for policy 0, policy_version 11894 (0.0024) [2024-03-21 01:04:30,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 46652.7). Total num frames: 389775360. Throughput: 0: 46291.1. Samples: 390970800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 111.0) [2024-03-21 01:04:30,522][03784] Avg episode reward: [(0, '0.915')] [2024-03-21 01:04:35,521][03784] Fps is (10 sec: 58983.1, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 390037504. Throughput: 0: 45486.6. Samples: 391222600. 
Policy #0 lag: (min: 0.0, avg: 40.7, max: 75.0) [2024-03-21 01:04:35,522][03784] Avg episode reward: [(0, '0.966')] [2024-03-21 01:04:35,790][04017] Updated weights for policy 0, policy_version 11904 (0.0015) [2024-03-21 01:04:40,521][03784] Fps is (10 sec: 49152.0, 60 sec: 44236.8, 300 sec: 47208.2). Total num frames: 390266880. Throughput: 0: 45144.5. Samples: 391491900. Policy #0 lag: (min: 0.0, avg: 40.7, max: 75.0) [2024-03-21 01:04:40,522][03784] Avg episode reward: [(0, '0.543')] [2024-03-21 01:04:43,742][04017] Updated weights for policy 0, policy_version 11914 (0.0011) [2024-03-21 01:04:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 46652.7). Total num frames: 390430720. Throughput: 0: 45304.5. Samples: 391633500. Policy #0 lag: (min: 0.0, avg: 38.6, max: 119.0) [2024-03-21 01:04:45,522][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 01:04:50,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 390627328. Throughput: 0: 45557.8. Samples: 391921900. Policy #0 lag: (min: 0.0, avg: 38.6, max: 119.0) [2024-03-21 01:04:50,521][03784] Avg episode reward: [(0, '1.045')] [2024-03-21 01:04:51,304][04017] Updated weights for policy 0, policy_version 11924 (0.0011) [2024-03-21 01:04:55,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.3, 300 sec: 45764.1). Total num frames: 390889472. Throughput: 0: 45171.0. Samples: 392192800. Policy #0 lag: (min: 0.0, avg: 47.4, max: 105.0) [2024-03-21 01:04:55,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 01:04:57,289][04017] Updated weights for policy 0, policy_version 11934 (0.0012) [2024-03-21 01:05:00,521][03784] Fps is (10 sec: 58982.5, 60 sec: 48059.8, 300 sec: 45986.3). Total num frames: 391217152. Throughput: 0: 44613.5. Samples: 392316000. 
Policy #0 lag: (min: 0.0, avg: 47.4, max: 105.0) [2024-03-21 01:05:00,522][03784] Avg episode reward: [(0, '0.586')] [2024-03-21 01:05:04,433][04017] Updated weights for policy 0, policy_version 11944 (0.0011) [2024-03-21 01:05:05,521][03784] Fps is (10 sec: 58982.6, 60 sec: 47513.5, 300 sec: 46097.3). Total num frames: 391479296. Throughput: 0: 44286.6. Samples: 392579300. Policy #0 lag: (min: 0.0, avg: 31.3, max: 67.0) [2024-03-21 01:05:05,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 01:05:10,521][03784] Fps is (10 sec: 45874.7, 60 sec: 46421.3, 300 sec: 46652.7). Total num frames: 391675904. Throughput: 0: 44400.1. Samples: 392858500. Policy #0 lag: (min: 0.0, avg: 31.3, max: 67.0) [2024-03-21 01:05:10,522][03784] Avg episode reward: [(0, '1.021')] [2024-03-21 01:05:12,432][04017] Updated weights for policy 0, policy_version 11954 (0.0018) [2024-03-21 01:05:15,521][03784] Fps is (10 sec: 42599.3, 60 sec: 46421.5, 300 sec: 46541.7). Total num frames: 391905280. Throughput: 0: 45135.6. Samples: 393001900. Policy #0 lag: (min: 1.0, avg: 25.5, max: 56.0) [2024-03-21 01:05:15,521][03784] Avg episode reward: [(0, '0.482')] [2024-03-21 01:05:17,219][04017] Updated weights for policy 0, policy_version 11964 (0.0014) [2024-03-21 01:05:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 392069120. Throughput: 0: 45824.5. Samples: 393284700. Policy #0 lag: (min: 1.0, avg: 25.5, max: 56.0) [2024-03-21 01:05:20,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 01:05:25,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.7, 300 sec: 46319.5). Total num frames: 392265728. Throughput: 0: 45942.3. Samples: 393559300. Policy #0 lag: (min: 0.0, avg: 40.1, max: 117.0) [2024-03-21 01:05:25,521][03784] Avg episode reward: [(0, '1.072')] [2024-03-21 01:05:28,475][04017] Updated weights for policy 0, policy_version 11974 (0.0020) [2024-03-21 01:05:30,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 45764.1). 
Total num frames: 392396800. Throughput: 0: 46069.0. Samples: 393706600. Policy #0 lag: (min: 0.0, avg: 40.1, max: 117.0) [2024-03-21 01:05:30,522][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 01:05:33,525][03995] Signal inference workers to stop experience collection... (7900 times) [2024-03-21 01:05:33,606][04017] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-03-21 01:05:33,785][03995] Signal inference workers to resume experience collection... (7900 times) [2024-03-21 01:05:33,786][04017] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-03-21 01:05:35,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43144.6, 300 sec: 45764.1). Total num frames: 392626176. Throughput: 0: 45860.0. Samples: 393985600. Policy #0 lag: (min: 0.0, avg: 32.9, max: 88.0) [2024-03-21 01:05:35,522][03784] Avg episode reward: [(0, '0.368')] [2024-03-21 01:05:36,592][04017] Updated weights for policy 0, policy_version 11984 (0.0022) [2024-03-21 01:05:40,521][03784] Fps is (10 sec: 52428.2, 60 sec: 44236.7, 300 sec: 45542.0). Total num frames: 392921088. Throughput: 0: 46031.2. Samples: 394264200. Policy #0 lag: (min: 0.0, avg: 32.9, max: 88.0) [2024-03-21 01:05:40,522][03784] Avg episode reward: [(0, '0.548')] [2024-03-21 01:05:43,093][04017] Updated weights for policy 0, policy_version 11994 (0.0012) [2024-03-21 01:05:45,521][03784] Fps is (10 sec: 49151.4, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 393117696. Throughput: 0: 46384.3. Samples: 394403300. Policy #0 lag: (min: 0.0, avg: 32.8, max: 72.0) [2024-03-21 01:05:45,522][03784] Avg episode reward: [(0, '0.548')] [2024-03-21 01:05:48,402][04017] Updated weights for policy 0, policy_version 12004 (0.0011) [2024-03-21 01:05:50,521][03784] Fps is (10 sec: 52428.5, 60 sec: 46967.3, 300 sec: 46319.5). Total num frames: 393445376. Throughput: 0: 46988.8. Samples: 394693800. 
Policy #0 lag: (min: 0.0, avg: 32.8, max: 72.0) [2024-03-21 01:05:50,522][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 01:05:54,880][04017] Updated weights for policy 0, policy_version 12014 (0.0020) [2024-03-21 01:05:55,521][03784] Fps is (10 sec: 58982.5, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 393707520. Throughput: 0: 46468.9. Samples: 394949600. Policy #0 lag: (min: 0.0, avg: 43.5, max: 106.0) [2024-03-21 01:05:55,522][03784] Avg episode reward: [(0, '0.739')] [2024-03-21 01:05:59,556][04017] Updated weights for policy 0, policy_version 12024 (0.0017) [2024-03-21 01:06:00,521][03784] Fps is (10 sec: 58983.4, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 394035200. Throughput: 0: 45839.9. Samples: 395064700. Policy #0 lag: (min: 0.0, avg: 43.5, max: 106.0) [2024-03-21 01:06:00,522][03784] Avg episode reward: [(0, '1.031')] [2024-03-21 01:06:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012025_394035200.pth... [2024-03-21 01:06:00,648][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011681_382763008.pth [2024-03-21 01:06:05,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 394264576. Throughput: 0: 46446.7. Samples: 395374800. Policy #0 lag: (min: 0.0, avg: 36.6, max: 77.0) [2024-03-21 01:06:05,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 01:06:06,104][04017] Updated weights for policy 0, policy_version 12034 (0.0009) [2024-03-21 01:06:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 394526720. Throughput: 0: 46893.2. Samples: 395669500. Policy #0 lag: (min: 0.0, avg: 36.6, max: 77.0) [2024-03-21 01:06:10,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 01:06:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 394625024. Throughput: 0: 46793.3. Samples: 395812300. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 77.0) [2024-03-21 01:06:15,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 01:06:15,647][04017] Updated weights for policy 0, policy_version 12044 (0.0012) [2024-03-21 01:06:20,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 394854400. Throughput: 0: 46933.2. Samples: 396097600. Policy #0 lag: (min: 0.0, avg: 42.1, max: 86.0) [2024-03-21 01:06:20,522][03784] Avg episode reward: [(0, '0.912')] [2024-03-21 01:06:25,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 394887168. Throughput: 0: 47571.2. Samples: 396404900. Policy #0 lag: (min: 0.0, avg: 42.1, max: 86.0) [2024-03-21 01:06:25,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 01:06:27,085][04017] Updated weights for policy 0, policy_version 12054 (0.0020) [2024-03-21 01:06:27,499][03995] Signal inference workers to stop experience collection... (7950 times) [2024-03-21 01:06:27,560][04017] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-03-21 01:06:27,750][03995] Signal inference workers to resume experience collection... (7950 times) [2024-03-21 01:06:27,750][04017] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-03-21 01:06:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 48059.6, 300 sec: 46319.5). Total num frames: 395280384. Throughput: 0: 47331.1. Samples: 396533200. Policy #0 lag: (min: 0.0, avg: 35.3, max: 83.0) [2024-03-21 01:06:30,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 01:06:30,714][04017] Updated weights for policy 0, policy_version 12064 (0.0021) [2024-03-21 01:06:35,521][03784] Fps is (10 sec: 65535.5, 60 sec: 48605.8, 300 sec: 46208.4). Total num frames: 395542528. Throughput: 0: 47089.0. Samples: 396812800. 
Policy #0 lag: (min: 0.0, avg: 35.3, max: 83.0) [2024-03-21 01:06:35,522][03784] Avg episode reward: [(0, '1.056')] [2024-03-21 01:06:36,753][04017] Updated weights for policy 0, policy_version 12074 (0.0011) [2024-03-21 01:06:40,521][03784] Fps is (10 sec: 55706.1, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 395837440. Throughput: 0: 47917.8. Samples: 397105900. Policy #0 lag: (min: 0.0, avg: 32.6, max: 70.0) [2024-03-21 01:06:40,522][03784] Avg episode reward: [(0, '0.875')] [2024-03-21 01:06:41,594][04017] Updated weights for policy 0, policy_version 12084 (0.0014) [2024-03-21 01:06:45,521][03784] Fps is (10 sec: 58982.9, 60 sec: 50244.4, 300 sec: 46763.8). Total num frames: 396132352. Throughput: 0: 48288.9. Samples: 397237700. Policy #0 lag: (min: 0.0, avg: 32.6, max: 70.0) [2024-03-21 01:06:45,522][03784] Avg episode reward: [(0, '0.875')] [2024-03-21 01:06:50,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 396263424. Throughput: 0: 48062.2. Samples: 397537600. Policy #0 lag: (min: 0.0, avg: 35.3, max: 99.0) [2024-03-21 01:06:50,522][03784] Avg episode reward: [(0, '0.656')] [2024-03-21 01:06:51,118][04017] Updated weights for policy 0, policy_version 12094 (0.0012) [2024-03-21 01:06:55,521][03784] Fps is (10 sec: 39321.2, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 396525568. Throughput: 0: 47700.0. Samples: 397816000. Policy #0 lag: (min: 0.0, avg: 35.3, max: 99.0) [2024-03-21 01:06:55,522][03784] Avg episode reward: [(0, '0.853')] [2024-03-21 01:06:57,018][04017] Updated weights for policy 0, policy_version 12104 (0.0024) [2024-03-21 01:07:00,521][03784] Fps is (10 sec: 52429.5, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 396787712. Throughput: 0: 47493.4. Samples: 397949500. Policy #0 lag: (min: 0.0, avg: 45.2, max: 95.0) [2024-03-21 01:07:00,522][03784] Avg episode reward: [(0, '1.357')] [2024-03-21 01:07:05,521][03784] Fps is (10 sec: 32768.3, 60 sec: 43144.6, 300 sec: 45875.2). 
Total num frames: 396853248. Throughput: 0: 47637.9. Samples: 398241300. Policy #0 lag: (min: 0.0, avg: 45.2, max: 95.0) [2024-03-21 01:07:05,522][03784] Avg episode reward: [(0, '0.455')] [2024-03-21 01:07:06,276][04017] Updated weights for policy 0, policy_version 12114 (0.0015) [2024-03-21 01:07:10,521][03784] Fps is (10 sec: 45874.7, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 397246464. Throughput: 0: 46582.2. Samples: 398501100. Policy #0 lag: (min: 0.0, avg: 42.8, max: 87.0) [2024-03-21 01:07:10,522][03784] Avg episode reward: [(0, '0.502')] [2024-03-21 01:07:10,655][04017] Updated weights for policy 0, policy_version 12124 (0.0012) [2024-03-21 01:07:15,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 397443072. Throughput: 0: 46631.2. Samples: 398631600. Policy #0 lag: (min: 0.0, avg: 42.8, max: 87.0) [2024-03-21 01:07:15,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 01:07:16,583][03995] Signal inference workers to stop experience collection... (8000 times) [2024-03-21 01:07:16,649][03995] Signal inference workers to resume experience collection... (8000 times) [2024-03-21 01:07:16,672][04017] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-03-21 01:07:16,720][04017] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-03-21 01:07:17,640][04017] Updated weights for policy 0, policy_version 12134 (0.0011) [2024-03-21 01:07:20,521][03784] Fps is (10 sec: 55705.0, 60 sec: 49151.9, 300 sec: 46541.6). Total num frames: 397803520. Throughput: 0: 47017.7. Samples: 398928600. Policy #0 lag: (min: 2.0, avg: 35.7, max: 76.0) [2024-03-21 01:07:20,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 01:07:24,108][04017] Updated weights for policy 0, policy_version 12144 (0.0012) [2024-03-21 01:07:25,521][03784] Fps is (10 sec: 58982.3, 60 sec: 52428.7, 300 sec: 46763.8). Total num frames: 398032896. Throughput: 0: 46480.0. Samples: 399197500. 
Policy #0 lag: (min: 2.0, avg: 35.7, max: 76.0) [2024-03-21 01:07:25,522][03784] Avg episode reward: [(0, '1.521')] [2024-03-21 01:07:30,521][03784] Fps is (10 sec: 42599.1, 60 sec: 49152.1, 300 sec: 46874.9). Total num frames: 398229504. Throughput: 0: 46686.7. Samples: 399338600. Policy #0 lag: (min: 1.0, avg: 35.9, max: 67.0) [2024-03-21 01:07:30,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 01:07:30,621][04017] Updated weights for policy 0, policy_version 12154 (0.0010) [2024-03-21 01:07:35,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 398458880. Throughput: 0: 46706.7. Samples: 399639400. Policy #0 lag: (min: 1.0, avg: 35.9, max: 67.0) [2024-03-21 01:07:35,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 01:07:38,992][04017] Updated weights for policy 0, policy_version 12164 (0.0017) [2024-03-21 01:07:40,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 398622720. Throughput: 0: 46651.2. Samples: 399915300. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 01:07:40,522][03784] Avg episode reward: [(0, '0.461')] [2024-03-21 01:07:45,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 398753792. Throughput: 0: 46862.1. Samples: 400058300. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 01:07:45,522][03784] Avg episode reward: [(0, '0.461')] [2024-03-21 01:07:48,243][04017] Updated weights for policy 0, policy_version 12174 (0.0017) [2024-03-21 01:07:50,521][03784] Fps is (10 sec: 36044.3, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 398983168. Throughput: 0: 45955.4. Samples: 400309300. Policy #0 lag: (min: 0.0, avg: 31.3, max: 86.0) [2024-03-21 01:07:50,523][03784] Avg episode reward: [(0, '0.544')] [2024-03-21 01:07:55,521][03784] Fps is (10 sec: 32768.1, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 399081472. Throughput: 0: 46273.4. Samples: 400583400. 
Policy #0 lag: (min: 0.0, avg: 31.3, max: 86.0) [2024-03-21 01:07:55,522][03784] Avg episode reward: [(0, '0.749')] [2024-03-21 01:07:58,295][04017] Updated weights for policy 0, policy_version 12184 (0.0019) [2024-03-21 01:08:00,521][03784] Fps is (10 sec: 42599.1, 60 sec: 43690.6, 300 sec: 46097.4). Total num frames: 399409152. Throughput: 0: 46224.5. Samples: 400711700. Policy #0 lag: (min: 1.0, avg: 38.1, max: 85.0) [2024-03-21 01:08:00,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 01:08:00,909][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012191_399474688.pth... [2024-03-21 01:08:01,031][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000011852_388366336.pth [2024-03-21 01:08:02,048][04017] Updated weights for policy 0, policy_version 12194 (0.0016) [2024-03-21 01:08:05,521][03784] Fps is (10 sec: 75366.3, 60 sec: 49698.1, 300 sec: 46763.8). Total num frames: 399835136. Throughput: 0: 45413.5. Samples: 400972200. Policy #0 lag: (min: 1.0, avg: 38.1, max: 85.0) [2024-03-21 01:08:05,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 01:08:05,950][04017] Updated weights for policy 0, policy_version 12204 (0.0011) [2024-03-21 01:08:07,256][03995] Signal inference workers to stop experience collection... (8050 times) [2024-03-21 01:08:07,331][03995] Signal inference workers to resume experience collection... (8050 times) [2024-03-21 01:08:07,343][04017] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-03-21 01:08:07,403][04017] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-03-21 01:08:09,540][04017] Updated weights for policy 0, policy_version 12214 (0.0012) [2024-03-21 01:08:10,521][03784] Fps is (10 sec: 91750.4, 60 sec: 51336.6, 300 sec: 47541.4). Total num frames: 400326656. Throughput: 0: 44817.8. Samples: 401214300. 
Policy #0 lag: (min: 3.0, avg: 45.6, max: 102.0) [2024-03-21 01:08:10,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 01:08:15,521][03784] Fps is (10 sec: 62258.9, 60 sec: 50244.2, 300 sec: 47319.2). Total num frames: 400457728. Throughput: 0: 44904.3. Samples: 401359300. Policy #0 lag: (min: 3.0, avg: 45.6, max: 102.0) [2024-03-21 01:08:15,522][03784] Avg episode reward: [(0, '1.236')] [2024-03-21 01:08:17,404][04017] Updated weights for policy 0, policy_version 12224 (0.0013) [2024-03-21 01:08:20,521][03784] Fps is (10 sec: 36044.4, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 400687104. Throughput: 0: 43691.0. Samples: 401605500. Policy #0 lag: (min: 3.0, avg: 54.0, max: 99.0) [2024-03-21 01:08:20,522][03784] Avg episode reward: [(0, '0.956')] [2024-03-21 01:08:25,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44782.9, 300 sec: 46319.5). Total num frames: 400719872. Throughput: 0: 43308.8. Samples: 401864200. Policy #0 lag: (min: 3.0, avg: 54.0, max: 99.0) [2024-03-21 01:08:25,522][03784] Avg episode reward: [(0, '0.916')] [2024-03-21 01:08:30,521][03784] Fps is (10 sec: 13107.4, 60 sec: 43144.5, 300 sec: 45653.1). Total num frames: 400818176. Throughput: 0: 43373.4. Samples: 402010100. Policy #0 lag: (min: 0.0, avg: 34.5, max: 72.0) [2024-03-21 01:08:30,522][03784] Avg episode reward: [(0, '0.890')] [2024-03-21 01:08:31,608][04017] Updated weights for policy 0, policy_version 12234 (0.0012) [2024-03-21 01:08:35,521][03784] Fps is (10 sec: 32768.3, 60 sec: 43144.6, 300 sec: 45542.0). Total num frames: 401047552. Throughput: 0: 44600.2. Samples: 402316300. Policy #0 lag: (min: 0.0, avg: 34.5, max: 72.0) [2024-03-21 01:08:35,522][03784] Avg episode reward: [(0, '1.042')] [2024-03-21 01:08:40,238][04017] Updated weights for policy 0, policy_version 12244 (0.0011) [2024-03-21 01:08:40,521][03784] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 45430.9). Total num frames: 401211392. Throughput: 0: 45237.8. Samples: 402619100. 
Policy #0 lag: (min: 0.0, avg: 30.4, max: 79.0) [2024-03-21 01:08:40,522][03784] Avg episode reward: [(0, '0.683')] [2024-03-21 01:08:45,062][04017] Updated weights for policy 0, policy_version 12254 (0.0011) [2024-03-21 01:08:45,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 401539072. Throughput: 0: 45284.4. Samples: 402749500. Policy #0 lag: (min: 0.0, avg: 30.4, max: 79.0) [2024-03-21 01:08:45,522][03784] Avg episode reward: [(0, '0.590')] [2024-03-21 01:08:50,521][03784] Fps is (10 sec: 58982.4, 60 sec: 46967.6, 300 sec: 46541.7). Total num frames: 401801216. Throughput: 0: 45420.0. Samples: 403016100. Policy #0 lag: (min: 0.0, avg: 30.1, max: 79.0) [2024-03-21 01:08:50,522][03784] Avg episode reward: [(0, '0.672')] [2024-03-21 01:08:51,751][04017] Updated weights for policy 0, policy_version 12264 (0.0012) [2024-03-21 01:08:55,521][03784] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 46541.7). Total num frames: 402063360. Throughput: 0: 45922.2. Samples: 403280800. Policy #0 lag: (min: 0.0, avg: 30.1, max: 79.0) [2024-03-21 01:08:55,522][03784] Avg episode reward: [(0, '0.754')] [2024-03-21 01:08:58,596][04017] Updated weights for policy 0, policy_version 12274 (0.0011) [2024-03-21 01:09:00,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 402325504. Throughput: 0: 45686.8. Samples: 403415200. Policy #0 lag: (min: 0.0, avg: 40.4, max: 94.0) [2024-03-21 01:09:00,522][03784] Avg episode reward: [(0, '0.364')] [2024-03-21 01:09:05,345][04017] Updated weights for policy 0, policy_version 12284 (0.0012) [2024-03-21 01:09:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 46208.4). Total num frames: 402522112. Throughput: 0: 46129.0. Samples: 403681300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 94.0) [2024-03-21 01:09:05,522][03784] Avg episode reward: [(0, '1.172')] [2024-03-21 01:09:10,521][03784] Fps is (10 sec: 36044.9, 60 sec: 39321.6, 300 sec: 45986.3). 
Total num frames: 402685952. Throughput: 0: 46402.4. Samples: 403952300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 94.0) [2024-03-21 01:09:10,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 01:09:14,464][03995] Signal inference workers to stop experience collection... (8100 times) [2024-03-21 01:09:14,533][03995] Signal inference workers to resume experience collection... (8100 times) [2024-03-21 01:09:14,550][04017] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-03-21 01:09:14,585][04017] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-03-21 01:09:14,853][04017] Updated weights for policy 0, policy_version 12294 (0.0011) [2024-03-21 01:09:15,521][03784] Fps is (10 sec: 39322.3, 60 sec: 40960.1, 300 sec: 45986.3). Total num frames: 402915328. Throughput: 0: 46362.3. Samples: 404096400. Policy #0 lag: (min: 0.0, avg: 41.4, max: 79.0) [2024-03-21 01:09:15,521][03784] Avg episode reward: [(0, '0.527')] [2024-03-21 01:09:20,521][03784] Fps is (10 sec: 45874.9, 60 sec: 40960.0, 300 sec: 46430.6). Total num frames: 403144704. Throughput: 0: 45037.7. Samples: 404343000. Policy #0 lag: (min: 0.0, avg: 41.4, max: 79.0) [2024-03-21 01:09:20,522][03784] Avg episode reward: [(0, '0.554')] [2024-03-21 01:09:20,563][04017] Updated weights for policy 0, policy_version 12304 (0.0014) [2024-03-21 01:09:25,521][03784] Fps is (10 sec: 39320.8, 60 sec: 43144.6, 300 sec: 45875.2). Total num frames: 403308544. Throughput: 0: 44902.1. Samples: 404639700. Policy #0 lag: (min: 0.0, avg: 37.4, max: 83.0) [2024-03-21 01:09:25,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-21 01:09:30,521][03784] Fps is (10 sec: 32767.9, 60 sec: 44236.7, 300 sec: 45542.0). Total num frames: 403472384. Throughput: 0: 45351.1. Samples: 404790300. 
Policy #0 lag: (min: 0.0, avg: 31.6, max: 69.0) [2024-03-21 01:09:30,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-21 01:09:30,586][04017] Updated weights for policy 0, policy_version 12314 (0.0012) [2024-03-21 01:09:34,621][04017] Updated weights for policy 0, policy_version 12324 (0.0016) [2024-03-21 01:09:35,521][03784] Fps is (10 sec: 58982.7, 60 sec: 47513.6, 300 sec: 46208.4). Total num frames: 403898368. Throughput: 0: 45380.0. Samples: 405058200. Policy #0 lag: (min: 0.0, avg: 31.6, max: 69.0) [2024-03-21 01:09:35,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-21 01:09:39,499][04017] Updated weights for policy 0, policy_version 12334 (0.0012) [2024-03-21 01:09:40,521][03784] Fps is (10 sec: 72089.5, 60 sec: 49698.1, 300 sec: 46652.7). Total num frames: 404193280. Throughput: 0: 45157.7. Samples: 405312900. Policy #0 lag: (min: 1.0, avg: 44.6, max: 88.0) [2024-03-21 01:09:40,522][03784] Avg episode reward: [(0, '0.719')] [2024-03-21 01:09:45,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 404357120. Throughput: 0: 45568.9. Samples: 405465800. Policy #0 lag: (min: 1.0, avg: 44.6, max: 88.0) [2024-03-21 01:09:45,522][03784] Avg episode reward: [(0, '1.205')] [2024-03-21 01:09:49,459][04017] Updated weights for policy 0, policy_version 12344 (0.0015) [2024-03-21 01:09:50,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 404553728. Throughput: 0: 46335.6. Samples: 405766400. Policy #0 lag: (min: 1.0, avg: 34.8, max: 79.0) [2024-03-21 01:09:50,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 01:09:54,595][04017] Updated weights for policy 0, policy_version 12354 (0.0020) [2024-03-21 01:09:55,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 404815872. Throughput: 0: 46822.2. Samples: 406059300. 
Policy #0 lag: (min: 1.0, avg: 34.8, max: 79.0) [2024-03-21 01:09:55,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 01:10:00,521][03784] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 405078016. Throughput: 0: 46915.4. Samples: 406207600. Policy #0 lag: (min: 1.0, avg: 38.7, max: 77.0) [2024-03-21 01:10:00,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 01:10:00,544][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012362_405078016.pth... [2024-03-21 01:10:00,683][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012025_394035200.pth [2024-03-21 01:10:01,600][04017] Updated weights for policy 0, policy_version 12364 (0.0015) [2024-03-21 01:10:05,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 405176320. Throughput: 0: 47544.5. Samples: 406482500. Policy #0 lag: (min: 1.0, avg: 38.7, max: 77.0) [2024-03-21 01:10:05,522][03784] Avg episode reward: [(0, '0.909')] [2024-03-21 01:10:09,187][03995] Signal inference workers to stop experience collection... (8150 times) [2024-03-21 01:10:09,232][04017] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-03-21 01:10:09,487][03995] Signal inference workers to resume experience collection... (8150 times) [2024-03-21 01:10:09,487][04017] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-03-21 01:10:10,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 405340160. Throughput: 0: 47484.5. Samples: 406776500. Policy #0 lag: (min: 0.0, avg: 35.5, max: 78.0) [2024-03-21 01:10:10,522][03784] Avg episode reward: [(0, '0.784')] [2024-03-21 01:10:12,916][04017] Updated weights for policy 0, policy_version 12374 (0.0011) [2024-03-21 01:10:15,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.6, 300 sec: 45764.1). Total num frames: 405569536. Throughput: 0: 47388.9. Samples: 406922800. 
Policy #0 lag: (min: 0.0, avg: 35.5, max: 78.0) [2024-03-21 01:10:15,522][03784] Avg episode reward: [(0, '0.784')] [2024-03-21 01:10:20,272][04017] Updated weights for policy 0, policy_version 12384 (0.0016) [2024-03-21 01:10:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 405798912. Throughput: 0: 47682.2. Samples: 407203900. Policy #0 lag: (min: 0.0, avg: 41.6, max: 91.0) [2024-03-21 01:10:20,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 01:10:24,431][04017] Updated weights for policy 0, policy_version 12394 (0.0015) [2024-03-21 01:10:25,521][03784] Fps is (10 sec: 58982.8, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 406159360. Throughput: 0: 48153.4. Samples: 407479800. Policy #0 lag: (min: 0.0, avg: 41.6, max: 91.0) [2024-03-21 01:10:25,522][03784] Avg episode reward: [(0, '1.107')] [2024-03-21 01:10:30,521][03784] Fps is (10 sec: 52428.5, 60 sec: 47513.6, 300 sec: 46430.6). Total num frames: 406323200. Throughput: 0: 48084.4. Samples: 407629600. Policy #0 lag: (min: 2.0, avg: 36.3, max: 81.0) [2024-03-21 01:10:30,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 01:10:32,330][04017] Updated weights for policy 0, policy_version 12404 (0.0011) [2024-03-21 01:10:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.3, 300 sec: 46652.8). Total num frames: 406683648. Throughput: 0: 47602.2. Samples: 407908500. Policy #0 lag: (min: 2.0, avg: 36.3, max: 81.0) [2024-03-21 01:10:35,522][03784] Avg episode reward: [(0, '0.586')] [2024-03-21 01:10:36,423][04017] Updated weights for policy 0, policy_version 12414 (0.0019) [2024-03-21 01:10:40,521][03784] Fps is (10 sec: 75367.5, 60 sec: 48059.9, 300 sec: 47319.2). Total num frames: 407076864. Throughput: 0: 46811.2. Samples: 408165800. 
Policy #0 lag: (min: 0.0, avg: 52.1, max: 115.0) [2024-03-21 01:10:40,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 01:10:42,039][04017] Updated weights for policy 0, policy_version 12424 (0.0015) [2024-03-21 01:10:45,521][03784] Fps is (10 sec: 65536.1, 60 sec: 49698.1, 300 sec: 47097.1). Total num frames: 407339008. Throughput: 0: 46875.6. Samples: 408317000. Policy #0 lag: (min: 0.0, avg: 52.1, max: 115.0) [2024-03-21 01:10:45,522][03784] Avg episode reward: [(0, '0.677')] [2024-03-21 01:10:48,032][04017] Updated weights for policy 0, policy_version 12434 (0.0013) [2024-03-21 01:10:50,521][03784] Fps is (10 sec: 39321.0, 60 sec: 48605.8, 300 sec: 46652.7). Total num frames: 407470080. Throughput: 0: 46717.7. Samples: 408584800. Policy #0 lag: (min: 0.0, avg: 52.1, max: 115.0) [2024-03-21 01:10:50,522][03784] Avg episode reward: [(0, '0.806')] [2024-03-21 01:10:55,521][03784] Fps is (10 sec: 26214.3, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 407601152. Throughput: 0: 46733.3. Samples: 408879500. Policy #0 lag: (min: 0.0, avg: 44.8, max: 95.0) [2024-03-21 01:10:55,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 01:11:00,521][03784] Fps is (10 sec: 26214.7, 60 sec: 44236.9, 300 sec: 45653.1). Total num frames: 407732224. Throughput: 0: 46853.5. Samples: 409031200. Policy #0 lag: (min: 0.0, avg: 44.8, max: 95.0) [2024-03-21 01:11:00,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 01:11:00,591][03995] Signal inference workers to stop experience collection... (8200 times) [2024-03-21 01:11:00,591][03995] Signal inference workers to resume experience collection... 
(8200 times) [2024-03-21 01:11:00,595][04017] Updated weights for policy 0, policy_version 12444 (0.0011) [2024-03-21 01:11:00,628][04017] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-03-21 01:11:00,637][04017] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-03-21 01:11:05,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.5, 300 sec: 45653.1). Total num frames: 407994368. Throughput: 0: 47284.4. Samples: 409331700. Policy #0 lag: (min: 0.0, avg: 35.4, max: 76.0) [2024-03-21 01:11:05,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 01:11:08,377][04017] Updated weights for policy 0, policy_version 12454 (0.0015) [2024-03-21 01:11:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 408158208. Throughput: 0: 47193.4. Samples: 409603500. Policy #0 lag: (min: 0.0, avg: 35.4, max: 76.0) [2024-03-21 01:11:10,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 01:11:15,364][04017] Updated weights for policy 0, policy_version 12464 (0.0018) [2024-03-21 01:11:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 408420352. Throughput: 0: 46964.5. Samples: 409743000. Policy #0 lag: (min: 0.0, avg: 41.2, max: 91.0) [2024-03-21 01:11:15,522][03784] Avg episode reward: [(0, '0.525')] [2024-03-21 01:11:20,521][03784] Fps is (10 sec: 58982.4, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 408748032. Throughput: 0: 47009.0. Samples: 410023900. Policy #0 lag: (min: 2.0, avg: 37.0, max: 77.0) [2024-03-21 01:11:20,522][04017] Updated weights for policy 0, policy_version 12474 (0.0012) [2024-03-21 01:11:20,522][03784] Avg episode reward: [(0, '1.129')] [2024-03-21 01:11:25,521][03784] Fps is (10 sec: 58982.5, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 409010176. Throughput: 0: 47695.5. Samples: 410312100. 
Policy #0 lag: (min: 2.0, avg: 37.0, max: 77.0) [2024-03-21 01:11:25,522][03784] Avg episode reward: [(0, '1.129')] [2024-03-21 01:11:26,214][04017] Updated weights for policy 0, policy_version 12484 (0.0013) [2024-03-21 01:11:30,521][03784] Fps is (10 sec: 49151.4, 60 sec: 48605.8, 300 sec: 46430.6). Total num frames: 409239552. Throughput: 0: 47553.3. Samples: 410456900. Policy #0 lag: (min: 2.0, avg: 37.0, max: 77.0) [2024-03-21 01:11:30,522][03784] Avg episode reward: [(0, '1.129')] [2024-03-21 01:11:34,420][04017] Updated weights for policy 0, policy_version 12495 (0.0012) [2024-03-21 01:11:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 409468928. Throughput: 0: 47080.0. Samples: 410703400. Policy #0 lag: (min: 0.0, avg: 43.4, max: 89.0) [2024-03-21 01:11:35,522][03784] Avg episode reward: [(0, '0.410')] [2024-03-21 01:11:40,035][04017] Updated weights for policy 0, policy_version 12505 (0.0020) [2024-03-21 01:11:40,521][03784] Fps is (10 sec: 55705.5, 60 sec: 45328.9, 300 sec: 46319.5). Total num frames: 409796608. Throughput: 0: 46157.7. Samples: 410956600. Policy #0 lag: (min: 0.0, avg: 43.4, max: 89.0) [2024-03-21 01:11:40,522][03784] Avg episode reward: [(0, '1.002')] [2024-03-21 01:11:45,521][03784] Fps is (10 sec: 58982.4, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 410058752. Throughput: 0: 45917.7. Samples: 411097500. Policy #0 lag: (min: 2.0, avg: 47.2, max: 90.0) [2024-03-21 01:11:45,522][03784] Avg episode reward: [(0, '0.856')] [2024-03-21 01:11:45,593][04017] Updated weights for policy 0, policy_version 12515 (0.0011) [2024-03-21 01:11:50,521][03784] Fps is (10 sec: 39322.1, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 410189824. Throughput: 0: 45606.7. Samples: 411384000. Policy #0 lag: (min: 2.0, avg: 47.2, max: 90.0) [2024-03-21 01:11:50,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 01:11:53,983][03995] Signal inference workers to stop experience collection... 
(8250 times) [2024-03-21 01:11:54,037][04017] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-03-21 01:11:54,059][03995] Signal inference workers to resume experience collection... (8250 times) [2024-03-21 01:11:54,087][04017] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-03-21 01:11:54,650][04017] Updated weights for policy 0, policy_version 12525 (0.0010) [2024-03-21 01:11:55,521][03784] Fps is (10 sec: 36044.4, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 410419200. Throughput: 0: 45984.2. Samples: 411672800. Policy #0 lag: (min: 0.0, avg: 38.5, max: 75.0) [2024-03-21 01:11:55,522][03784] Avg episode reward: [(0, '0.653')] [2024-03-21 01:12:00,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 410615808. Throughput: 0: 46333.4. Samples: 411828000. Policy #0 lag: (min: 0.0, avg: 38.5, max: 75.0) [2024-03-21 01:12:00,522][03784] Avg episode reward: [(0, '0.527')] [2024-03-21 01:12:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012531_410615808.pth... [2024-03-21 01:12:00,691][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012191_399474688.pth [2024-03-21 01:12:05,521][03784] Fps is (10 sec: 29491.7, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 410714112. Throughput: 0: 46555.5. Samples: 412118900. Policy #0 lag: (min: 0.0, avg: 30.9, max: 79.0) [2024-03-21 01:12:05,522][03784] Avg episode reward: [(0, '0.942')] [2024-03-21 01:12:06,144][04017] Updated weights for policy 0, policy_version 12535 (0.0011) [2024-03-21 01:12:10,215][04017] Updated weights for policy 0, policy_version 12545 (0.0010) [2024-03-21 01:12:10,521][03784] Fps is (10 sec: 45874.7, 60 sec: 48605.7, 300 sec: 46208.4). Total num frames: 411074560. Throughput: 0: 45617.7. Samples: 412364900. 
Policy #0 lag: (min: 0.0, avg: 30.9, max: 79.0) [2024-03-21 01:12:10,522][03784] Avg episode reward: [(0, '1.046')] [2024-03-21 01:12:15,521][03784] Fps is (10 sec: 65536.2, 60 sec: 49152.1, 300 sec: 45986.3). Total num frames: 411369472. Throughput: 0: 45602.4. Samples: 412509000. Policy #0 lag: (min: 0.0, avg: 36.8, max: 72.0) [2024-03-21 01:12:15,521][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 01:12:15,551][04017] Updated weights for policy 0, policy_version 12555 (0.0012) [2024-03-21 01:12:20,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 411566080. Throughput: 0: 46011.1. Samples: 412773900. Policy #0 lag: (min: 0.0, avg: 36.8, max: 72.0) [2024-03-21 01:12:20,522][03784] Avg episode reward: [(0, '0.989')] [2024-03-21 01:12:25,363][04017] Updated weights for policy 0, policy_version 12565 (0.0011) [2024-03-21 01:12:25,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 411729920. Throughput: 0: 46620.1. Samples: 413054500. Policy #0 lag: (min: 1.0, avg: 38.3, max: 69.0) [2024-03-21 01:12:25,522][03784] Avg episode reward: [(0, '0.642')] [2024-03-21 01:12:30,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 411860992. Throughput: 0: 46688.9. Samples: 413198500. Policy #0 lag: (min: 1.0, avg: 38.3, max: 69.0) [2024-03-21 01:12:30,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 01:12:33,111][04017] Updated weights for policy 0, policy_version 12575 (0.0016) [2024-03-21 01:12:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 412254208. Throughput: 0: 45919.9. Samples: 413450400. Policy #0 lag: (min: 1.0, avg: 40.6, max: 82.0) [2024-03-21 01:12:35,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 01:12:38,517][04017] Updated weights for policy 0, policy_version 12585 (0.0023) [2024-03-21 01:12:40,521][03784] Fps is (10 sec: 68812.3, 60 sec: 45875.2, 300 sec: 46763.8). 
Total num frames: 412549120. Throughput: 0: 45715.6. Samples: 413730000. Policy #0 lag: (min: 1.0, avg: 40.6, max: 82.0) [2024-03-21 01:12:40,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 01:12:42,514][04017] Updated weights for policy 0, policy_version 12595 (0.0012) [2024-03-21 01:12:45,521][03784] Fps is (10 sec: 55706.1, 60 sec: 45875.3, 300 sec: 46874.9). Total num frames: 412811264. Throughput: 0: 44875.6. Samples: 413847400. Policy #0 lag: (min: 0.0, avg: 36.7, max: 96.0) [2024-03-21 01:12:45,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 01:12:48,068][03995] Signal inference workers to stop experience collection... (8300 times) [2024-03-21 01:12:48,121][04017] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-03-21 01:12:48,138][03995] Signal inference workers to resume experience collection... (8300 times) [2024-03-21 01:12:48,161][04017] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-03-21 01:12:50,509][04017] Updated weights for policy 0, policy_version 12605 (0.0011) [2024-03-21 01:12:50,521][03784] Fps is (10 sec: 49151.9, 60 sec: 47513.5, 300 sec: 47319.2). Total num frames: 413040640. Throughput: 0: 44737.7. Samples: 414132100. Policy #0 lag: (min: 0.0, avg: 36.7, max: 96.0) [2024-03-21 01:12:50,522][03784] Avg episode reward: [(0, '1.179')] [2024-03-21 01:12:55,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.6, 300 sec: 46874.9). Total num frames: 413237248. Throughput: 0: 45506.8. Samples: 414412700. Policy #0 lag: (min: 1.0, avg: 41.2, max: 119.0) [2024-03-21 01:12:55,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 01:12:56,922][04017] Updated weights for policy 0, policy_version 12615 (0.0011) [2024-03-21 01:13:00,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 46430.6). Total num frames: 413532160. Throughput: 0: 45284.3. Samples: 414546800. 
Policy #0 lag: (min: 1.0, avg: 41.2, max: 119.0) [2024-03-21 01:13:00,522][03784] Avg episode reward: [(0, '0.593')] [2024-03-21 01:13:03,547][04017] Updated weights for policy 0, policy_version 12625 (0.0011) [2024-03-21 01:13:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 49698.1, 300 sec: 45319.8). Total num frames: 413696000. Throughput: 0: 45737.8. Samples: 414832100. Policy #0 lag: (min: 1.0, avg: 41.2, max: 119.0) [2024-03-21 01:13:05,522][03784] Avg episode reward: [(0, '0.516')] [2024-03-21 01:13:10,521][03784] Fps is (10 sec: 19660.9, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 413728768. Throughput: 0: 46337.7. Samples: 415139700. Policy #0 lag: (min: 0.0, avg: 33.5, max: 75.0) [2024-03-21 01:13:10,522][03784] Avg episode reward: [(0, '0.432')] [2024-03-21 01:13:15,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 413990912. Throughput: 0: 46206.7. Samples: 415277800. Policy #0 lag: (min: 0.0, avg: 33.5, max: 75.0) [2024-03-21 01:13:15,522][03784] Avg episode reward: [(0, '0.408')] [2024-03-21 01:13:16,085][04017] Updated weights for policy 0, policy_version 12635 (0.0016) [2024-03-21 01:13:20,521][03784] Fps is (10 sec: 52428.9, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 414253056. Throughput: 0: 47255.6. Samples: 415576900. Policy #0 lag: (min: 0.0, avg: 30.0, max: 72.0) [2024-03-21 01:13:20,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-21 01:13:21,363][04017] Updated weights for policy 0, policy_version 12645 (0.0017) [2024-03-21 01:13:25,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 414547968. Throughput: 0: 47091.2. Samples: 415849100. Policy #0 lag: (min: 0.0, avg: 30.0, max: 72.0) [2024-03-21 01:13:25,522][03784] Avg episode reward: [(0, '0.537')] [2024-03-21 01:13:29,465][04017] Updated weights for policy 0, policy_version 12655 (0.0019) [2024-03-21 01:13:30,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46319.5). 
Total num frames: 414711808. Throughput: 0: 47655.5. Samples: 415991900. Policy #0 lag: (min: 0.0, avg: 46.1, max: 115.0) [2024-03-21 01:13:30,522][03784] Avg episode reward: [(0, '1.072')] [2024-03-21 01:13:34,409][04017] Updated weights for policy 0, policy_version 12665 (0.0016) [2024-03-21 01:13:35,521][03784] Fps is (10 sec: 55705.5, 60 sec: 47513.6, 300 sec: 47097.0). Total num frames: 415105024. Throughput: 0: 47206.7. Samples: 416256400. Policy #0 lag: (min: 0.0, avg: 46.1, max: 115.0) [2024-03-21 01:13:35,522][03784] Avg episode reward: [(0, '1.328')] [2024-03-21 01:13:40,521][03784] Fps is (10 sec: 55704.9, 60 sec: 45329.0, 300 sec: 46541.7). Total num frames: 415268864. Throughput: 0: 46715.4. Samples: 416514900. Policy #0 lag: (min: 0.0, avg: 43.8, max: 86.0) [2024-03-21 01:13:40,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 01:13:41,006][04017] Updated weights for policy 0, policy_version 12675 (0.0017) [2024-03-21 01:13:45,338][03995] Signal inference workers to stop experience collection... (8350 times) [2024-03-21 01:13:45,446][04017] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-03-21 01:13:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 415465472. Throughput: 0: 46984.5. Samples: 416661100. Policy #0 lag: (min: 0.0, avg: 43.8, max: 86.0) [2024-03-21 01:13:45,522][03784] Avg episode reward: [(0, '1.135')] [2024-03-21 01:13:45,535][03995] Signal inference workers to resume experience collection... (8350 times) [2024-03-21 01:13:45,536][04017] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-03-21 01:13:50,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43144.6, 300 sec: 45986.3). Total num frames: 415629312. Throughput: 0: 46946.6. Samples: 416944700. 
Policy #0 lag: (min: 0.0, avg: 32.4, max: 61.0) [2024-03-21 01:13:50,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 01:13:50,685][04017] Updated weights for policy 0, policy_version 12685 (0.0012) [2024-03-21 01:13:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 46097.3). Total num frames: 415924224. Throughput: 0: 46028.9. Samples: 417211000. Policy #0 lag: (min: 0.0, avg: 32.4, max: 61.0) [2024-03-21 01:13:55,522][03784] Avg episode reward: [(0, '1.193')] [2024-03-21 01:13:56,843][04017] Updated weights for policy 0, policy_version 12695 (0.0011) [2024-03-21 01:14:00,521][03784] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 416153600. Throughput: 0: 46004.4. Samples: 417348000. Policy #0 lag: (min: 2.0, avg: 46.9, max: 113.0) [2024-03-21 01:14:00,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 01:14:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012700_416153600.pth... [2024-03-21 01:14:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012362_405078016.pth [2024-03-21 01:14:04,146][04017] Updated weights for policy 0, policy_version 12705 (0.0016) [2024-03-21 01:14:05,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45329.0, 300 sec: 46541.7). Total num frames: 416415744. Throughput: 0: 45262.2. Samples: 417613700. Policy #0 lag: (min: 2.0, avg: 46.9, max: 113.0) [2024-03-21 01:14:05,522][03784] Avg episode reward: [(0, '1.019')] [2024-03-21 01:14:08,241][04017] Updated weights for policy 0, policy_version 12715 (0.0016) [2024-03-21 01:14:10,521][03784] Fps is (10 sec: 58982.5, 60 sec: 50244.3, 300 sec: 46874.9). Total num frames: 416743424. Throughput: 0: 45180.0. Samples: 417882200. 
Policy #0 lag: (min: 2.0, avg: 51.3, max: 117.0) [2024-03-21 01:14:10,522][03784] Avg episode reward: [(0, '1.019')] [2024-03-21 01:14:14,416][04017] Updated weights for policy 0, policy_version 12725 (0.0012) [2024-03-21 01:14:15,521][03784] Fps is (10 sec: 58982.6, 60 sec: 50244.2, 300 sec: 46986.0). Total num frames: 417005568. Throughput: 0: 45260.0. Samples: 418028600. Policy #0 lag: (min: 2.0, avg: 51.3, max: 117.0) [2024-03-21 01:14:15,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 01:14:20,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 417169408. Throughput: 0: 45593.4. Samples: 418308100. Policy #0 lag: (min: 0.0, avg: 36.1, max: 75.0) [2024-03-21 01:14:20,522][03784] Avg episode reward: [(0, '1.097')] [2024-03-21 01:14:22,592][04017] Updated weights for policy 0, policy_version 12735 (0.0016) [2024-03-21 01:14:25,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 417464320. Throughput: 0: 46104.5. Samples: 418589600. Policy #0 lag: (min: 0.0, avg: 36.1, max: 75.0) [2024-03-21 01:14:25,522][03784] Avg episode reward: [(0, '0.461')] [2024-03-21 01:14:30,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 417529856. Throughput: 0: 46328.9. Samples: 418745900. Policy #0 lag: (min: 0.0, avg: 36.1, max: 75.0) [2024-03-21 01:14:30,522][03784] Avg episode reward: [(0, '0.463')] [2024-03-21 01:14:34,167][04017] Updated weights for policy 0, policy_version 12745 (0.0011) [2024-03-21 01:14:35,521][03784] Fps is (10 sec: 16384.2, 60 sec: 42052.3, 300 sec: 45542.0). Total num frames: 417628160. Throughput: 0: 46540.1. Samples: 419039000. Policy #0 lag: (min: 0.0, avg: 35.8, max: 80.0) [2024-03-21 01:14:35,522][03784] Avg episode reward: [(0, '0.778')] [2024-03-21 01:14:40,521][03784] Fps is (10 sec: 19660.6, 60 sec: 40960.0, 300 sec: 45319.8). Total num frames: 417726464. Throughput: 0: 47251.1. Samples: 419337300. 
Policy #0 lag: (min: 0.0, avg: 35.8, max: 80.0) [2024-03-21 01:14:40,522][03784] Avg episode reward: [(0, '1.052')] [2024-03-21 01:14:43,512][04017] Updated weights for policy 0, policy_version 12755 (0.0018) [2024-03-21 01:14:44,937][03995] Signal inference workers to stop experience collection... (8400 times) [2024-03-21 01:14:45,011][04017] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-03-21 01:14:45,011][03995] Signal inference workers to resume experience collection... (8400 times) [2024-03-21 01:14:45,057][04017] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-03-21 01:14:45,521][03784] Fps is (10 sec: 49151.4, 60 sec: 44236.7, 300 sec: 45986.3). Total num frames: 418119680. Throughput: 0: 47068.8. Samples: 419466100. Policy #0 lag: (min: 1.0, avg: 36.1, max: 73.0) [2024-03-21 01:14:45,522][03784] Avg episode reward: [(0, '0.324')] [2024-03-21 01:14:49,806][04017] Updated weights for policy 0, policy_version 12765 (0.0011) [2024-03-21 01:14:50,521][03784] Fps is (10 sec: 62259.2, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 418349056. Throughput: 0: 47557.7. Samples: 419753800. Policy #0 lag: (min: 1.0, avg: 36.1, max: 73.0) [2024-03-21 01:14:50,522][03784] Avg episode reward: [(0, '1.428')] [2024-03-21 01:14:54,710][04017] Updated weights for policy 0, policy_version 12775 (0.0012) [2024-03-21 01:14:55,521][03784] Fps is (10 sec: 52429.5, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 418643968. Throughput: 0: 47531.2. Samples: 420021100. Policy #0 lag: (min: 0.0, avg: 44.8, max: 115.0) [2024-03-21 01:14:55,522][03784] Avg episode reward: [(0, '1.422')] [2024-03-21 01:15:00,521][03784] Fps is (10 sec: 55706.0, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 418906112. Throughput: 0: 47228.9. Samples: 420153900. 
Policy #0 lag: (min: 0.0, avg: 44.8, max: 115.0) [2024-03-21 01:15:00,522][03784] Avg episode reward: [(0, '0.600')] [2024-03-21 01:15:00,790][04017] Updated weights for policy 0, policy_version 12785 (0.0016) [2024-03-21 01:15:04,282][04017] Updated weights for policy 0, policy_version 12795 (0.0016) [2024-03-21 01:15:05,521][03784] Fps is (10 sec: 72089.6, 60 sec: 49152.1, 300 sec: 47541.4). Total num frames: 419364864. Throughput: 0: 46982.3. Samples: 420422300. Policy #0 lag: (min: 2.0, avg: 35.7, max: 74.0) [2024-03-21 01:15:05,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 01:15:10,521][03784] Fps is (10 sec: 62259.3, 60 sec: 46421.4, 300 sec: 47319.2). Total num frames: 419528704. Throughput: 0: 46795.6. Samples: 420695400. Policy #0 lag: (min: 2.0, avg: 35.7, max: 74.0) [2024-03-21 01:15:10,522][03784] Avg episode reward: [(0, '1.359')] [2024-03-21 01:15:11,847][04017] Updated weights for policy 0, policy_version 12805 (0.0012) [2024-03-21 01:15:15,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 47319.2). Total num frames: 419758080. Throughput: 0: 46508.9. Samples: 420838800. Policy #0 lag: (min: 0.0, avg: 42.5, max: 85.0) [2024-03-21 01:15:15,522][03784] Avg episode reward: [(0, '0.835')] [2024-03-21 01:15:19,883][04017] Updated weights for policy 0, policy_version 12815 (0.0015) [2024-03-21 01:15:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 419921920. Throughput: 0: 46373.3. Samples: 421125800. Policy #0 lag: (min: 0.0, avg: 42.5, max: 85.0) [2024-03-21 01:15:20,522][03784] Avg episode reward: [(0, '0.654')] [2024-03-21 01:15:25,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.9, 300 sec: 46763.8). Total num frames: 420118528. Throughput: 0: 46013.4. Samples: 421407900. 
Policy #0 lag: (min: 0.0, avg: 45.1, max: 103.0) [2024-03-21 01:15:25,522][03784] Avg episode reward: [(0, '1.032')] [2024-03-21 01:15:27,956][04017] Updated weights for policy 0, policy_version 12825 (0.0010) [2024-03-21 01:15:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 420282368. Throughput: 0: 46220.1. Samples: 421546000. Policy #0 lag: (min: 0.0, avg: 45.1, max: 103.0) [2024-03-21 01:15:30,522][03784] Avg episode reward: [(0, '0.946')] [2024-03-21 01:15:35,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45875.2, 300 sec: 45097.6). Total num frames: 420380672. Throughput: 0: 46673.4. Samples: 421854100. Policy #0 lag: (min: 0.0, avg: 45.1, max: 103.0) [2024-03-21 01:15:35,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 01:15:36,288][03995] Signal inference workers to stop experience collection... (8450 times) [2024-03-21 01:15:36,363][04017] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-03-21 01:15:36,550][03995] Signal inference workers to resume experience collection... (8450 times) [2024-03-21 01:15:36,550][04017] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-03-21 01:15:38,171][04017] Updated weights for policy 0, policy_version 12835 (0.0011) [2024-03-21 01:15:40,521][03784] Fps is (10 sec: 42598.4, 60 sec: 49698.2, 300 sec: 45319.8). Total num frames: 420708352. Throughput: 0: 46622.2. Samples: 422119100. Policy #0 lag: (min: 0.0, avg: 36.7, max: 76.0) [2024-03-21 01:15:40,522][03784] Avg episode reward: [(0, '1.143')] [2024-03-21 01:15:45,062][04017] Updated weights for policy 0, policy_version 12845 (0.0011) [2024-03-21 01:15:45,521][03784] Fps is (10 sec: 55705.7, 60 sec: 46967.5, 300 sec: 45653.1). Total num frames: 420937728. Throughput: 0: 46788.9. Samples: 422259400. 
Policy #0 lag: (min: 0.0, avg: 36.7, max: 76.0) [2024-03-21 01:15:45,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 01:15:50,138][04017] Updated weights for policy 0, policy_version 12855 (0.0012) [2024-03-21 01:15:50,521][03784] Fps is (10 sec: 55704.6, 60 sec: 48605.8, 300 sec: 46319.5). Total num frames: 421265408. Throughput: 0: 46704.2. Samples: 422524000. Policy #0 lag: (min: 4.0, avg: 42.3, max: 85.0) [2024-03-21 01:15:50,522][03784] Avg episode reward: [(0, '1.679')] [2024-03-21 01:15:50,844][03995] Saving new best policy, reward=1.679! [2024-03-21 01:15:55,093][04017] Updated weights for policy 0, policy_version 12865 (0.0011) [2024-03-21 01:15:55,521][03784] Fps is (10 sec: 65535.4, 60 sec: 49151.9, 300 sec: 46986.0). Total num frames: 421593088. Throughput: 0: 46926.6. Samples: 422807100. Policy #0 lag: (min: 4.0, avg: 42.3, max: 85.0) [2024-03-21 01:15:55,522][03784] Avg episode reward: [(0, '0.877')] [2024-03-21 01:16:00,521][03784] Fps is (10 sec: 49152.8, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 421756928. Throughput: 0: 47102.2. Samples: 422958400. Policy #0 lag: (min: 5.0, avg: 46.1, max: 96.0) [2024-03-21 01:16:00,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 01:16:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012871_421756928.pth... [2024-03-21 01:16:00,670][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012531_410615808.pth [2024-03-21 01:16:03,593][04017] Updated weights for policy 0, policy_version 12875 (0.0021) [2024-03-21 01:16:05,521][03784] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 46874.9). Total num frames: 421986304. Throughput: 0: 46808.9. Samples: 423232200. Policy #0 lag: (min: 5.0, avg: 46.1, max: 96.0) [2024-03-21 01:16:05,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 01:16:10,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42598.4, 300 sec: 46319.5). Total num frames: 422084608. 
Throughput: 0: 47155.5. Samples: 423529900. Policy #0 lag: (min: 0.0, avg: 42.4, max: 86.0) [2024-03-21 01:16:10,522][03784] Avg episode reward: [(0, '0.557')] [2024-03-21 01:16:12,393][04017] Updated weights for policy 0, policy_version 12885 (0.0010) [2024-03-21 01:16:15,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 422477824. Throughput: 0: 47320.0. Samples: 423675400. Policy #0 lag: (min: 0.0, avg: 42.4, max: 86.0) [2024-03-21 01:16:15,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 01:16:15,964][04017] Updated weights for policy 0, policy_version 12895 (0.0015) [2024-03-21 01:16:19,279][03995] Signal inference workers to stop experience collection... (8500 times) [2024-03-21 01:16:19,344][03995] Signal inference workers to resume experience collection... (8500 times) [2024-03-21 01:16:19,487][04017] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-03-21 01:16:19,488][04017] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-03-21 01:16:20,521][03784] Fps is (10 sec: 58982.2, 60 sec: 45875.1, 300 sec: 46319.5). Total num frames: 422674432. Throughput: 0: 46808.9. Samples: 423960500. Policy #0 lag: (min: 0.0, avg: 45.3, max: 88.0) [2024-03-21 01:16:20,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 01:16:24,123][04017] Updated weights for policy 0, policy_version 12905 (0.0011) [2024-03-21 01:16:25,521][03784] Fps is (10 sec: 42598.7, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 422903808. Throughput: 0: 47093.4. Samples: 424238300. Policy #0 lag: (min: 0.0, avg: 45.3, max: 88.0) [2024-03-21 01:16:25,522][03784] Avg episode reward: [(0, '0.776')] [2024-03-21 01:16:30,264][04017] Updated weights for policy 0, policy_version 12915 (0.0012) [2024-03-21 01:16:30,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 423198720. Throughput: 0: 47302.1. Samples: 424388000. 
Policy #0 lag: (min: 0.0, avg: 34.5, max: 71.0) [2024-03-21 01:16:30,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 01:16:35,521][03784] Fps is (10 sec: 39321.2, 60 sec: 48605.8, 300 sec: 45764.1). Total num frames: 423297024. Throughput: 0: 47777.9. Samples: 424674000. Policy #0 lag: (min: 0.0, avg: 34.5, max: 71.0) [2024-03-21 01:16:35,522][03784] Avg episode reward: [(0, '0.435')] [2024-03-21 01:16:39,356][04017] Updated weights for policy 0, policy_version 12925 (0.0015) [2024-03-21 01:16:40,521][03784] Fps is (10 sec: 32768.4, 60 sec: 46967.5, 300 sec: 45653.1). Total num frames: 423526400. Throughput: 0: 47369.0. Samples: 424938700. Policy #0 lag: (min: 2.0, avg: 58.6, max: 112.0) [2024-03-21 01:16:40,522][03784] Avg episode reward: [(0, '1.021')] [2024-03-21 01:16:45,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48059.6, 300 sec: 46208.4). Total num frames: 423821312. Throughput: 0: 47157.6. Samples: 425080500. Policy #0 lag: (min: 2.0, avg: 58.6, max: 112.0) [2024-03-21 01:16:45,522][03784] Avg episode reward: [(0, '1.021')] [2024-03-21 01:16:45,767][04017] Updated weights for policy 0, policy_version 12935 (0.0022) [2024-03-21 01:16:50,521][03784] Fps is (10 sec: 58982.9, 60 sec: 47513.8, 300 sec: 46430.6). Total num frames: 424116224. Throughput: 0: 47173.4. Samples: 425355000. Policy #0 lag: (min: 0.0, avg: 43.2, max: 86.0) [2024-03-21 01:16:50,521][03784] Avg episode reward: [(0, '0.834')] [2024-03-21 01:16:52,447][04017] Updated weights for policy 0, policy_version 12945 (0.0012) [2024-03-21 01:16:55,521][03784] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 46208.4). Total num frames: 424247296. Throughput: 0: 46497.8. Samples: 425622300. Policy #0 lag: (min: 0.0, avg: 43.2, max: 86.0) [2024-03-21 01:16:55,522][03784] Avg episode reward: [(0, '0.608')] [2024-03-21 01:17:00,521][03784] Fps is (10 sec: 22937.4, 60 sec: 43144.5, 300 sec: 46208.4). Total num frames: 424345600. Throughput: 0: 46484.5. Samples: 425767200. 
Policy #0 lag: (min: 1.0, avg: 24.3, max: 59.0) [2024-03-21 01:17:00,522][03784] Avg episode reward: [(0, '1.311')] [2024-03-21 01:17:04,425][04017] Updated weights for policy 0, policy_version 12955 (0.0011) [2024-03-21 01:17:05,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43144.5, 300 sec: 45764.1). Total num frames: 424574976. Throughput: 0: 46548.9. Samples: 426055200. Policy #0 lag: (min: 1.0, avg: 24.3, max: 59.0) [2024-03-21 01:17:05,522][03784] Avg episode reward: [(0, '0.735')] [2024-03-21 01:17:09,230][04017] Updated weights for policy 0, policy_version 12965 (0.0024) [2024-03-21 01:17:10,521][03784] Fps is (10 sec: 55705.5, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 424902656. Throughput: 0: 46471.1. Samples: 426329500. Policy #0 lag: (min: 0.0, avg: 33.7, max: 67.0) [2024-03-21 01:17:10,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 01:17:12,178][03995] Signal inference workers to stop experience collection... (8550 times) [2024-03-21 01:17:12,295][04017] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-03-21 01:17:12,374][03995] Signal inference workers to resume experience collection... (8550 times) [2024-03-21 01:17:12,374][04017] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-03-21 01:17:13,662][04017] Updated weights for policy 0, policy_version 12975 (0.0014) [2024-03-21 01:17:15,521][03784] Fps is (10 sec: 58982.0, 60 sec: 44782.9, 300 sec: 46097.3). Total num frames: 425164800. Throughput: 0: 46066.7. Samples: 426461000. Policy #0 lag: (min: 0.0, avg: 33.7, max: 67.0) [2024-03-21 01:17:15,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 01:17:20,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 425459712. Throughput: 0: 45788.9. Samples: 426734500. 
Policy #0 lag: (min: 2.0, avg: 38.4, max: 71.0) [2024-03-21 01:17:20,522][03784] Avg episode reward: [(0, '0.758')] [2024-03-21 01:17:24,964][04017] Updated weights for policy 0, policy_version 12985 (0.0015) [2024-03-21 01:17:25,521][03784] Fps is (10 sec: 32768.4, 60 sec: 43144.6, 300 sec: 46208.4). Total num frames: 425492480. Throughput: 0: 46240.0. Samples: 427019500. Policy #0 lag: (min: 2.0, avg: 38.4, max: 71.0) [2024-03-21 01:17:25,522][03784] Avg episode reward: [(0, '1.239')] [2024-03-21 01:17:29,278][04017] Updated weights for policy 0, policy_version 12995 (0.0011) [2024-03-21 01:17:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 46208.4). Total num frames: 425885696. Throughput: 0: 45744.5. Samples: 427139000. Policy #0 lag: (min: 0.0, avg: 55.3, max: 108.0) [2024-03-21 01:17:30,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 01:17:34,401][04017] Updated weights for policy 0, policy_version 13005 (0.0011) [2024-03-21 01:17:35,521][03784] Fps is (10 sec: 72088.9, 60 sec: 48605.9, 300 sec: 46319.5). Total num frames: 426213376. Throughput: 0: 45742.1. Samples: 427413400. Policy #0 lag: (min: 0.0, avg: 55.3, max: 108.0) [2024-03-21 01:17:35,522][03784] Avg episode reward: [(0, '1.084')] [2024-03-21 01:17:39,642][04017] Updated weights for policy 0, policy_version 13015 (0.0012) [2024-03-21 01:17:40,521][03784] Fps is (10 sec: 65536.6, 60 sec: 50244.3, 300 sec: 46541.7). Total num frames: 426541056. Throughput: 0: 45660.1. Samples: 427677000. Policy #0 lag: (min: 1.0, avg: 54.7, max: 108.0) [2024-03-21 01:17:40,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 01:17:45,521][03784] Fps is (10 sec: 42598.8, 60 sec: 46967.6, 300 sec: 46097.4). Total num frames: 426639360. Throughput: 0: 45702.3. Samples: 427823800. 
Policy #0 lag: (min: 1.0, avg: 54.7, max: 108.0) [2024-03-21 01:17:45,522][03784] Avg episode reward: [(0, '0.950')] [2024-03-21 01:17:50,043][04017] Updated weights for policy 0, policy_version 13025 (0.0023) [2024-03-21 01:17:50,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44782.8, 300 sec: 45986.3). Total num frames: 426803200. Throughput: 0: 45757.7. Samples: 428114300. Policy #0 lag: (min: 1.0, avg: 36.0, max: 81.0) [2024-03-21 01:17:50,522][03784] Avg episode reward: [(0, '1.341')] [2024-03-21 01:17:55,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46421.4, 300 sec: 45764.2). Total num frames: 427032576. Throughput: 0: 46237.9. Samples: 428410200. Policy #0 lag: (min: 1.0, avg: 36.0, max: 81.0) [2024-03-21 01:17:55,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 01:17:56,204][04017] Updated weights for policy 0, policy_version 13035 (0.0015) [2024-03-21 01:18:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 45986.3). Total num frames: 427261952. Throughput: 0: 46182.2. Samples: 428539200. Policy #0 lag: (min: 1.0, avg: 36.0, max: 81.0) [2024-03-21 01:18:00,522][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 01:18:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013039_427261952.pth... [2024-03-21 01:18:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012700_416153600.pth [2024-03-21 01:18:05,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46421.4, 300 sec: 46208.4). Total num frames: 427360256. Throughput: 0: 46691.2. Samples: 428835600. Policy #0 lag: (min: 0.0, avg: 38.4, max: 81.0) [2024-03-21 01:18:05,522][03784] Avg episode reward: [(0, '0.493')] [2024-03-21 01:18:07,418][04017] Updated weights for policy 0, policy_version 13045 (0.0010) [2024-03-21 01:18:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 46208.4). Total num frames: 427622400. Throughput: 0: 46693.2. Samples: 429120700. 
Policy #0 lag: (min: 0.0, avg: 38.4, max: 81.0) [2024-03-21 01:18:10,522][03784] Avg episode reward: [(0, '0.493')] [2024-03-21 01:18:13,675][03995] Signal inference workers to stop experience collection... (8600 times) [2024-03-21 01:18:13,679][03995] Signal inference workers to resume experience collection... (8600 times) [2024-03-21 01:18:13,768][04017] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-03-21 01:18:13,769][04017] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-03-21 01:18:15,259][04017] Updated weights for policy 0, policy_version 13055 (0.0010) [2024-03-21 01:18:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 427786240. Throughput: 0: 47315.6. Samples: 429268200. Policy #0 lag: (min: 0.0, avg: 27.5, max: 65.0) [2024-03-21 01:18:15,522][03784] Avg episode reward: [(0, '1.375')] [2024-03-21 01:18:19,765][04017] Updated weights for policy 0, policy_version 13065 (0.0011) [2024-03-21 01:18:20,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 428179456. Throughput: 0: 46682.2. Samples: 429514100. Policy #0 lag: (min: 0.0, avg: 27.5, max: 65.0) [2024-03-21 01:18:20,522][03784] Avg episode reward: [(0, '1.308')] [2024-03-21 01:18:25,521][03784] Fps is (10 sec: 58982.5, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 428376064. Throughput: 0: 46588.9. Samples: 429773500. Policy #0 lag: (min: 0.0, avg: 36.5, max: 86.0) [2024-03-21 01:18:25,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 01:18:29,703][04017] Updated weights for policy 0, policy_version 13075 (0.0021) [2024-03-21 01:18:30,521][03784] Fps is (10 sec: 29490.7, 60 sec: 43144.4, 300 sec: 45319.8). Total num frames: 428474368. Throughput: 0: 46490.8. Samples: 429915900. 
Policy #0 lag: (min: 0.0, avg: 36.5, max: 86.0) [2024-03-21 01:18:30,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 01:18:34,593][04017] Updated weights for policy 0, policy_version 13085 (0.0013) [2024-03-21 01:18:35,521][03784] Fps is (10 sec: 45874.7, 60 sec: 43690.6, 300 sec: 45986.3). Total num frames: 428834816. Throughput: 0: 45577.7. Samples: 430165300. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 01:18:35,522][03784] Avg episode reward: [(0, '0.950')] [2024-03-21 01:18:38,509][04017] Updated weights for policy 0, policy_version 13095 (0.0019) [2024-03-21 01:18:40,521][03784] Fps is (10 sec: 81920.1, 60 sec: 45875.0, 300 sec: 46874.9). Total num frames: 429293568. Throughput: 0: 44310.8. Samples: 430404200. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 01:18:40,522][03784] Avg episode reward: [(0, '1.151')] [2024-03-21 01:18:45,347][04017] Updated weights for policy 0, policy_version 13105 (0.0016) [2024-03-21 01:18:45,521][03784] Fps is (10 sec: 58983.1, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 429424640. Throughput: 0: 44622.3. Samples: 430547200. Policy #0 lag: (min: 0.0, avg: 48.0, max: 87.0) [2024-03-21 01:18:45,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 01:18:50,521][03784] Fps is (10 sec: 26215.0, 60 sec: 45875.3, 300 sec: 46208.4). Total num frames: 429555712. Throughput: 0: 44064.4. Samples: 430818500. Policy #0 lag: (min: 0.0, avg: 48.0, max: 87.0) [2024-03-21 01:18:50,522][03784] Avg episode reward: [(0, '0.466')] [2024-03-21 01:18:54,520][04017] Updated weights for policy 0, policy_version 13115 (0.0010) [2024-03-21 01:18:55,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 429817856. Throughput: 0: 44104.5. Samples: 431105400. Policy #0 lag: (min: 1.0, avg: 65.1, max: 102.0) [2024-03-21 01:18:55,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 01:19:00,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 46097.4). 
Total num frames: 430014464. Throughput: 0: 43993.3. Samples: 431247900. Policy #0 lag: (min: 1.0, avg: 65.1, max: 102.0) [2024-03-21 01:19:00,522][03784] Avg episode reward: [(0, '0.570')] [2024-03-21 01:19:00,850][04017] Updated weights for policy 0, policy_version 13125 (0.0015) [2024-03-21 01:19:04,641][03995] Signal inference workers to stop experience collection... (8650 times) [2024-03-21 01:19:04,647][03995] Signal inference workers to resume experience collection... (8650 times) [2024-03-21 01:19:04,726][04017] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-03-21 01:19:04,726][04017] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-03-21 01:19:05,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 45875.2). Total num frames: 430276608. Throughput: 0: 44937.7. Samples: 431536300. Policy #0 lag: (min: 1.0, avg: 65.1, max: 102.0) [2024-03-21 01:19:05,522][03784] Avg episode reward: [(0, '0.570')] [2024-03-21 01:19:07,785][04017] Updated weights for policy 0, policy_version 13135 (0.0011) [2024-03-21 01:19:10,521][03784] Fps is (10 sec: 52428.5, 60 sec: 48605.8, 300 sec: 45875.2). Total num frames: 430538752. Throughput: 0: 45308.8. Samples: 431812400. Policy #0 lag: (min: 0.0, avg: 31.4, max: 76.0) [2024-03-21 01:19:10,522][03784] Avg episode reward: [(0, '0.875')] [2024-03-21 01:19:14,021][04017] Updated weights for policy 0, policy_version 13145 (0.0019) [2024-03-21 01:19:15,521][03784] Fps is (10 sec: 45875.7, 60 sec: 49152.0, 300 sec: 45986.3). Total num frames: 430735360. Throughput: 0: 45355.8. Samples: 431956900. Policy #0 lag: (min: 0.0, avg: 31.4, max: 76.0) [2024-03-21 01:19:15,522][03784] Avg episode reward: [(0, '0.875')] [2024-03-21 01:19:20,521][03784] Fps is (10 sec: 39322.2, 60 sec: 45875.3, 300 sec: 45653.1). Total num frames: 430931968. Throughput: 0: 46129.0. Samples: 432241100. 
Policy #0 lag: (min: 0.0, avg: 38.4, max: 78.0) [2024-03-21 01:19:20,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 01:19:24,485][04017] Updated weights for policy 0, policy_version 13155 (0.0022) [2024-03-21 01:19:25,521][03784] Fps is (10 sec: 36044.1, 60 sec: 45328.9, 300 sec: 45986.3). Total num frames: 431095808. Throughput: 0: 47251.2. Samples: 432530500. Policy #0 lag: (min: 0.0, avg: 38.4, max: 78.0) [2024-03-21 01:19:25,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 01:19:29,922][04017] Updated weights for policy 0, policy_version 13165 (0.0014) [2024-03-21 01:19:30,521][03784] Fps is (10 sec: 45874.6, 60 sec: 48606.0, 300 sec: 46652.7). Total num frames: 431390720. Throughput: 0: 47155.5. Samples: 432669200. Policy #0 lag: (min: 0.0, avg: 35.4, max: 90.0) [2024-03-21 01:19:30,522][03784] Avg episode reward: [(0, '0.684')] [2024-03-21 01:19:35,521][03784] Fps is (10 sec: 49152.6, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 431587328. Throughput: 0: 47464.4. Samples: 432954400. Policy #0 lag: (min: 0.0, avg: 35.4, max: 90.0) [2024-03-21 01:19:35,522][03784] Avg episode reward: [(0, '0.472')] [2024-03-21 01:19:37,353][04017] Updated weights for policy 0, policy_version 13175 (0.0010) [2024-03-21 01:19:40,521][03784] Fps is (10 sec: 45875.6, 60 sec: 42598.6, 300 sec: 46541.7). Total num frames: 431849472. Throughput: 0: 47200.0. Samples: 433229400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 98.0) [2024-03-21 01:19:40,522][03784] Avg episode reward: [(0, '0.283')] [2024-03-21 01:19:45,521][03784] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 46319.5). Total num frames: 432013312. Throughput: 0: 47217.8. Samples: 433372700. Policy #0 lag: (min: 0.0, avg: 38.5, max: 98.0) [2024-03-21 01:19:45,522][03784] Avg episode reward: [(0, '0.283')] [2024-03-21 01:19:47,020][04017] Updated weights for policy 0, policy_version 13185 (0.0011) [2024-03-21 01:19:50,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 46208.4). 
Total num frames: 432275456. Throughput: 0: 47091.1. Samples: 433655400. Policy #0 lag: (min: 0.0, avg: 32.7, max: 70.0) [2024-03-21 01:19:50,522][03784] Avg episode reward: [(0, '0.425')] [2024-03-21 01:19:51,883][04017] Updated weights for policy 0, policy_version 13195 (0.0012) [2024-03-21 01:19:55,521][03784] Fps is (10 sec: 55705.5, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 432570368. Throughput: 0: 46631.2. Samples: 433910800. Policy #0 lag: (min: 0.0, avg: 32.7, max: 70.0) [2024-03-21 01:19:55,522][03784] Avg episode reward: [(0, '0.934')] [2024-03-21 01:19:57,749][04017] Updated weights for policy 0, policy_version 13205 (0.0011) [2024-03-21 01:19:58,376][03995] Signal inference workers to stop experience collection... (8700 times) [2024-03-21 01:19:58,492][04017] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-03-21 01:19:58,601][03995] Signal inference workers to resume experience collection... (8700 times) [2024-03-21 01:19:58,601][04017] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-03-21 01:20:00,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48605.8, 300 sec: 45986.2). Total num frames: 432930816. Throughput: 0: 46564.2. Samples: 434052300. Policy #0 lag: (min: 0.0, avg: 37.8, max: 94.0) [2024-03-21 01:20:00,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 01:20:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013212_432930816.pth... [2024-03-21 01:20:00,692][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000012871_421756928.pth [2024-03-21 01:20:03,239][04017] Updated weights for policy 0, policy_version 13215 (0.0011) [2024-03-21 01:20:05,521][03784] Fps is (10 sec: 62259.4, 60 sec: 48605.9, 300 sec: 46319.5). Total num frames: 433192960. Throughput: 0: 46135.5. Samples: 434317200. 
Policy #0 lag: (min: 0.0, avg: 37.8, max: 94.0) [2024-03-21 01:20:05,522][03784] Avg episode reward: [(0, '1.062')] [2024-03-21 01:20:08,464][04017] Updated weights for policy 0, policy_version 13225 (0.0010) [2024-03-21 01:20:10,521][03784] Fps is (10 sec: 52429.4, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 433455104. Throughput: 0: 46035.7. Samples: 434602100. Policy #0 lag: (min: 2.0, avg: 52.8, max: 105.0) [2024-03-21 01:20:10,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 01:20:15,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 46541.6). Total num frames: 433651712. Throughput: 0: 46415.5. Samples: 434757900. Policy #0 lag: (min: 2.0, avg: 52.8, max: 105.0) [2024-03-21 01:20:15,522][03784] Avg episode reward: [(0, '0.527')] [2024-03-21 01:20:15,735][04017] Updated weights for policy 0, policy_version 13235 (0.0010) [2024-03-21 01:20:20,521][03784] Fps is (10 sec: 29491.2, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 433750016. Throughput: 0: 46806.7. Samples: 435060700. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 01:20:20,522][03784] Avg episode reward: [(0, '0.527')] [2024-03-21 01:20:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 47513.7, 300 sec: 46319.5). Total num frames: 433946624. Throughput: 0: 47359.9. Samples: 435360600. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 01:20:25,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 01:20:26,060][04017] Updated weights for policy 0, policy_version 13245 (0.0012) [2024-03-21 01:20:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 434176000. Throughput: 0: 47308.8. Samples: 435501600. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 01:20:30,522][03784] Avg episode reward: [(0, '0.378')] [2024-03-21 01:20:32,614][04017] Updated weights for policy 0, policy_version 13255 (0.0013) [2024-03-21 01:20:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.4, 300 sec: 46430.6). 
Total num frames: 434405376. Throughput: 0: 47242.2. Samples: 435781300. Policy #0 lag: (min: 0.0, avg: 46.9, max: 86.0) [2024-03-21 01:20:35,522][03784] Avg episode reward: [(0, '0.882')] [2024-03-21 01:20:40,521][03784] Fps is (10 sec: 45875.8, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 434634752. Throughput: 0: 48309.0. Samples: 436084700. Policy #0 lag: (min: 0.0, avg: 46.9, max: 86.0) [2024-03-21 01:20:40,522][03784] Avg episode reward: [(0, '0.475')] [2024-03-21 01:20:41,091][04017] Updated weights for policy 0, policy_version 13265 (0.0020) [2024-03-21 01:20:45,496][04017] Updated weights for policy 0, policy_version 13275 (0.0010) [2024-03-21 01:20:45,521][03784] Fps is (10 sec: 58982.9, 60 sec: 49698.1, 300 sec: 46541.7). Total num frames: 434995200. Throughput: 0: 48340.2. Samples: 436227600. Policy #0 lag: (min: 0.0, avg: 42.1, max: 93.0) [2024-03-21 01:20:45,522][03784] Avg episode reward: [(0, '0.963')] [2024-03-21 01:20:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 435093504. Throughput: 0: 49231.1. Samples: 436532600. Policy #0 lag: (min: 0.0, avg: 42.1, max: 93.0) [2024-03-21 01:20:50,522][03784] Avg episode reward: [(0, '0.397')] [2024-03-21 01:20:53,279][04017] Updated weights for policy 0, policy_version 13285 (0.0010) [2024-03-21 01:20:53,354][03995] Signal inference workers to stop experience collection... (8750 times) [2024-03-21 01:20:53,359][03995] Signal inference workers to resume experience collection... (8750 times) [2024-03-21 01:20:53,426][04017] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-03-21 01:20:53,426][04017] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-03-21 01:20:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 435453952. Throughput: 0: 48675.5. Samples: 436792500. 
Policy #0 lag: (min: 0.0, avg: 52.0, max: 111.0) [2024-03-21 01:20:55,522][03784] Avg episode reward: [(0, '1.067')] [2024-03-21 01:20:58,020][04017] Updated weights for policy 0, policy_version 13295 (0.0017) [2024-03-21 01:21:00,521][03784] Fps is (10 sec: 65535.4, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 435748864. Throughput: 0: 48302.2. Samples: 436931500. Policy #0 lag: (min: 0.0, avg: 52.0, max: 111.0) [2024-03-21 01:21:00,522][03784] Avg episode reward: [(0, '1.067')] [2024-03-21 01:21:04,444][04017] Updated weights for policy 0, policy_version 13305 (0.0014) [2024-03-21 01:21:05,521][03784] Fps is (10 sec: 55706.0, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 436011008. Throughput: 0: 47573.4. Samples: 437201500. Policy #0 lag: (min: 1.0, avg: 36.4, max: 95.0) [2024-03-21 01:21:05,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 01:21:10,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 436207616. Throughput: 0: 47382.2. Samples: 437492800. Policy #0 lag: (min: 1.0, avg: 36.4, max: 95.0) [2024-03-21 01:21:10,522][03784] Avg episode reward: [(0, '1.079')] [2024-03-21 01:21:13,629][04017] Updated weights for policy 0, policy_version 13315 (0.0015) [2024-03-21 01:21:15,521][03784] Fps is (10 sec: 39321.1, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 436404224. Throughput: 0: 47517.7. Samples: 437639900. Policy #0 lag: (min: 1.0, avg: 63.5, max: 112.0) [2024-03-21 01:21:15,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 01:21:18,380][04017] Updated weights for policy 0, policy_version 13325 (0.0012) [2024-03-21 01:21:20,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 46874.9). Total num frames: 436731904. Throughput: 0: 47211.1. Samples: 437905800. 
Policy #0 lag: (min: 1.0, avg: 63.5, max: 112.0) [2024-03-21 01:21:20,522][03784] Avg episode reward: [(0, '0.636')] [2024-03-21 01:21:25,521][03784] Fps is (10 sec: 52429.9, 60 sec: 49698.2, 300 sec: 46541.7). Total num frames: 436928512. Throughput: 0: 47008.9. Samples: 438200100. Policy #0 lag: (min: 0.0, avg: 67.6, max: 125.0) [2024-03-21 01:21:25,521][03784] Avg episode reward: [(0, '1.341')] [2024-03-21 01:21:27,473][04017] Updated weights for policy 0, policy_version 13335 (0.0015) [2024-03-21 01:21:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 49152.0, 300 sec: 46874.9). Total num frames: 437125120. Throughput: 0: 47186.6. Samples: 438351000. Policy #0 lag: (min: 0.0, avg: 67.6, max: 125.0) [2024-03-21 01:21:30,522][03784] Avg episode reward: [(0, '1.341')] [2024-03-21 01:21:35,521][03784] Fps is (10 sec: 26214.1, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 437190656. Throughput: 0: 46186.7. Samples: 438611000. Policy #0 lag: (min: 0.0, avg: 32.9, max: 76.0) [2024-03-21 01:21:35,522][03784] Avg episode reward: [(0, '1.256')] [2024-03-21 01:21:38,675][04017] Updated weights for policy 0, policy_version 13345 (0.0017) [2024-03-21 01:21:40,521][03784] Fps is (10 sec: 19661.0, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 437321728. Throughput: 0: 46589.0. Samples: 438889000. Policy #0 lag: (min: 0.0, avg: 32.9, max: 76.0) [2024-03-21 01:21:40,522][03784] Avg episode reward: [(0, '0.551')] [2024-03-21 01:21:44,529][04017] Updated weights for policy 0, policy_version 13355 (0.0012) [2024-03-21 01:21:45,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 437649408. Throughput: 0: 46466.7. Samples: 439022500. Policy #0 lag: (min: 2.0, avg: 25.8, max: 63.0) [2024-03-21 01:21:45,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 01:21:50,521][03784] Fps is (10 sec: 55704.9, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 437878784. Throughput: 0: 46919.9. Samples: 439312900. 
Policy #0 lag: (min: 2.0, avg: 25.8, max: 63.0) [2024-03-21 01:21:50,522][03784] Avg episode reward: [(0, '0.653')] [2024-03-21 01:21:51,151][04017] Updated weights for policy 0, policy_version 13365 (0.0011) [2024-03-21 01:21:55,521][03784] Fps is (10 sec: 45874.5, 60 sec: 44236.7, 300 sec: 46652.7). Total num frames: 438108160. Throughput: 0: 46082.1. Samples: 439566500. Policy #0 lag: (min: 0.0, avg: 42.1, max: 112.0) [2024-03-21 01:21:55,522][03784] Avg episode reward: [(0, '0.624')] [2024-03-21 01:21:55,737][03995] Signal inference workers to stop experience collection... (8800 times) [2024-03-21 01:21:55,809][04017] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-03-21 01:21:55,809][03995] Signal inference workers to resume experience collection... (8800 times) [2024-03-21 01:21:55,858][04017] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-03-21 01:21:57,107][04017] Updated weights for policy 0, policy_version 13375 (0.0015) [2024-03-21 01:22:00,521][03784] Fps is (10 sec: 52429.2, 60 sec: 44236.9, 300 sec: 46874.9). Total num frames: 438403072. Throughput: 0: 45342.3. Samples: 439680300. Policy #0 lag: (min: 0.0, avg: 42.1, max: 112.0) [2024-03-21 01:22:00,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-21 01:22:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013379_438403072.pth... [2024-03-21 01:22:00,663][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013039_427261952.pth [2024-03-21 01:22:03,397][04017] Updated weights for policy 0, policy_version 13385 (0.0015) [2024-03-21 01:22:05,521][03784] Fps is (10 sec: 65537.1, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 438763520. Throughput: 0: 45035.6. Samples: 439932400. 
Policy #0 lag: (min: 2.0, avg: 36.4, max: 90.0) [2024-03-21 01:22:05,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 01:22:08,859][04017] Updated weights for policy 0, policy_version 13395 (0.0019) [2024-03-21 01:22:10,521][03784] Fps is (10 sec: 52429.0, 60 sec: 45329.1, 300 sec: 46652.8). Total num frames: 438927360. Throughput: 0: 44373.3. Samples: 440196900. Policy #0 lag: (min: 2.0, avg: 36.4, max: 90.0) [2024-03-21 01:22:10,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 01:22:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.2, 300 sec: 46319.5). Total num frames: 439123968. Throughput: 0: 43984.6. Samples: 440330300. Policy #0 lag: (min: 1.0, avg: 41.3, max: 85.0) [2024-03-21 01:22:15,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 01:22:18,884][04017] Updated weights for policy 0, policy_version 13405 (0.0020) [2024-03-21 01:22:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 46874.9). Total num frames: 439320576. Throughput: 0: 43768.9. Samples: 440580600. Policy #0 lag: (min: 1.0, avg: 41.3, max: 85.0) [2024-03-21 01:22:20,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-21 01:22:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 42598.4, 300 sec: 46097.4). Total num frames: 439484416. Throughput: 0: 44308.9. Samples: 440882900. Policy #0 lag: (min: 1.0, avg: 41.3, max: 85.0) [2024-03-21 01:22:25,522][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 01:22:26,641][04017] Updated weights for policy 0, policy_version 13415 (0.0011) [2024-03-21 01:22:30,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 439844864. Throughput: 0: 44157.8. Samples: 441009600. Policy #0 lag: (min: 0.0, avg: 50.9, max: 109.0) [2024-03-21 01:22:30,522][03784] Avg episode reward: [(0, '0.587')] [2024-03-21 01:22:31,745][04017] Updated weights for policy 0, policy_version 13425 (0.0020) [2024-03-21 01:22:35,521][03784] Fps is (10 sec: 55705.5, 60 sec: 47513.6, 300 sec: 45764.1). 
Total num frames: 440041472. Throughput: 0: 44137.9. Samples: 441299100. Policy #0 lag: (min: 0.0, avg: 50.9, max: 109.0) [2024-03-21 01:22:35,522][03784] Avg episode reward: [(0, '0.938')] [2024-03-21 01:22:39,169][04017] Updated weights for policy 0, policy_version 13435 (0.0024) [2024-03-21 01:22:40,521][03784] Fps is (10 sec: 45875.4, 60 sec: 49698.1, 300 sec: 46319.5). Total num frames: 440303616. Throughput: 0: 44811.3. Samples: 441583000. Policy #0 lag: (min: 0.0, avg: 37.4, max: 82.0) [2024-03-21 01:22:40,522][03784] Avg episode reward: [(0, '0.925')] [2024-03-21 01:22:45,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.3, 300 sec: 46097.4). Total num frames: 440401920. Throughput: 0: 45806.8. Samples: 441741600. Policy #0 lag: (min: 0.0, avg: 37.4, max: 82.0) [2024-03-21 01:22:45,522][03784] Avg episode reward: [(0, '0.925')] [2024-03-21 01:22:50,174][04017] Updated weights for policy 0, policy_version 13445 (0.0012) [2024-03-21 01:22:50,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 440565760. Throughput: 0: 46155.5. Samples: 442009400. Policy #0 lag: (min: 0.0, avg: 30.8, max: 65.0) [2024-03-21 01:22:50,522][03784] Avg episode reward: [(0, '1.153')] [2024-03-21 01:22:55,521][03784] Fps is (10 sec: 32767.4, 60 sec: 43690.7, 300 sec: 45653.0). Total num frames: 440729600. Throughput: 0: 45817.7. Samples: 442258700. Policy #0 lag: (min: 0.0, avg: 30.8, max: 65.0) [2024-03-21 01:22:55,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 01:22:59,288][04017] Updated weights for policy 0, policy_version 13455 (0.0011) [2024-03-21 01:22:59,881][03995] Signal inference workers to stop experience collection... (8850 times) [2024-03-21 01:22:59,959][03995] Signal inference workers to resume experience collection... 
(8850 times) [2024-03-21 01:22:59,964][04017] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-03-21 01:23:00,002][04017] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-03-21 01:23:00,521][03784] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 46208.4). Total num frames: 440991744. Throughput: 0: 45624.3. Samples: 442383400. Policy #0 lag: (min: 0.0, avg: 28.6, max: 61.0) [2024-03-21 01:23:00,522][03784] Avg episode reward: [(0, '0.427')] [2024-03-21 01:23:04,855][04017] Updated weights for policy 0, policy_version 13465 (0.0013) [2024-03-21 01:23:05,521][03784] Fps is (10 sec: 55706.0, 60 sec: 42052.3, 300 sec: 46319.5). Total num frames: 441286656. Throughput: 0: 45722.2. Samples: 442638100. Policy #0 lag: (min: 0.0, avg: 28.6, max: 61.0) [2024-03-21 01:23:05,522][03784] Avg episode reward: [(0, '1.406')] [2024-03-21 01:23:10,512][04017] Updated weights for policy 0, policy_version 13475 (0.0015) [2024-03-21 01:23:10,521][03784] Fps is (10 sec: 55706.0, 60 sec: 43690.7, 300 sec: 46652.7). Total num frames: 441548800. Throughput: 0: 44786.7. Samples: 442898300. Policy #0 lag: (min: 0.0, avg: 32.8, max: 65.0) [2024-03-21 01:23:10,522][03784] Avg episode reward: [(0, '0.834')] [2024-03-21 01:23:15,498][04017] Updated weights for policy 0, policy_version 13485 (0.0011) [2024-03-21 01:23:15,521][03784] Fps is (10 sec: 58982.7, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 441876480. Throughput: 0: 45224.5. Samples: 443044700. Policy #0 lag: (min: 0.0, avg: 32.8, max: 65.0) [2024-03-21 01:23:15,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 01:23:20,521][03784] Fps is (10 sec: 49151.6, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 442040320. Throughput: 0: 45319.9. Samples: 443338500. 
Policy #0 lag: (min: 0.0, avg: 34.7, max: 107.0) [2024-03-21 01:23:20,522][03784] Avg episode reward: [(0, '0.437')] [2024-03-21 01:23:22,042][04017] Updated weights for policy 0, policy_version 13495 (0.0015) [2024-03-21 01:23:25,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48605.8, 300 sec: 47208.2). Total num frames: 442400768. Throughput: 0: 45111.1. Samples: 443613000. Policy #0 lag: (min: 0.0, avg: 34.7, max: 107.0) [2024-03-21 01:23:25,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 01:23:29,984][04017] Updated weights for policy 0, policy_version 13505 (0.0021) [2024-03-21 01:23:30,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45329.0, 300 sec: 46541.7). Total num frames: 442564608. Throughput: 0: 45064.2. Samples: 443769500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 107.0) [2024-03-21 01:23:30,522][03784] Avg episode reward: [(0, '1.128')] [2024-03-21 01:23:34,452][04017] Updated weights for policy 0, policy_version 13515 (0.0012) [2024-03-21 01:23:35,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 442859520. Throughput: 0: 45566.6. Samples: 444059900. Policy #0 lag: (min: 2.0, avg: 48.6, max: 93.0) [2024-03-21 01:23:35,522][03784] Avg episode reward: [(0, '0.591')] [2024-03-21 01:23:40,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 443056128. Throughput: 0: 46897.8. Samples: 444369100. Policy #0 lag: (min: 2.0, avg: 48.6, max: 93.0) [2024-03-21 01:23:40,522][03784] Avg episode reward: [(0, '0.798')] [2024-03-21 01:23:42,150][04017] Updated weights for policy 0, policy_version 13525 (0.0028) [2024-03-21 01:23:43,801][03995] Signal inference workers to stop experience collection... (8900 times) [2024-03-21 01:23:43,867][04017] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-03-21 01:23:43,870][03995] Signal inference workers to resume experience collection... 
(8900 times) [2024-03-21 01:23:43,917][04017] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-03-21 01:23:45,521][03784] Fps is (10 sec: 45875.8, 60 sec: 48605.8, 300 sec: 46652.8). Total num frames: 443318272. Throughput: 0: 47397.9. Samples: 444516300. Policy #0 lag: (min: 2.0, avg: 65.8, max: 113.0) [2024-03-21 01:23:45,522][03784] Avg episode reward: [(0, '0.798')] [2024-03-21 01:23:47,957][04017] Updated weights for policy 0, policy_version 13535 (0.0015) [2024-03-21 01:23:50,521][03784] Fps is (10 sec: 58982.2, 60 sec: 51336.5, 300 sec: 46874.9). Total num frames: 443645952. Throughput: 0: 47786.6. Samples: 444788500. Policy #0 lag: (min: 2.0, avg: 65.8, max: 113.0) [2024-03-21 01:23:50,522][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 01:23:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 50790.5, 300 sec: 46652.8). Total num frames: 443777024. Throughput: 0: 49008.9. Samples: 445103700. Policy #0 lag: (min: 0.0, avg: 27.4, max: 68.0) [2024-03-21 01:23:55,522][03784] Avg episode reward: [(0, '1.043')] [2024-03-21 01:23:55,937][04017] Updated weights for policy 0, policy_version 13545 (0.0011) [2024-03-21 01:24:00,521][03784] Fps is (10 sec: 32768.3, 60 sec: 49698.2, 300 sec: 46430.6). Total num frames: 443973632. Throughput: 0: 48982.2. Samples: 445248900. Policy #0 lag: (min: 0.0, avg: 27.4, max: 68.0) [2024-03-21 01:24:00,522][03784] Avg episode reward: [(0, '1.043')] [2024-03-21 01:24:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013550_444006400.pth... [2024-03-21 01:24:00,636][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013212_432930816.pth [2024-03-21 01:24:05,521][03784] Fps is (10 sec: 36044.5, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 444137472. Throughput: 0: 49077.8. Samples: 445547000. 
Policy #0 lag: (min: 0.0, avg: 28.4, max: 97.0) [2024-03-21 01:24:05,522][03784] Avg episode reward: [(0, '0.403')] [2024-03-21 01:24:05,852][04017] Updated weights for policy 0, policy_version 13555 (0.0014) [2024-03-21 01:24:10,521][03784] Fps is (10 sec: 42598.2, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 444399616. Throughput: 0: 49457.7. Samples: 445838600. Policy #0 lag: (min: 0.0, avg: 28.4, max: 97.0) [2024-03-21 01:24:10,522][03784] Avg episode reward: [(0, '0.403')] [2024-03-21 01:24:11,563][04017] Updated weights for policy 0, policy_version 13565 (0.0011) [2024-03-21 01:24:15,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 444596224. Throughput: 0: 49044.6. Samples: 445976500. Policy #0 lag: (min: 0.0, avg: 28.4, max: 97.0) [2024-03-21 01:24:15,522][03784] Avg episode reward: [(0, '0.863')] [2024-03-21 01:24:20,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 444792832. Throughput: 0: 49037.7. Samples: 446266600. Policy #0 lag: (min: 0.0, avg: 42.1, max: 95.0) [2024-03-21 01:24:20,522][03784] Avg episode reward: [(0, '0.714')] [2024-03-21 01:24:20,678][04017] Updated weights for policy 0, policy_version 13575 (0.0019) [2024-03-21 01:24:24,566][04017] Updated weights for policy 0, policy_version 13585 (0.0020) [2024-03-21 01:24:25,521][03784] Fps is (10 sec: 58981.7, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 445186048. Throughput: 0: 48108.8. Samples: 446534000. Policy #0 lag: (min: 0.0, avg: 42.1, max: 95.0) [2024-03-21 01:24:25,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 01:24:29,731][04017] Updated weights for policy 0, policy_version 13595 (0.0012) [2024-03-21 01:24:30,521][03784] Fps is (10 sec: 72090.7, 60 sec: 49152.1, 300 sec: 47208.1). Total num frames: 445513728. Throughput: 0: 47695.5. Samples: 446662600. 
Policy #0 lag: (min: 0.0, avg: 33.8, max: 97.0) [2024-03-21 01:24:30,522][03784] Avg episode reward: [(0, '0.820')] [2024-03-21 01:24:35,521][03784] Fps is (10 sec: 52429.8, 60 sec: 47513.7, 300 sec: 46986.0). Total num frames: 445710336. Throughput: 0: 47533.5. Samples: 446927500. Policy #0 lag: (min: 0.0, avg: 33.8, max: 97.0) [2024-03-21 01:24:35,521][03784] Avg episode reward: [(0, '0.985')] [2024-03-21 01:24:37,365][03995] Signal inference workers to stop experience collection... (8950 times) [2024-03-21 01:24:37,456][04017] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-03-21 01:24:37,610][03995] Signal inference workers to resume experience collection... (8950 times) [2024-03-21 01:24:37,610][04017] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-03-21 01:24:37,612][04017] Updated weights for policy 0, policy_version 13605 (0.0014) [2024-03-21 01:24:40,521][03784] Fps is (10 sec: 52428.6, 60 sec: 49698.2, 300 sec: 47541.4). Total num frames: 446038016. Throughput: 0: 46713.3. Samples: 447205800. Policy #0 lag: (min: 1.0, avg: 61.0, max: 121.0) [2024-03-21 01:24:40,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 01:24:42,976][04017] Updated weights for policy 0, policy_version 13615 (0.0017) [2024-03-21 01:24:45,521][03784] Fps is (10 sec: 49151.5, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 446201856. Throughput: 0: 46828.9. Samples: 447356200. Policy #0 lag: (min: 1.0, avg: 61.0, max: 121.0) [2024-03-21 01:24:45,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 01:24:49,125][04017] Updated weights for policy 0, policy_version 13625 (0.0028) [2024-03-21 01:24:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 446529536. Throughput: 0: 46366.8. Samples: 447633500. 
Policy #0 lag: (min: 2.0, avg: 52.8, max: 119.0) [2024-03-21 01:24:50,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 01:24:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 46652.8). Total num frames: 446693376. Throughput: 0: 46300.0. Samples: 447922100. Policy #0 lag: (min: 2.0, avg: 52.8, max: 119.0) [2024-03-21 01:24:55,522][03784] Avg episode reward: [(0, '1.370')] [2024-03-21 01:24:59,393][04017] Updated weights for policy 0, policy_version 13635 (0.0011) [2024-03-21 01:25:00,521][03784] Fps is (10 sec: 26214.3, 60 sec: 46967.4, 300 sec: 46097.3). Total num frames: 446791680. Throughput: 0: 46255.5. Samples: 448058000. Policy #0 lag: (min: 0.0, avg: 36.0, max: 80.0) [2024-03-21 01:25:00,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 01:25:05,521][03784] Fps is (10 sec: 32767.8, 60 sec: 48059.7, 300 sec: 45986.3). Total num frames: 447021056. Throughput: 0: 45973.4. Samples: 448335400. Policy #0 lag: (min: 0.0, avg: 36.0, max: 80.0) [2024-03-21 01:25:05,522][03784] Avg episode reward: [(0, '1.149')] [2024-03-21 01:25:08,319][04017] Updated weights for policy 0, policy_version 13645 (0.0018) [2024-03-21 01:25:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 447152128. Throughput: 0: 46460.0. Samples: 448624700. Policy #0 lag: (min: 0.0, avg: 36.0, max: 80.0) [2024-03-21 01:25:10,522][03784] Avg episode reward: [(0, '1.572')] [2024-03-21 01:25:15,521][03784] Fps is (10 sec: 32768.3, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 447348736. Throughput: 0: 46813.3. Samples: 448769200. Policy #0 lag: (min: 0.0, avg: 30.0, max: 72.0) [2024-03-21 01:25:15,522][03784] Avg episode reward: [(0, '0.519')] [2024-03-21 01:25:16,445][04017] Updated weights for policy 0, policy_version 13655 (0.0016) [2024-03-21 01:25:20,521][03784] Fps is (10 sec: 58982.6, 60 sec: 49152.1, 300 sec: 46763.8). Total num frames: 447741952. Throughput: 0: 46868.7. Samples: 449036600. 
Policy #0 lag: (min: 0.0, avg: 30.0, max: 72.0) [2024-03-21 01:25:20,522][03784] Avg episode reward: [(0, '0.519')] [2024-03-21 01:25:21,699][04017] Updated weights for policy 0, policy_version 13665 (0.0020) [2024-03-21 01:25:25,521][03784] Fps is (10 sec: 55704.9, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 447905792. Throughput: 0: 46368.8. Samples: 449292400. Policy #0 lag: (min: 0.0, avg: 38.0, max: 113.0) [2024-03-21 01:25:25,522][03784] Avg episode reward: [(0, '0.778')] [2024-03-21 01:25:27,903][04017] Updated weights for policy 0, policy_version 13675 (0.0018) [2024-03-21 01:25:30,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 448233472. Throughput: 0: 45648.8. Samples: 449410400. Policy #0 lag: (min: 0.0, avg: 38.0, max: 113.0) [2024-03-21 01:25:30,522][03784] Avg episode reward: [(0, '0.965')] [2024-03-21 01:25:35,166][04017] Updated weights for policy 0, policy_version 13685 (0.0012) [2024-03-21 01:25:35,521][03784] Fps is (10 sec: 55705.6, 60 sec: 45875.1, 300 sec: 46874.9). Total num frames: 448462848. Throughput: 0: 45957.7. Samples: 449701600. Policy #0 lag: (min: 0.0, avg: 44.9, max: 81.0) [2024-03-21 01:25:35,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 01:25:36,688][03995] Signal inference workers to stop experience collection... (9000 times) [2024-03-21 01:25:36,689][03995] Signal inference workers to resume experience collection... (9000 times) [2024-03-21 01:25:36,731][04017] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-03-21 01:25:36,778][04017] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-03-21 01:25:40,401][04017] Updated weights for policy 0, policy_version 13695 (0.0011) [2024-03-21 01:25:40,521][03784] Fps is (10 sec: 52429.2, 60 sec: 45329.1, 300 sec: 46652.7). Total num frames: 448757760. Throughput: 0: 45551.1. Samples: 449971900. 
Policy #0 lag: (min: 0.0, avg: 44.9, max: 81.0) [2024-03-21 01:25:40,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 01:25:45,521][03784] Fps is (10 sec: 49152.5, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 448954368. Throughput: 0: 45631.2. Samples: 450111400. Policy #0 lag: (min: 1.0, avg: 47.8, max: 100.0) [2024-03-21 01:25:45,522][03784] Avg episode reward: [(0, '0.760')] [2024-03-21 01:25:48,087][04017] Updated weights for policy 0, policy_version 13705 (0.0018) [2024-03-21 01:25:50,521][03784] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 46430.6). Total num frames: 449150976. Throughput: 0: 45102.2. Samples: 450365000. Policy #0 lag: (min: 1.0, avg: 47.8, max: 100.0) [2024-03-21 01:25:50,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 01:25:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 46208.4). Total num frames: 449380352. Throughput: 0: 44793.4. Samples: 450640400. Policy #0 lag: (min: 1.0, avg: 31.5, max: 75.0) [2024-03-21 01:25:55,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 01:25:58,482][04017] Updated weights for policy 0, policy_version 13715 (0.0010) [2024-03-21 01:26:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 449478656. Throughput: 0: 45253.2. Samples: 450805600. Policy #0 lag: (min: 1.0, avg: 31.5, max: 75.0) [2024-03-21 01:26:00,522][03784] Avg episode reward: [(0, '0.742')] [2024-03-21 01:26:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013717_449478656.pth... [2024-03-21 01:26:00,686][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013379_438403072.pth [2024-03-21 01:26:03,830][04017] Updated weights for policy 0, policy_version 13725 (0.0012) [2024-03-21 01:26:05,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 449773568. Throughput: 0: 45202.2. Samples: 451070700. 
Policy #0 lag: (min: 4.0, avg: 32.6, max: 67.0) [2024-03-21 01:26:05,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 01:26:09,890][04017] Updated weights for policy 0, policy_version 13735 (0.0009) [2024-03-21 01:26:10,521][03784] Fps is (10 sec: 62258.8, 60 sec: 49151.9, 300 sec: 46430.6). Total num frames: 450101248. Throughput: 0: 45277.7. Samples: 451329900. Policy #0 lag: (min: 4.0, avg: 32.6, max: 67.0) [2024-03-21 01:26:10,523][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 01:26:15,521][03784] Fps is (10 sec: 52429.6, 60 sec: 49152.0, 300 sec: 45986.3). Total num frames: 450297856. Throughput: 0: 45906.8. Samples: 451476200. Policy #0 lag: (min: 0.0, avg: 33.7, max: 70.0) [2024-03-21 01:26:15,522][03784] Avg episode reward: [(0, '1.059')] [2024-03-21 01:26:20,521][03784] Fps is (10 sec: 19661.0, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 450297856. Throughput: 0: 45453.4. Samples: 451747000. Policy #0 lag: (min: 0.0, avg: 33.7, max: 70.0) [2024-03-21 01:26:20,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 01:26:21,655][04017] Updated weights for policy 0, policy_version 13745 (0.0020) [2024-03-21 01:26:25,521][03784] Fps is (10 sec: 29490.7, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 450592768. Throughput: 0: 45786.6. Samples: 452032300. Policy #0 lag: (min: 0.0, avg: 33.7, max: 70.0) [2024-03-21 01:26:25,522][03784] Avg episode reward: [(0, '0.923')] [2024-03-21 01:26:30,521][03784] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 45764.1). Total num frames: 450691072. Throughput: 0: 46142.2. Samples: 452187800. Policy #0 lag: (min: 0.0, avg: 33.1, max: 70.0) [2024-03-21 01:26:30,522][03784] Avg episode reward: [(0, '1.291')] [2024-03-21 01:26:30,698][04017] Updated weights for policy 0, policy_version 13755 (0.0011) [2024-03-21 01:26:32,840][03995] Signal inference workers to stop experience collection... 
(9050 times) [2024-03-21 01:26:32,946][04017] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-03-21 01:26:33,055][03995] Signal inference workers to resume experience collection... (9050 times) [2024-03-21 01:26:33,055][04017] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-03-21 01:26:33,662][04017] Updated weights for policy 0, policy_version 13765 (0.0017) [2024-03-21 01:26:35,521][03784] Fps is (10 sec: 52429.4, 60 sec: 44236.9, 300 sec: 46763.8). Total num frames: 451117056. Throughput: 0: 45573.4. Samples: 452415800. Policy #0 lag: (min: 0.0, avg: 33.1, max: 70.0) [2024-03-21 01:26:35,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 01:26:39,685][04017] Updated weights for policy 0, policy_version 13775 (0.0013) [2024-03-21 01:26:40,521][03784] Fps is (10 sec: 68813.0, 60 sec: 43690.6, 300 sec: 46541.7). Total num frames: 451379200. Throughput: 0: 45424.4. Samples: 452684500. Policy #0 lag: (min: 2.0, avg: 37.3, max: 71.0) [2024-03-21 01:26:40,522][03784] Avg episode reward: [(0, '0.842')] [2024-03-21 01:26:45,521][03784] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 46430.6). Total num frames: 451575808. Throughput: 0: 44995.5. Samples: 452830400. Policy #0 lag: (min: 2.0, avg: 37.3, max: 71.0) [2024-03-21 01:26:45,522][03784] Avg episode reward: [(0, '0.842')] [2024-03-21 01:26:46,802][04017] Updated weights for policy 0, policy_version 13785 (0.0015) [2024-03-21 01:26:50,521][03784] Fps is (10 sec: 58982.8, 60 sec: 46967.6, 300 sec: 46986.0). Total num frames: 451969024. Throughput: 0: 44815.7. Samples: 453087400. Policy #0 lag: (min: 0.0, avg: 56.6, max: 105.0) [2024-03-21 01:26:50,522][03784] Avg episode reward: [(0, '1.363')] [2024-03-21 01:26:52,488][04017] Updated weights for policy 0, policy_version 13795 (0.0012) [2024-03-21 01:26:55,521][03784] Fps is (10 sec: 55706.2, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 452132864. Throughput: 0: 45962.4. Samples: 453398200. 
Policy #0 lag: (min: 0.0, avg: 56.6, max: 105.0) [2024-03-21 01:26:55,522][03784] Avg episode reward: [(0, '0.792')] [2024-03-21 01:27:00,055][04017] Updated weights for policy 0, policy_version 13805 (0.0024) [2024-03-21 01:27:00,521][03784] Fps is (10 sec: 42598.3, 60 sec: 48606.0, 300 sec: 46208.4). Total num frames: 452395008. Throughput: 0: 46128.8. Samples: 453552000. Policy #0 lag: (min: 4.0, avg: 53.7, max: 104.0) [2024-03-21 01:27:00,522][03784] Avg episode reward: [(0, '0.792')] [2024-03-21 01:27:05,204][04017] Updated weights for policy 0, policy_version 13815 (0.0020) [2024-03-21 01:27:05,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48606.0, 300 sec: 46652.8). Total num frames: 452689920. Throughput: 0: 46329.0. Samples: 453831800. Policy #0 lag: (min: 4.0, avg: 53.7, max: 104.0) [2024-03-21 01:27:05,522][03784] Avg episode reward: [(0, '0.792')] [2024-03-21 01:27:10,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44783.1, 300 sec: 46319.5). Total num frames: 452788224. Throughput: 0: 46871.3. Samples: 454141500. Policy #0 lag: (min: 4.0, avg: 53.7, max: 104.0) [2024-03-21 01:27:10,521][03784] Avg episode reward: [(0, '0.792')] [2024-03-21 01:27:13,679][04017] Updated weights for policy 0, policy_version 13825 (0.0012) [2024-03-21 01:27:15,521][03784] Fps is (10 sec: 36044.1, 60 sec: 45875.0, 300 sec: 46541.6). Total num frames: 453050368. Throughput: 0: 46473.2. Samples: 454279100. Policy #0 lag: (min: 2.0, avg: 45.8, max: 90.0) [2024-03-21 01:27:15,522][03784] Avg episode reward: [(0, '0.695')] [2024-03-21 01:27:20,312][03995] Signal inference workers to stop experience collection... (9100 times) [2024-03-21 01:27:20,312][03995] Signal inference workers to resume experience collection... 
(9100 times) [2024-03-21 01:27:20,352][04017] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-03-21 01:27:20,359][04017] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-03-21 01:27:20,521][03784] Fps is (10 sec: 49150.9, 60 sec: 49698.1, 300 sec: 46763.8). Total num frames: 453279744. Throughput: 0: 47575.4. Samples: 454556700. Policy #0 lag: (min: 2.0, avg: 45.8, max: 90.0) [2024-03-21 01:27:20,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 01:27:20,882][04017] Updated weights for policy 0, policy_version 13835 (0.0014) [2024-03-21 01:27:25,522][03784] Fps is (10 sec: 45874.1, 60 sec: 48605.6, 300 sec: 46319.5). Total num frames: 453509120. Throughput: 0: 47930.7. Samples: 454841400. Policy #0 lag: (min: 2.0, avg: 37.0, max: 84.0) [2024-03-21 01:27:25,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 01:27:28,252][04017] Updated weights for policy 0, policy_version 13845 (0.0014) [2024-03-21 01:27:30,522][03784] Fps is (10 sec: 42596.9, 60 sec: 50243.9, 300 sec: 46319.4). Total num frames: 453705728. Throughput: 0: 47566.2. Samples: 454970900. Policy #0 lag: (min: 2.0, avg: 37.0, max: 84.0) [2024-03-21 01:27:30,522][03784] Avg episode reward: [(0, '0.616')] [2024-03-21 01:27:35,521][03784] Fps is (10 sec: 36046.3, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 453869568. Throughput: 0: 47953.3. Samples: 455245300. Policy #0 lag: (min: 2.0, avg: 25.6, max: 52.0) [2024-03-21 01:27:35,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 01:27:40,398][04017] Updated weights for policy 0, policy_version 13855 (0.0021) [2024-03-21 01:27:40,521][03784] Fps is (10 sec: 29492.5, 60 sec: 43690.6, 300 sec: 46097.3). Total num frames: 454000640. Throughput: 0: 47586.6. Samples: 455539600. 
Policy #0 lag: (min: 2.0, avg: 25.6, max: 52.0) [2024-03-21 01:27:40,522][03784] Avg episode reward: [(0, '0.988')] [2024-03-21 01:27:45,521][03784] Fps is (10 sec: 32767.7, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 454197248. Throughput: 0: 47297.7. Samples: 455680400. Policy #0 lag: (min: 0.0, avg: 33.4, max: 72.0) [2024-03-21 01:27:45,522][03784] Avg episode reward: [(0, '0.988')] [2024-03-21 01:27:48,180][04017] Updated weights for policy 0, policy_version 13865 (0.0015) [2024-03-21 01:27:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 42598.3, 300 sec: 46763.8). Total num frames: 454524928. Throughput: 0: 46397.7. Samples: 455919700. Policy #0 lag: (min: 0.0, avg: 33.4, max: 72.0) [2024-03-21 01:27:50,522][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 01:27:51,764][04017] Updated weights for policy 0, policy_version 13875 (0.0011) [2024-03-21 01:27:55,521][03784] Fps is (10 sec: 65536.1, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 454852608. Throughput: 0: 45331.0. Samples: 456181400. Policy #0 lag: (min: 0.0, avg: 33.4, max: 72.0) [2024-03-21 01:27:55,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 01:27:58,703][04017] Updated weights for policy 0, policy_version 13885 (0.0014) [2024-03-21 01:28:00,521][03784] Fps is (10 sec: 52428.7, 60 sec: 44236.7, 300 sec: 46652.7). Total num frames: 455049216. Throughput: 0: 45440.1. Samples: 456323900. Policy #0 lag: (min: 0.0, avg: 39.0, max: 92.0) [2024-03-21 01:28:00,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 01:28:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013887_455049216.pth... [2024-03-21 01:28:00,643][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013550_444006400.pth [2024-03-21 01:28:04,437][04017] Updated weights for policy 0, policy_version 13895 (0.0013) [2024-03-21 01:28:05,521][03784] Fps is (10 sec: 55704.9, 60 sec: 45328.9, 300 sec: 46986.0). 
Total num frames: 455409664. Throughput: 0: 44773.3. Samples: 456571500. Policy #0 lag: (min: 0.0, avg: 39.0, max: 92.0) [2024-03-21 01:28:05,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 01:28:08,466][03995] Signal inference workers to stop experience collection... (9150 times) [2024-03-21 01:28:08,469][03995] Signal inference workers to resume experience collection... (9150 times) [2024-03-21 01:28:08,471][04017] Updated weights for policy 0, policy_version 13905 (0.0018) [2024-03-21 01:28:08,511][04017] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-03-21 01:28:08,551][04017] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-03-21 01:28:10,521][03784] Fps is (10 sec: 65536.5, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 455704576. Throughput: 0: 44751.5. Samples: 456855200. Policy #0 lag: (min: 0.0, avg: 41.0, max: 79.0) [2024-03-21 01:28:10,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 01:28:14,146][04017] Updated weights for policy 0, policy_version 13915 (0.0010) [2024-03-21 01:28:15,521][03784] Fps is (10 sec: 62260.0, 60 sec: 49698.2, 300 sec: 47430.3). Total num frames: 456032256. Throughput: 0: 44958.3. Samples: 456994000. Policy #0 lag: (min: 0.0, avg: 41.0, max: 79.0) [2024-03-21 01:28:15,522][03784] Avg episode reward: [(0, '1.092')] [2024-03-21 01:28:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.8, 300 sec: 46652.7). Total num frames: 456163328. Throughput: 0: 45459.9. Samples: 457291000. Policy #0 lag: (min: 0.0, avg: 50.1, max: 116.0) [2024-03-21 01:28:20,522][03784] Avg episode reward: [(0, '0.524')] [2024-03-21 01:28:21,928][04017] Updated weights for policy 0, policy_version 13925 (0.0010) [2024-03-21 01:28:25,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48060.0, 300 sec: 46874.9). Total num frames: 456392704. Throughput: 0: 45302.3. Samples: 457578200. 
Policy #0 lag: (min: 0.0, avg: 50.1, max: 116.0) [2024-03-21 01:28:25,522][03784] Avg episode reward: [(0, '1.097')] [2024-03-21 01:28:30,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46421.7, 300 sec: 46208.4). Total num frames: 456491008. Throughput: 0: 45582.2. Samples: 457731600. Policy #0 lag: (min: 0.0, avg: 50.1, max: 116.0) [2024-03-21 01:28:30,522][03784] Avg episode reward: [(0, '0.767')] [2024-03-21 01:28:35,263][04017] Updated weights for policy 0, policy_version 13935 (0.0011) [2024-03-21 01:28:35,521][03784] Fps is (10 sec: 22937.7, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 456622080. Throughput: 0: 46966.8. Samples: 458033200. Policy #0 lag: (min: 0.0, avg: 40.7, max: 86.0) [2024-03-21 01:28:35,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 01:28:40,521][03784] Fps is (10 sec: 39321.9, 60 sec: 48059.8, 300 sec: 45986.3). Total num frames: 456884224. Throughput: 0: 46973.4. Samples: 458295200. Policy #0 lag: (min: 0.0, avg: 40.7, max: 86.0) [2024-03-21 01:28:40,522][03784] Avg episode reward: [(0, '0.642')] [2024-03-21 01:28:40,893][04017] Updated weights for policy 0, policy_version 13945 (0.0011) [2024-03-21 01:28:45,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 45653.1). Total num frames: 457113600. Throughput: 0: 46635.6. Samples: 458422500. Policy #0 lag: (min: 2.0, avg: 27.0, max: 62.0) [2024-03-21 01:28:45,522][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 01:28:50,090][04017] Updated weights for policy 0, policy_version 13955 (0.0020) [2024-03-21 01:28:50,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 457310208. Throughput: 0: 47229.0. Samples: 458696800. Policy #0 lag: (min: 2.0, avg: 27.0, max: 62.0) [2024-03-21 01:28:50,522][03784] Avg episode reward: [(0, '0.651')] [2024-03-21 01:28:55,094][04017] Updated weights for policy 0, policy_version 13965 (0.0022) [2024-03-21 01:28:55,521][03784] Fps is (10 sec: 49150.9, 60 sec: 45875.0, 300 sec: 46208.4). 
Total num frames: 457605120. Throughput: 0: 46706.4. Samples: 458957000. Policy #0 lag: (min: 0.0, avg: 35.9, max: 95.0) [2024-03-21 01:28:55,523][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 01:29:00,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.4, 300 sec: 46541.7). Total num frames: 457867264. Throughput: 0: 46706.6. Samples: 459095800. Policy #0 lag: (min: 0.0, avg: 35.9, max: 95.0) [2024-03-21 01:29:00,531][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 01:29:01,132][04017] Updated weights for policy 0, policy_version 13975 (0.0011) [2024-03-21 01:29:05,521][03784] Fps is (10 sec: 45876.4, 60 sec: 44236.9, 300 sec: 46319.5). Total num frames: 458063872. Throughput: 0: 46080.0. Samples: 459364600. Policy #0 lag: (min: 0.0, avg: 39.9, max: 87.0) [2024-03-21 01:29:05,530][03784] Avg episode reward: [(0, '1.138')] [2024-03-21 01:29:08,993][04017] Updated weights for policy 0, policy_version 13985 (0.0013) [2024-03-21 01:29:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 458391552. Throughput: 0: 45142.2. Samples: 459609600. Policy #0 lag: (min: 0.0, avg: 39.9, max: 87.0) [2024-03-21 01:29:10,522][03784] Avg episode reward: [(0, '0.596')] [2024-03-21 01:29:12,968][03995] Signal inference workers to stop experience collection... (9200 times) [2024-03-21 01:29:12,969][03995] Signal inference workers to resume experience collection... (9200 times) [2024-03-21 01:29:13,038][04017] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-03-21 01:29:13,039][04017] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-03-21 01:29:15,521][03784] Fps is (10 sec: 49152.2, 60 sec: 42052.3, 300 sec: 46652.8). Total num frames: 458555392. Throughput: 0: 45253.5. Samples: 459768000. 
Policy #0 lag: (min: 0.0, avg: 43.7, max: 87.0) [2024-03-21 01:29:15,530][03784] Avg episode reward: [(0, '0.596')] [2024-03-21 01:29:15,581][04017] Updated weights for policy 0, policy_version 13995 (0.0009) [2024-03-21 01:29:20,521][03784] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 458784768. Throughput: 0: 44982.2. Samples: 460057400. Policy #0 lag: (min: 0.0, avg: 43.7, max: 87.0) [2024-03-21 01:29:20,522][03784] Avg episode reward: [(0, '0.789')] [2024-03-21 01:29:22,660][04017] Updated weights for policy 0, policy_version 14005 (0.0011) [2024-03-21 01:29:25,521][03784] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 459079680. Throughput: 0: 45682.2. Samples: 460350900. Policy #0 lag: (min: 0.0, avg: 43.7, max: 87.0) [2024-03-21 01:29:25,522][03784] Avg episode reward: [(0, '0.789')] [2024-03-21 01:29:27,008][04017] Updated weights for policy 0, policy_version 14015 (0.0011) [2024-03-21 01:29:30,521][03784] Fps is (10 sec: 58981.9, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 459374592. Throughput: 0: 45713.3. Samples: 460479600. Policy #0 lag: (min: 2.0, avg: 41.5, max: 98.0) [2024-03-21 01:29:30,522][03784] Avg episode reward: [(0, '0.985')] [2024-03-21 01:29:32,542][04017] Updated weights for policy 0, policy_version 14025 (0.0025) [2024-03-21 01:29:35,521][03784] Fps is (10 sec: 58982.3, 60 sec: 50790.4, 300 sec: 46208.4). Total num frames: 459669504. Throughput: 0: 46104.5. Samples: 460771500. Policy #0 lag: (min: 2.0, avg: 41.5, max: 98.0) [2024-03-21 01:29:35,522][03784] Avg episode reward: [(0, '1.531')] [2024-03-21 01:29:40,521][03784] Fps is (10 sec: 42598.9, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 459800576. Throughput: 0: 47247.0. Samples: 461083100. 
Policy #0 lag: (min: 0.0, avg: 33.6, max: 83.0) [2024-03-21 01:29:40,522][03784] Avg episode reward: [(0, '1.531')] [2024-03-21 01:29:42,747][04017] Updated weights for policy 0, policy_version 14035 (0.0019) [2024-03-21 01:29:45,521][03784] Fps is (10 sec: 22937.4, 60 sec: 46421.2, 300 sec: 45319.8). Total num frames: 459898880. Throughput: 0: 47231.1. Samples: 461221200. Policy #0 lag: (min: 0.0, avg: 33.6, max: 83.0) [2024-03-21 01:29:45,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 01:29:50,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 460062720. Throughput: 0: 47711.1. Samples: 461511600. Policy #0 lag: (min: 0.0, avg: 27.6, max: 82.0) [2024-03-21 01:29:50,522][03784] Avg episode reward: [(0, '1.159')] [2024-03-21 01:29:53,453][04017] Updated weights for policy 0, policy_version 14045 (0.0011) [2024-03-21 01:29:55,521][03784] Fps is (10 sec: 52429.9, 60 sec: 46967.7, 300 sec: 46208.5). Total num frames: 460423168. Throughput: 0: 48420.2. Samples: 461788500. Policy #0 lag: (min: 0.0, avg: 27.6, max: 82.0) [2024-03-21 01:29:55,522][03784] Avg episode reward: [(0, '0.433')] [2024-03-21 01:29:57,707][04017] Updated weights for policy 0, policy_version 14055 (0.0022) [2024-03-21 01:30:00,521][03784] Fps is (10 sec: 62258.5, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 460685312. Throughput: 0: 47613.2. Samples: 461910600. Policy #0 lag: (min: 0.0, avg: 27.6, max: 82.0) [2024-03-21 01:30:00,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 01:30:00,577][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014060_460718080.pth... [2024-03-21 01:30:00,696][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013717_449478656.pth [2024-03-21 01:30:01,984][03995] Signal inference workers to stop experience collection... (9250 times) [2024-03-21 01:30:01,985][03995] Signal inference workers to resume experience collection... 
(9250 times) [2024-03-21 01:30:02,059][04017] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-03-21 01:30:02,060][04017] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-03-21 01:30:03,468][04017] Updated weights for policy 0, policy_version 14065 (0.0012) [2024-03-21 01:30:05,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 460881920. Throughput: 0: 47653.4. Samples: 462201800. Policy #0 lag: (min: 0.0, avg: 42.0, max: 91.0) [2024-03-21 01:30:05,522][03784] Avg episode reward: [(0, '0.478')] [2024-03-21 01:30:10,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 46430.6). Total num frames: 461045760. Throughput: 0: 47826.6. Samples: 462503100. Policy #0 lag: (min: 0.0, avg: 42.0, max: 91.0) [2024-03-21 01:30:10,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-21 01:30:12,016][04017] Updated weights for policy 0, policy_version 14075 (0.0017) [2024-03-21 01:30:15,521][03784] Fps is (10 sec: 52428.1, 60 sec: 47513.5, 300 sec: 46319.5). Total num frames: 461406208. Throughput: 0: 48013.3. Samples: 462640200. Policy #0 lag: (min: 1.0, avg: 52.9, max: 109.0) [2024-03-21 01:30:15,523][03784] Avg episode reward: [(0, '0.622')] [2024-03-21 01:30:17,271][04017] Updated weights for policy 0, policy_version 14085 (0.0011) [2024-03-21 01:30:20,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 461602816. Throughput: 0: 47980.0. Samples: 462930600. Policy #0 lag: (min: 1.0, avg: 52.9, max: 109.0) [2024-03-21 01:30:20,522][03784] Avg episode reward: [(0, '0.622')] [2024-03-21 01:30:24,450][04017] Updated weights for policy 0, policy_version 14095 (0.0010) [2024-03-21 01:30:25,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 461864960. Throughput: 0: 47466.6. Samples: 463219100. 
Policy #0 lag: (min: 1.0, avg: 34.0, max: 70.0) [2024-03-21 01:30:25,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 01:30:30,197][04017] Updated weights for policy 0, policy_version 14105 (0.0010) [2024-03-21 01:30:30,522][03784] Fps is (10 sec: 58978.8, 60 sec: 46967.0, 300 sec: 46541.6). Total num frames: 462192640. Throughput: 0: 47010.6. Samples: 463336700. Policy #0 lag: (min: 1.0, avg: 34.0, max: 70.0) [2024-03-21 01:30:30,523][03784] Avg episode reward: [(0, '0.728')] [2024-03-21 01:30:35,521][03784] Fps is (10 sec: 55706.0, 60 sec: 45875.3, 300 sec: 46319.5). Total num frames: 462422016. Throughput: 0: 47237.8. Samples: 463637300. Policy #0 lag: (min: 0.0, avg: 42.1, max: 102.0) [2024-03-21 01:30:35,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 01:30:37,956][04017] Updated weights for policy 0, policy_version 14115 (0.0013) [2024-03-21 01:30:40,521][03784] Fps is (10 sec: 42601.0, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 462618624. Throughput: 0: 47044.3. Samples: 463905500. Policy #0 lag: (min: 0.0, avg: 42.1, max: 102.0) [2024-03-21 01:30:40,522][03784] Avg episode reward: [(0, '0.734')] [2024-03-21 01:30:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 48059.9, 300 sec: 46208.5). Total num frames: 462782464. Throughput: 0: 47566.8. Samples: 464051100. Policy #0 lag: (min: 0.0, avg: 42.1, max: 102.0) [2024-03-21 01:30:45,522][03784] Avg episode reward: [(0, '0.586')] [2024-03-21 01:30:46,665][04017] Updated weights for policy 0, policy_version 14125 (0.0009) [2024-03-21 01:30:50,521][03784] Fps is (10 sec: 45874.9, 60 sec: 50244.2, 300 sec: 46430.6). Total num frames: 463077376. Throughput: 0: 47048.8. Samples: 464319000. Policy #0 lag: (min: 0.0, avg: 32.5, max: 77.0) [2024-03-21 01:30:50,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 01:30:51,491][04017] Updated weights for policy 0, policy_version 14135 (0.0009) [2024-03-21 01:30:53,217][03995] Signal inference workers to stop experience collection... 
(9300 times) [2024-03-21 01:30:53,218][03995] Signal inference workers to resume experience collection... (9300 times) [2024-03-21 01:30:53,296][04017] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-03-21 01:30:53,296][04017] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-03-21 01:30:55,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49698.1, 300 sec: 47208.2). Total num frames: 463405056. Throughput: 0: 46235.7. Samples: 464583700. Policy #0 lag: (min: 0.0, avg: 32.5, max: 77.0) [2024-03-21 01:30:55,522][03784] Avg episode reward: [(0, '0.713')] [2024-03-21 01:30:57,853][04017] Updated weights for policy 0, policy_version 14145 (0.0016) [2024-03-21 01:31:00,521][03784] Fps is (10 sec: 45875.5, 60 sec: 47513.6, 300 sec: 46652.8). Total num frames: 463536128. Throughput: 0: 46306.7. Samples: 464724000. Policy #0 lag: (min: 0.0, avg: 32.5, max: 73.0) [2024-03-21 01:31:00,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 01:31:05,521][03784] Fps is (10 sec: 13107.2, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 463536128. Throughput: 0: 46124.5. Samples: 465006200. Policy #0 lag: (min: 0.0, avg: 32.5, max: 73.0) [2024-03-21 01:31:05,522][03784] Avg episode reward: [(0, '0.649')] [2024-03-21 01:31:10,521][03784] Fps is (10 sec: 22937.4, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 463765504. Throughput: 0: 45199.9. Samples: 465253100. Policy #0 lag: (min: 0.0, avg: 33.0, max: 80.0) [2024-03-21 01:31:10,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 01:31:11,525][04017] Updated weights for policy 0, policy_version 14155 (0.0019) [2024-03-21 01:31:15,521][03784] Fps is (10 sec: 49151.3, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 464027648. Throughput: 0: 45671.6. Samples: 465391900. 
Policy #0 lag: (min: 0.0, avg: 33.0, max: 80.0) [2024-03-21 01:31:15,522][03784] Avg episode reward: [(0, '1.149')] [2024-03-21 01:31:16,877][04017] Updated weights for policy 0, policy_version 14165 (0.0010) [2024-03-21 01:31:20,521][03784] Fps is (10 sec: 62259.1, 60 sec: 46421.2, 300 sec: 46763.8). Total num frames: 464388096. Throughput: 0: 44817.6. Samples: 465654100. Policy #0 lag: (min: 1.0, avg: 36.7, max: 74.0) [2024-03-21 01:31:20,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-21 01:31:22,444][04017] Updated weights for policy 0, policy_version 14175 (0.0016) [2024-03-21 01:31:25,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 464519168. Throughput: 0: 44995.5. Samples: 465930300. Policy #0 lag: (min: 1.0, avg: 36.7, max: 74.0) [2024-03-21 01:31:25,522][03784] Avg episode reward: [(0, '1.052')] [2024-03-21 01:31:30,336][04017] Updated weights for policy 0, policy_version 14185 (0.0013) [2024-03-21 01:31:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43691.0, 300 sec: 46430.6). Total num frames: 464814080. Throughput: 0: 45019.8. Samples: 466077000. Policy #0 lag: (min: 5.0, avg: 41.1, max: 87.0) [2024-03-21 01:31:30,522][03784] Avg episode reward: [(0, '1.375')] [2024-03-21 01:31:34,704][04017] Updated weights for policy 0, policy_version 14195 (0.0010) [2024-03-21 01:31:35,521][03784] Fps is (10 sec: 65536.6, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 465174528. Throughput: 0: 45033.4. Samples: 466345500. Policy #0 lag: (min: 5.0, avg: 41.1, max: 87.0) [2024-03-21 01:31:35,522][03784] Avg episode reward: [(0, '1.375')] [2024-03-21 01:31:40,521][03784] Fps is (10 sec: 62259.7, 60 sec: 46967.4, 300 sec: 46986.0). Total num frames: 465436672. Throughput: 0: 45357.7. Samples: 466624800. 
Policy #0 lag: (min: 5.0, avg: 41.1, max: 87.0) [2024-03-21 01:31:40,522][03784] Avg episode reward: [(0, '0.876')] [2024-03-21 01:31:42,155][04017] Updated weights for policy 0, policy_version 14205 (0.0016) [2024-03-21 01:31:45,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 465600512. Throughput: 0: 45157.8. Samples: 466756100. Policy #0 lag: (min: 0.0, avg: 42.8, max: 76.0) [2024-03-21 01:31:45,522][03784] Avg episode reward: [(0, '0.665')] [2024-03-21 01:31:48,233][03995] Signal inference workers to stop experience collection... (9350 times) [2024-03-21 01:31:48,306][04017] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-03-21 01:31:48,540][03995] Signal inference workers to resume experience collection... (9350 times) [2024-03-21 01:31:48,540][04017] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-03-21 01:31:49,427][04017] Updated weights for policy 0, policy_version 14215 (0.0010) [2024-03-21 01:31:50,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 465895424. Throughput: 0: 45068.8. Samples: 467034300. Policy #0 lag: (min: 0.0, avg: 42.8, max: 76.0) [2024-03-21 01:31:50,522][03784] Avg episode reward: [(0, '0.782')] [2024-03-21 01:31:53,754][04017] Updated weights for policy 0, policy_version 14225 (0.0021) [2024-03-21 01:31:55,521][03784] Fps is (10 sec: 52429.2, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 466124800. Throughput: 0: 45715.7. Samples: 467310300. Policy #0 lag: (min: 1.0, avg: 51.1, max: 95.0) [2024-03-21 01:31:55,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 01:32:00,528][03784] Fps is (10 sec: 42568.1, 60 sec: 46415.8, 300 sec: 46207.3). Total num frames: 466321408. Throughput: 0: 45392.9. Samples: 467434900. 
Policy #0 lag: (min: 1.0, avg: 51.1, max: 95.0) [2024-03-21 01:32:00,529][03784] Avg episode reward: [(0, '1.206')] [2024-03-21 01:32:00,543][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014231_466321408.pth... [2024-03-21 01:32:00,640][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000013887_455049216.pth [2024-03-21 01:32:05,035][04017] Updated weights for policy 0, policy_version 14235 (0.0016) [2024-03-21 01:32:05,521][03784] Fps is (10 sec: 32767.3, 60 sec: 48605.7, 300 sec: 46319.5). Total num frames: 466452480. Throughput: 0: 45275.5. Samples: 467691500. Policy #0 lag: (min: 0.0, avg: 34.6, max: 72.0) [2024-03-21 01:32:05,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 01:32:10,521][03784] Fps is (10 sec: 29512.2, 60 sec: 47513.7, 300 sec: 45986.3). Total num frames: 466616320. Throughput: 0: 45197.8. Samples: 467964200. Policy #0 lag: (min: 0.0, avg: 34.6, max: 72.0) [2024-03-21 01:32:10,522][03784] Avg episode reward: [(0, '0.609')] [2024-03-21 01:32:12,628][04017] Updated weights for policy 0, policy_version 14245 (0.0014) [2024-03-21 01:32:15,521][03784] Fps is (10 sec: 42599.0, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 466878464. Throughput: 0: 44740.1. Samples: 468090300. Policy #0 lag: (min: 0.0, avg: 34.6, max: 72.0) [2024-03-21 01:32:15,522][03784] Avg episode reward: [(0, '0.590')] [2024-03-21 01:32:20,521][03784] Fps is (10 sec: 36045.0, 60 sec: 43144.6, 300 sec: 45653.1). Total num frames: 466976768. Throughput: 0: 45093.3. Samples: 468374700. Policy #0 lag: (min: 0.0, avg: 31.0, max: 80.0) [2024-03-21 01:32:20,522][03784] Avg episode reward: [(0, '0.771')] [2024-03-21 01:32:22,250][04017] Updated weights for policy 0, policy_version 14255 (0.0011) [2024-03-21 01:32:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 467173376. Throughput: 0: 45129.0. Samples: 468655600. 
Policy #0 lag: (min: 0.0, avg: 31.0, max: 80.0) [2024-03-21 01:32:25,522][03784] Avg episode reward: [(0, '0.847')] [2024-03-21 01:32:30,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.6, 300 sec: 45764.1). Total num frames: 467369984. Throughput: 0: 44960.1. Samples: 468779300. Policy #0 lag: (min: 0.0, avg: 32.3, max: 114.0) [2024-03-21 01:32:30,522][03784] Avg episode reward: [(0, '1.014')] [2024-03-21 01:32:31,795][04017] Updated weights for policy 0, policy_version 14265 (0.0012) [2024-03-21 01:32:35,521][03784] Fps is (10 sec: 45874.9, 60 sec: 40959.9, 300 sec: 46208.4). Total num frames: 467632128. Throughput: 0: 44864.4. Samples: 469053200. Policy #0 lag: (min: 0.0, avg: 32.3, max: 114.0) [2024-03-21 01:32:35,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 01:32:36,996][04017] Updated weights for policy 0, policy_version 14275 (0.0011) [2024-03-21 01:32:40,521][03784] Fps is (10 sec: 58981.6, 60 sec: 42052.3, 300 sec: 46652.7). Total num frames: 467959808. Throughput: 0: 45042.1. Samples: 469337200. Policy #0 lag: (min: 2.0, avg: 56.6, max: 107.0) [2024-03-21 01:32:40,522][03784] Avg episode reward: [(0, '1.200')] [2024-03-21 01:32:42,909][04017] Updated weights for policy 0, policy_version 14285 (0.0017) [2024-03-21 01:32:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 468254720. Throughput: 0: 45340.5. Samples: 469474900. Policy #0 lag: (min: 2.0, avg: 56.6, max: 107.0) [2024-03-21 01:32:45,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 01:32:49,407][04017] Updated weights for policy 0, policy_version 14295 (0.0012) [2024-03-21 01:32:49,677][03995] Signal inference workers to stop experience collection... (9400 times) [2024-03-21 01:32:49,738][03995] Signal inference workers to resume experience collection... 
(9400 times) [2024-03-21 01:32:49,754][04017] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-03-21 01:32:49,791][04017] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-03-21 01:32:50,521][03784] Fps is (10 sec: 55705.9, 60 sec: 43690.7, 300 sec: 46319.5). Total num frames: 468516864. Throughput: 0: 46189.0. Samples: 469770000. Policy #0 lag: (min: 1.0, avg: 53.2, max: 115.0) [2024-03-21 01:32:50,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 01:32:53,255][04017] Updated weights for policy 0, policy_version 14305 (0.0011) [2024-03-21 01:32:55,521][03784] Fps is (10 sec: 72089.3, 60 sec: 47513.5, 300 sec: 47208.1). Total num frames: 468975616. Throughput: 0: 45922.2. Samples: 470030700. Policy #0 lag: (min: 1.0, avg: 53.2, max: 115.0) [2024-03-21 01:32:55,522][03784] Avg episode reward: [(0, '1.086')] [2024-03-21 01:32:57,646][04017] Updated weights for policy 0, policy_version 14315 (0.0016) [2024-03-21 01:33:00,521][03784] Fps is (10 sec: 62259.1, 60 sec: 46973.0, 300 sec: 46541.7). Total num frames: 469139456. Throughput: 0: 46284.4. Samples: 470173100. Policy #0 lag: (min: 0.0, avg: 55.2, max: 116.0) [2024-03-21 01:33:00,522][03784] Avg episode reward: [(0, '0.669')] [2024-03-21 01:33:05,521][03784] Fps is (10 sec: 29491.6, 60 sec: 46967.6, 300 sec: 45986.3). Total num frames: 469270528. Throughput: 0: 46664.5. Samples: 470474600. Policy #0 lag: (min: 0.0, avg: 55.2, max: 116.0) [2024-03-21 01:33:05,522][03784] Avg episode reward: [(0, '1.080')] [2024-03-21 01:33:08,352][04017] Updated weights for policy 0, policy_version 14325 (0.0016) [2024-03-21 01:33:10,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48605.8, 300 sec: 45764.1). Total num frames: 469532672. Throughput: 0: 46664.4. Samples: 470755500. 
Policy #0 lag: (min: 0.0, avg: 55.2, max: 116.0) [2024-03-21 01:33:10,522][03784] Avg episode reward: [(0, '1.468')] [2024-03-21 01:33:13,778][04017] Updated weights for policy 0, policy_version 14335 (0.0011) [2024-03-21 01:33:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 469794816. Throughput: 0: 47217.7. Samples: 470904100. Policy #0 lag: (min: 0.0, avg: 41.9, max: 83.0) [2024-03-21 01:33:15,522][03784] Avg episode reward: [(0, '1.176')] [2024-03-21 01:33:20,521][03784] Fps is (10 sec: 36044.9, 60 sec: 48605.8, 300 sec: 45764.1). Total num frames: 469893120. Throughput: 0: 47844.5. Samples: 471206200. Policy #0 lag: (min: 0.0, avg: 41.9, max: 83.0) [2024-03-21 01:33:20,522][03784] Avg episode reward: [(0, '1.176')] [2024-03-21 01:33:25,521][03784] Fps is (10 sec: 19660.8, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 469991424. Throughput: 0: 48273.4. Samples: 471509500. Policy #0 lag: (min: 0.0, avg: 35.3, max: 76.0) [2024-03-21 01:33:25,522][03784] Avg episode reward: [(0, '0.975')] [2024-03-21 01:33:26,529][04017] Updated weights for policy 0, policy_version 14345 (0.0010) [2024-03-21 01:33:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48605.8, 300 sec: 46319.5). Total num frames: 470286336. Throughput: 0: 48135.5. Samples: 471641000. Policy #0 lag: (min: 0.0, avg: 35.3, max: 76.0) [2024-03-21 01:33:30,522][03784] Avg episode reward: [(0, '0.434')] [2024-03-21 01:33:32,051][04017] Updated weights for policy 0, policy_version 14355 (0.0010) [2024-03-21 01:33:35,521][03784] Fps is (10 sec: 62259.3, 60 sec: 49698.2, 300 sec: 46541.7). Total num frames: 470614016. Throughput: 0: 47646.7. Samples: 471914100. Policy #0 lag: (min: 2.0, avg: 32.1, max: 73.0) [2024-03-21 01:33:35,522][03784] Avg episode reward: [(0, '1.027')] [2024-03-21 01:33:36,195][04017] Updated weights for policy 0, policy_version 14365 (0.0011) [2024-03-21 01:33:40,521][03784] Fps is (10 sec: 58982.8, 60 sec: 48605.9, 300 sec: 46652.7). 
Total num frames: 470876160. Throughput: 0: 48066.8. Samples: 472193700. Policy #0 lag: (min: 2.0, avg: 32.1, max: 73.0) [2024-03-21 01:33:40,522][03784] Avg episode reward: [(0, '0.442')] [2024-03-21 01:33:42,924][03995] Signal inference workers to stop experience collection... (9450 times) [2024-03-21 01:33:42,965][04017] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-03-21 01:33:43,143][03995] Signal inference workers to resume experience collection... (9450 times) [2024-03-21 01:33:43,144][04017] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-03-21 01:33:43,702][04017] Updated weights for policy 0, policy_version 14375 (0.0019) [2024-03-21 01:33:45,521][03784] Fps is (10 sec: 45874.7, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 471072768. Throughput: 0: 48162.2. Samples: 472340400. Policy #0 lag: (min: 2.0, avg: 32.1, max: 73.0) [2024-03-21 01:33:45,522][03784] Avg episode reward: [(0, '0.779')] [2024-03-21 01:33:50,431][04017] Updated weights for policy 0, policy_version 14385 (0.0011) [2024-03-21 01:33:50,521][03784] Fps is (10 sec: 49152.1, 60 sec: 47513.6, 300 sec: 46652.8). Total num frames: 471367680. Throughput: 0: 47880.0. Samples: 472629200. Policy #0 lag: (min: 0.0, avg: 37.9, max: 83.0) [2024-03-21 01:33:50,522][03784] Avg episode reward: [(0, '1.414')] [2024-03-21 01:33:55,521][03784] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 46319.5). Total num frames: 471531520. Throughput: 0: 47911.2. Samples: 472911500. Policy #0 lag: (min: 0.0, avg: 37.9, max: 83.0) [2024-03-21 01:33:55,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 01:33:57,420][04017] Updated weights for policy 0, policy_version 14395 (0.0015) [2024-03-21 01:34:00,521][03784] Fps is (10 sec: 52428.4, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 471891968. Throughput: 0: 47579.9. Samples: 473045200. 
Policy #0 lag: (min: 1.0, avg: 35.2, max: 69.0) [2024-03-21 01:34:00,522][03784] Avg episode reward: [(0, '0.731')] [2024-03-21 01:34:00,542][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014401_471891968.pth... [2024-03-21 01:34:00,656][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014060_460718080.pth [2024-03-21 01:34:03,826][04017] Updated weights for policy 0, policy_version 14405 (0.0017) [2024-03-21 01:34:05,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.4, 300 sec: 46430.6). Total num frames: 472088576. Throughput: 0: 47255.6. Samples: 473332700. Policy #0 lag: (min: 1.0, avg: 35.2, max: 69.0) [2024-03-21 01:34:05,522][03784] Avg episode reward: [(0, '0.585')] [2024-03-21 01:34:09,417][04017] Updated weights for policy 0, policy_version 14415 (0.0015) [2024-03-21 01:34:10,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 472350720. Throughput: 0: 47244.4. Samples: 473635500. Policy #0 lag: (min: 1.0, avg: 39.3, max: 107.0) [2024-03-21 01:34:10,531][03784] Avg episode reward: [(0, '0.640')] [2024-03-21 01:34:15,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 472547328. Throughput: 0: 47806.7. Samples: 473792300. Policy #0 lag: (min: 1.0, avg: 39.3, max: 107.0) [2024-03-21 01:34:15,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-21 01:34:16,966][04017] Updated weights for policy 0, policy_version 14425 (0.0016) [2024-03-21 01:34:20,521][03784] Fps is (10 sec: 52428.2, 60 sec: 49698.0, 300 sec: 46763.8). Total num frames: 472875008. Throughput: 0: 47344.2. Samples: 474044600. Policy #0 lag: (min: 2.0, avg: 50.7, max: 96.0) [2024-03-21 01:34:20,522][03784] Avg episode reward: [(0, '0.653')] [2024-03-21 01:34:23,656][04017] Updated weights for policy 0, policy_version 14435 (0.0019) [2024-03-21 01:34:25,521][03784] Fps is (10 sec: 58982.6, 60 sec: 52428.8, 300 sec: 46652.8). 
Total num frames: 473137152. Throughput: 0: 46935.6. Samples: 474305800. Policy #0 lag: (min: 2.0, avg: 50.7, max: 96.0) [2024-03-21 01:34:25,522][03784] Avg episode reward: [(0, '0.710')] [2024-03-21 01:34:30,521][03784] Fps is (10 sec: 29491.5, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 473169920. Throughput: 0: 47151.1. Samples: 474462200. Policy #0 lag: (min: 0.0, avg: 25.3, max: 81.0) [2024-03-21 01:34:30,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 01:34:31,660][03995] Signal inference workers to stop experience collection... (9500 times) [2024-03-21 01:34:31,686][04017] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-03-21 01:34:31,887][03995] Signal inference workers to resume experience collection... (9500 times) [2024-03-21 01:34:31,888][04017] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-03-21 01:34:32,199][04017] Updated weights for policy 0, policy_version 14445 (0.0011) [2024-03-21 01:34:35,521][03784] Fps is (10 sec: 39321.2, 60 sec: 48605.8, 300 sec: 46541.6). Total num frames: 473530368. Throughput: 0: 46899.9. Samples: 474739700. Policy #0 lag: (min: 0.0, avg: 25.3, max: 81.0) [2024-03-21 01:34:35,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 01:34:38,973][04017] Updated weights for policy 0, policy_version 14455 (0.0012) [2024-03-21 01:34:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 473694208. Throughput: 0: 46931.0. Samples: 475023400. Policy #0 lag: (min: 0.0, avg: 25.3, max: 81.0) [2024-03-21 01:34:40,522][03784] Avg episode reward: [(0, '0.671')] [2024-03-21 01:34:45,521][03784] Fps is (10 sec: 29491.7, 60 sec: 45875.3, 300 sec: 46652.8). Total num frames: 473825280. Throughput: 0: 47062.4. Samples: 475163000. 
Policy #0 lag: (min: 0.0, avg: 28.1, max: 85.0) [2024-03-21 01:34:45,522][03784] Avg episode reward: [(0, '0.767')] [2024-03-21 01:34:49,014][04017] Updated weights for policy 0, policy_version 14465 (0.0010) [2024-03-21 01:34:50,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 474120192. Throughput: 0: 47353.4. Samples: 475463600. Policy #0 lag: (min: 0.0, avg: 28.1, max: 85.0) [2024-03-21 01:34:50,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 01:34:53,450][04017] Updated weights for policy 0, policy_version 14475 (0.0021) [2024-03-21 01:34:55,521][03784] Fps is (10 sec: 62258.1, 60 sec: 48605.7, 300 sec: 46652.7). Total num frames: 474447872. Throughput: 0: 46028.8. Samples: 475706800. Policy #0 lag: (min: 1.0, avg: 35.2, max: 92.0) [2024-03-21 01:34:55,522][03784] Avg episode reward: [(0, '1.226')] [2024-03-21 01:34:58,627][04017] Updated weights for policy 0, policy_version 14485 (0.0015) [2024-03-21 01:35:00,521][03784] Fps is (10 sec: 58981.8, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 474710016. Throughput: 0: 45777.8. Samples: 475852300. Policy #0 lag: (min: 1.0, avg: 35.2, max: 92.0) [2024-03-21 01:35:00,522][03784] Avg episode reward: [(0, '1.315')] [2024-03-21 01:35:04,533][04017] Updated weights for policy 0, policy_version 14495 (0.0011) [2024-03-21 01:35:05,521][03784] Fps is (10 sec: 55705.8, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 475004928. Throughput: 0: 46562.3. Samples: 476139900. Policy #0 lag: (min: 0.0, avg: 34.1, max: 66.0) [2024-03-21 01:35:05,522][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 01:35:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 475136000. Throughput: 0: 47451.1. Samples: 476441100. 
Policy #0 lag: (min: 0.0, avg: 34.1, max: 66.0) [2024-03-21 01:35:10,522][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 01:35:13,648][04017] Updated weights for policy 0, policy_version 14505 (0.0011) [2024-03-21 01:35:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 475365376. Throughput: 0: 46724.5. Samples: 476564800. Policy #0 lag: (min: 2.0, avg: 33.6, max: 88.0) [2024-03-21 01:35:15,522][03784] Avg episode reward: [(0, '0.726')] [2024-03-21 01:35:20,521][03784] Fps is (10 sec: 36044.5, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 475496448. Throughput: 0: 46293.4. Samples: 476822900. Policy #0 lag: (min: 2.0, avg: 33.6, max: 88.0) [2024-03-21 01:35:20,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 01:35:21,824][04017] Updated weights for policy 0, policy_version 14515 (0.0013) [2024-03-21 01:35:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 44782.8, 300 sec: 46208.5). Total num frames: 475824128. Throughput: 0: 45948.8. Samples: 477091100. Policy #0 lag: (min: 2.0, avg: 33.6, max: 88.0) [2024-03-21 01:35:25,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 01:35:27,883][04017] Updated weights for policy 0, policy_version 14525 (0.0030) [2024-03-21 01:35:27,934][03995] Signal inference workers to stop experience collection... (9550 times) [2024-03-21 01:35:28,006][04017] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-03-21 01:35:28,169][03995] Signal inference workers to resume experience collection... (9550 times) [2024-03-21 01:35:28,169][04017] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-03-21 01:35:30,521][03784] Fps is (10 sec: 65536.3, 60 sec: 49698.2, 300 sec: 46541.7). Total num frames: 476151808. Throughput: 0: 45937.7. Samples: 477230200. 
Policy #0 lag: (min: 0.0, avg: 47.1, max: 94.0) [2024-03-21 01:35:30,522][03784] Avg episode reward: [(0, '1.263')] [2024-03-21 01:35:35,043][04017] Updated weights for policy 0, policy_version 14535 (0.0020) [2024-03-21 01:35:35,521][03784] Fps is (10 sec: 45875.8, 60 sec: 45875.3, 300 sec: 46319.5). Total num frames: 476282880. Throughput: 0: 45391.1. Samples: 477506200. Policy #0 lag: (min: 0.0, avg: 47.1, max: 94.0) [2024-03-21 01:35:35,522][03784] Avg episode reward: [(0, '1.402')] [2024-03-21 01:35:40,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 476446720. Throughput: 0: 45680.1. Samples: 477762400. Policy #0 lag: (min: 0.0, avg: 37.0, max: 79.0) [2024-03-21 01:35:40,522][03784] Avg episode reward: [(0, '0.503')] [2024-03-21 01:35:43,573][04017] Updated weights for policy 0, policy_version 14545 (0.0020) [2024-03-21 01:35:45,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 476643328. Throughput: 0: 45491.1. Samples: 477899400. Policy #0 lag: (min: 0.0, avg: 37.0, max: 79.0) [2024-03-21 01:35:45,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 01:35:50,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 476872704. Throughput: 0: 45111.1. Samples: 478169900. Policy #0 lag: (min: 0.0, avg: 31.4, max: 69.0) [2024-03-21 01:35:50,522][03784] Avg episode reward: [(0, '0.886')] [2024-03-21 01:35:51,470][04017] Updated weights for policy 0, policy_version 14555 (0.0011) [2024-03-21 01:35:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 477167616. Throughput: 0: 44602.2. Samples: 478448200. Policy #0 lag: (min: 0.0, avg: 31.4, max: 69.0) [2024-03-21 01:35:55,522][03784] Avg episode reward: [(0, '1.034')] [2024-03-21 01:35:59,631][04017] Updated weights for policy 0, policy_version 14565 (0.0010) [2024-03-21 01:36:00,521][03784] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 46541.6). 
Total num frames: 477265920. Throughput: 0: 44973.3. Samples: 478588600. Policy #0 lag: (min: 0.0, avg: 36.7, max: 69.0) [2024-03-21 01:36:00,522][03784] Avg episode reward: [(0, '0.655')] [2024-03-21 01:36:00,532][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014565_477265920.pth... [2024-03-21 01:36:00,686][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014231_466321408.pth [2024-03-21 01:36:05,521][03784] Fps is (10 sec: 26214.4, 60 sec: 40413.9, 300 sec: 46319.5). Total num frames: 477429760. Throughput: 0: 45877.8. Samples: 478887400. Policy #0 lag: (min: 0.0, avg: 36.7, max: 69.0) [2024-03-21 01:36:05,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 01:36:08,352][04017] Updated weights for policy 0, policy_version 14575 (0.0019) [2024-03-21 01:36:10,521][03784] Fps is (10 sec: 36045.2, 60 sec: 41506.1, 300 sec: 46097.4). Total num frames: 477626368. Throughput: 0: 46402.4. Samples: 479179200. Policy #0 lag: (min: 0.0, avg: 36.7, max: 69.0) [2024-03-21 01:36:10,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 01:36:14,847][04017] Updated weights for policy 0, policy_version 14585 (0.0014) [2024-03-21 01:36:15,521][03784] Fps is (10 sec: 52429.6, 60 sec: 43144.6, 300 sec: 45986.3). Total num frames: 477954048. Throughput: 0: 45997.9. Samples: 479300100. Policy #0 lag: (min: 0.0, avg: 42.6, max: 98.0) [2024-03-21 01:36:15,522][03784] Avg episode reward: [(0, '1.000')] [2024-03-21 01:36:18,200][04017] Updated weights for policy 0, policy_version 14595 (0.0016) [2024-03-21 01:36:20,487][03995] Signal inference workers to stop experience collection... (9600 times) [2024-03-21 01:36:20,521][03784] Fps is (10 sec: 85197.4, 60 sec: 49698.2, 300 sec: 47319.2). Total num frames: 478478336. Throughput: 0: 45346.7. Samples: 479546800. 
Policy #0 lag: (min: 0.0, avg: 42.6, max: 98.0) [2024-03-21 01:36:20,522][03784] Avg episode reward: [(0, '0.447')] [2024-03-21 01:36:20,561][04017] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-03-21 01:36:20,751][03995] Signal inference workers to resume experience collection... (9600 times) [2024-03-21 01:36:20,752][04017] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-03-21 01:36:22,662][04017] Updated weights for policy 0, policy_version 14605 (0.0010) [2024-03-21 01:36:25,521][03784] Fps is (10 sec: 75365.3, 60 sec: 48059.8, 300 sec: 47097.1). Total num frames: 478707712. Throughput: 0: 45588.9. Samples: 479813900. Policy #0 lag: (min: 0.0, avg: 55.0, max: 117.0) [2024-03-21 01:36:25,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 01:36:27,704][04017] Updated weights for policy 0, policy_version 14615 (0.0011) [2024-03-21 01:36:30,521][03784] Fps is (10 sec: 49151.3, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 478969856. Throughput: 0: 45682.2. Samples: 479955100. Policy #0 lag: (min: 0.0, avg: 55.0, max: 117.0) [2024-03-21 01:36:30,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 01:36:35,521][03784] Fps is (10 sec: 32767.5, 60 sec: 45875.0, 300 sec: 46097.3). Total num frames: 479035392. Throughput: 0: 46486.5. Samples: 480261800. Policy #0 lag: (min: 0.0, avg: 36.4, max: 78.0) [2024-03-21 01:36:35,522][03784] Avg episode reward: [(0, '1.383')] [2024-03-21 01:36:40,521][03784] Fps is (10 sec: 22937.5, 60 sec: 45875.1, 300 sec: 46097.3). Total num frames: 479199232. Throughput: 0: 46724.4. Samples: 480550800. Policy #0 lag: (min: 0.0, avg: 36.4, max: 78.0) [2024-03-21 01:36:40,522][03784] Avg episode reward: [(0, '0.804')] [2024-03-21 01:36:40,688][04017] Updated weights for policy 0, policy_version 14625 (0.0012) [2024-03-21 01:36:45,521][03784] Fps is (10 sec: 45875.9, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 479494144. Throughput: 0: 46335.6. 
Samples: 480673700. Policy #0 lag: (min: 0.0, avg: 36.4, max: 78.0) [2024-03-21 01:36:45,522][03784] Avg episode reward: [(0, '1.295')] [2024-03-21 01:36:46,119][04017] Updated weights for policy 0, policy_version 14635 (0.0012) [2024-03-21 01:36:50,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 479625216. Throughput: 0: 45924.5. Samples: 480954000. Policy #0 lag: (min: 1.0, avg: 49.7, max: 98.0) [2024-03-21 01:36:50,522][03784] Avg episode reward: [(0, '0.862')] [2024-03-21 01:36:55,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.6, 300 sec: 45654.1). Total num frames: 479789056. Throughput: 0: 45528.8. Samples: 481228000. Policy #0 lag: (min: 1.0, avg: 49.7, max: 98.0) [2024-03-21 01:36:55,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 01:36:57,216][04017] Updated weights for policy 0, policy_version 14645 (0.0011) [2024-03-21 01:37:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 479985664. Throughput: 0: 45844.3. Samples: 481363100. Policy #0 lag: (min: 0.0, avg: 31.7, max: 79.0) [2024-03-21 01:37:00,522][03784] Avg episode reward: [(0, '1.255')] [2024-03-21 01:37:05,089][04017] Updated weights for policy 0, policy_version 14655 (0.0015) [2024-03-21 01:37:05,521][03784] Fps is (10 sec: 45875.8, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 480247808. Throughput: 0: 46211.1. Samples: 481626300. Policy #0 lag: (min: 0.0, avg: 31.7, max: 79.0) [2024-03-21 01:37:05,522][03784] Avg episode reward: [(0, '1.152')] [2024-03-21 01:37:10,521][03784] Fps is (10 sec: 52429.0, 60 sec: 48059.7, 300 sec: 46208.4). Total num frames: 480509952. Throughput: 0: 46604.4. Samples: 481911100. 
Policy #0 lag: (min: 0.0, avg: 51.1, max: 114.0) [2024-03-21 01:37:10,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 01:37:10,564][04017] Updated weights for policy 0, policy_version 14665 (0.0015) [2024-03-21 01:37:15,521][03784] Fps is (10 sec: 52428.6, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 480772096. Throughput: 0: 46357.9. Samples: 482041200. Policy #0 lag: (min: 0.0, avg: 51.1, max: 114.0) [2024-03-21 01:37:15,522][03784] Avg episode reward: [(0, '0.819')] [2024-03-21 01:37:18,136][04017] Updated weights for policy 0, policy_version 14675 (0.0012) [2024-03-21 01:37:20,521][03784] Fps is (10 sec: 45875.6, 60 sec: 41506.1, 300 sec: 46763.8). Total num frames: 480968704. Throughput: 0: 46009.1. Samples: 482332200. Policy #0 lag: (min: 2.0, avg: 39.8, max: 77.0) [2024-03-21 01:37:20,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 01:37:25,521][03784] Fps is (10 sec: 36044.4, 60 sec: 40413.8, 300 sec: 46652.7). Total num frames: 481132544. Throughput: 0: 45957.8. Samples: 482618900. Policy #0 lag: (min: 2.0, avg: 39.8, max: 77.0) [2024-03-21 01:37:25,522][03784] Avg episode reward: [(0, '1.115')] [2024-03-21 01:37:25,719][03995] Signal inference workers to stop experience collection... (9650 times) [2024-03-21 01:37:25,779][03995] Signal inference workers to resume experience collection... (9650 times) [2024-03-21 01:37:25,784][04017] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-03-21 01:37:25,832][04017] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-03-21 01:37:26,134][04017] Updated weights for policy 0, policy_version 14685 (0.0021) [2024-03-21 01:37:29,383][04017] Updated weights for policy 0, policy_version 14695 (0.0011) [2024-03-21 01:37:30,521][03784] Fps is (10 sec: 65535.6, 60 sec: 44236.8, 300 sec: 47430.3). Total num frames: 481624064. Throughput: 0: 46111.1. Samples: 482748700. 
Policy #0 lag: (min: 5.0, avg: 41.3, max: 71.0) [2024-03-21 01:37:30,522][03784] Avg episode reward: [(0, '0.526')] [2024-03-21 01:37:35,521][03784] Fps is (10 sec: 68814.0, 60 sec: 46421.5, 300 sec: 46986.0). Total num frames: 481820672. Throughput: 0: 46173.4. Samples: 483031800. Policy #0 lag: (min: 5.0, avg: 41.3, max: 71.0) [2024-03-21 01:37:35,521][03784] Avg episode reward: [(0, '0.638')] [2024-03-21 01:37:35,607][04017] Updated weights for policy 0, policy_version 14705 (0.0018) [2024-03-21 01:37:39,765][04017] Updated weights for policy 0, policy_version 14715 (0.0015) [2024-03-21 01:37:40,521][03784] Fps is (10 sec: 55704.9, 60 sec: 49698.1, 300 sec: 47208.1). Total num frames: 482181120. Throughput: 0: 46259.9. Samples: 483309700. Policy #0 lag: (min: 5.0, avg: 41.3, max: 71.0) [2024-03-21 01:37:40,522][03784] Avg episode reward: [(0, '1.077')] [2024-03-21 01:37:45,521][03784] Fps is (10 sec: 62258.7, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 482443264. Throughput: 0: 46360.1. Samples: 483449300. Policy #0 lag: (min: 1.0, avg: 46.6, max: 104.0) [2024-03-21 01:37:45,522][03784] Avg episode reward: [(0, '1.077')] [2024-03-21 01:37:48,055][04017] Updated weights for policy 0, policy_version 14725 (0.0015) [2024-03-21 01:37:50,521][03784] Fps is (10 sec: 36045.5, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 482541568. Throughput: 0: 46926.6. Samples: 483738000. Policy #0 lag: (min: 1.0, avg: 46.6, max: 104.0) [2024-03-21 01:37:50,522][03784] Avg episode reward: [(0, '0.388')] [2024-03-21 01:37:55,521][03784] Fps is (10 sec: 19660.9, 60 sec: 47513.7, 300 sec: 45764.1). Total num frames: 482639872. Throughput: 0: 47646.8. Samples: 484055200. Policy #0 lag: (min: 0.0, avg: 47.7, max: 94.0) [2024-03-21 01:37:55,522][03784] Avg episode reward: [(0, '0.388')] [2024-03-21 01:37:58,072][04017] Updated weights for policy 0, policy_version 14735 (0.0016) [2024-03-21 01:38:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 49152.1, 300 sec: 46319.5). 
Total num frames: 482934784. Throughput: 0: 47337.7. Samples: 484171400. Policy #0 lag: (min: 0.0, avg: 47.7, max: 94.0) [2024-03-21 01:38:00,522][03784] Avg episode reward: [(0, '0.862')] [2024-03-21 01:38:00,768][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014739_482967552.pth... [2024-03-21 01:38:00,875][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014401_471891968.pth [2024-03-21 01:38:05,079][04017] Updated weights for policy 0, policy_version 14745 (0.0011) [2024-03-21 01:38:05,521][03784] Fps is (10 sec: 55705.8, 60 sec: 49152.0, 300 sec: 46319.5). Total num frames: 483196928. Throughput: 0: 47291.1. Samples: 484460300. Policy #0 lag: (min: 0.0, avg: 36.7, max: 83.0) [2024-03-21 01:38:05,522][03784] Avg episode reward: [(0, '0.862')] [2024-03-21 01:38:08,835][03995] Signal inference workers to stop experience collection... (9700 times) [2024-03-21 01:38:08,904][03995] Signal inference workers to resume experience collection... (9700 times) [2024-03-21 01:38:08,920][04017] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-03-21 01:38:08,966][04017] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-03-21 01:38:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 483360768. Throughput: 0: 47129.0. Samples: 484739700. Policy #0 lag: (min: 0.0, avg: 36.7, max: 83.0) [2024-03-21 01:38:10,522][03784] Avg episode reward: [(0, '1.132')] [2024-03-21 01:38:11,898][04017] Updated weights for policy 0, policy_version 14755 (0.0013) [2024-03-21 01:38:15,521][03784] Fps is (10 sec: 42597.9, 60 sec: 47513.5, 300 sec: 46541.7). Total num frames: 483622912. Throughput: 0: 46844.4. Samples: 484856700. 
Policy #0 lag: (min: 2.0, avg: 26.4, max: 53.0) [2024-03-21 01:38:15,522][03784] Avg episode reward: [(0, '1.052')] [2024-03-21 01:38:20,076][04017] Updated weights for policy 0, policy_version 14765 (0.0022) [2024-03-21 01:38:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 483819520. Throughput: 0: 46799.9. Samples: 485137800. Policy #0 lag: (min: 2.0, avg: 26.4, max: 53.0) [2024-03-21 01:38:20,522][03784] Avg episode reward: [(0, '0.958')] [2024-03-21 01:38:25,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 46763.8). Total num frames: 484081664. Throughput: 0: 47242.4. Samples: 485435600. Policy #0 lag: (min: 1.0, avg: 34.0, max: 65.0) [2024-03-21 01:38:25,522][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 01:38:27,868][04017] Updated weights for policy 0, policy_version 14775 (0.0011) [2024-03-21 01:38:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 46208.4). Total num frames: 484245504. Throughput: 0: 47560.0. Samples: 485589500. Policy #0 lag: (min: 1.0, avg: 34.0, max: 65.0) [2024-03-21 01:38:30,522][03784] Avg episode reward: [(0, '1.115')] [2024-03-21 01:38:34,676][04017] Updated weights for policy 0, policy_version 14785 (0.0016) [2024-03-21 01:38:35,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 484540416. Throughput: 0: 47280.0. Samples: 485865600. Policy #0 lag: (min: 1.0, avg: 34.0, max: 65.0) [2024-03-21 01:38:35,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 01:38:38,430][04017] Updated weights for policy 0, policy_version 14795 (0.0017) [2024-03-21 01:38:40,521][03784] Fps is (10 sec: 72089.3, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 484966400. Throughput: 0: 45604.3. Samples: 486107400. 
Policy #0 lag: (min: 0.0, avg: 37.1, max: 89.0) [2024-03-21 01:38:40,522][03784] Avg episode reward: [(0, '1.189')] [2024-03-21 01:38:44,777][04017] Updated weights for policy 0, policy_version 14805 (0.0010) [2024-03-21 01:38:45,521][03784] Fps is (10 sec: 65535.3, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 485195776. Throughput: 0: 46317.7. Samples: 486255700. Policy #0 lag: (min: 0.0, avg: 37.1, max: 89.0) [2024-03-21 01:38:45,522][03784] Avg episode reward: [(0, '1.189')] [2024-03-21 01:38:50,521][03784] Fps is (10 sec: 42598.8, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 485392384. Throughput: 0: 46533.3. Samples: 486554300. Policy #0 lag: (min: 0.0, avg: 50.5, max: 125.0) [2024-03-21 01:38:50,522][03784] Avg episode reward: [(0, '0.507')] [2024-03-21 01:38:52,929][04017] Updated weights for policy 0, policy_version 14815 (0.0015) [2024-03-21 01:38:55,521][03784] Fps is (10 sec: 39322.0, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 485588992. Throughput: 0: 46497.8. Samples: 486832100. Policy #0 lag: (min: 0.0, avg: 50.5, max: 125.0) [2024-03-21 01:38:55,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 01:38:58,419][04017] Updated weights for policy 0, policy_version 14825 (0.0011) [2024-03-21 01:39:00,521][03784] Fps is (10 sec: 42598.2, 60 sec: 48059.7, 300 sec: 46541.7). Total num frames: 485818368. Throughput: 0: 46575.6. Samples: 486952600. Policy #0 lag: (min: 3.0, avg: 50.8, max: 110.0) [2024-03-21 01:39:00,522][03784] Avg episode reward: [(0, '0.634')] [2024-03-21 01:39:02,009][03995] Signal inference workers to stop experience collection... (9750 times) [2024-03-21 01:39:02,111][04017] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-03-21 01:39:02,255][03995] Signal inference workers to resume experience collection... 
(9750 times) [2024-03-21 01:39:02,255][04017] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-03-21 01:39:05,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 46541.7). Total num frames: 486080512. Throughput: 0: 47035.6. Samples: 487254400. Policy #0 lag: (min: 3.0, avg: 50.8, max: 110.0) [2024-03-21 01:39:05,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 01:39:06,140][04017] Updated weights for policy 0, policy_version 14835 (0.0011) [2024-03-21 01:39:10,521][03784] Fps is (10 sec: 49152.3, 60 sec: 49152.0, 300 sec: 46652.8). Total num frames: 486309888. Throughput: 0: 46173.3. Samples: 487513400. Policy #0 lag: (min: 0.0, avg: 39.5, max: 87.0) [2024-03-21 01:39:10,522][03784] Avg episode reward: [(0, '0.648')] [2024-03-21 01:39:15,352][04017] Updated weights for policy 0, policy_version 14845 (0.0016) [2024-03-21 01:39:15,521][03784] Fps is (10 sec: 36044.4, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 486440960. Throughput: 0: 46211.1. Samples: 487669000. Policy #0 lag: (min: 0.0, avg: 39.5, max: 87.0) [2024-03-21 01:39:15,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 01:39:20,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 486637568. Throughput: 0: 46542.1. Samples: 487960000. Policy #0 lag: (min: 1.0, avg: 27.8, max: 61.0) [2024-03-21 01:39:20,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 01:39:24,837][04017] Updated weights for policy 0, policy_version 14855 (0.0012) [2024-03-21 01:39:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 486834176. Throughput: 0: 47151.2. Samples: 488229200. Policy #0 lag: (min: 1.0, avg: 27.8, max: 61.0) [2024-03-21 01:39:25,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 01:39:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 487063552. Throughput: 0: 46751.1. Samples: 488359500. 
Policy #0 lag: (min: 1.0, avg: 27.8, max: 61.0) [2024-03-21 01:39:30,522][03784] Avg episode reward: [(0, '0.826')] [2024-03-21 01:39:33,064][04017] Updated weights for policy 0, policy_version 14865 (0.0015) [2024-03-21 01:39:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 487260160. Throughput: 0: 46444.4. Samples: 488644300. Policy #0 lag: (min: 0.0, avg: 32.8, max: 81.0) [2024-03-21 01:39:35,522][03784] Avg episode reward: [(0, '1.236')] [2024-03-21 01:39:37,051][04017] Updated weights for policy 0, policy_version 14875 (0.0011) [2024-03-21 01:39:40,521][03784] Fps is (10 sec: 49152.5, 60 sec: 43144.6, 300 sec: 46541.7). Total num frames: 487555072. Throughput: 0: 46562.2. Samples: 488927400. Policy #0 lag: (min: 0.0, avg: 32.8, max: 81.0) [2024-03-21 01:39:40,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 01:39:43,279][04017] Updated weights for policy 0, policy_version 14885 (0.0011) [2024-03-21 01:39:45,521][03784] Fps is (10 sec: 68812.7, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 487948288. Throughput: 0: 47017.8. Samples: 489068400. Policy #0 lag: (min: 3.0, avg: 49.8, max: 113.0) [2024-03-21 01:39:45,522][03784] Avg episode reward: [(0, '1.107')] [2024-03-21 01:39:47,132][04017] Updated weights for policy 0, policy_version 14895 (0.0012) [2024-03-21 01:39:50,521][03784] Fps is (10 sec: 62259.1, 60 sec: 46421.3, 300 sec: 46541.7). Total num frames: 488177664. Throughput: 0: 46444.4. Samples: 489344400. Policy #0 lag: (min: 3.0, avg: 49.8, max: 113.0) [2024-03-21 01:39:50,522][03784] Avg episode reward: [(0, '1.107')] [2024-03-21 01:39:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 488374272. Throughput: 0: 46860.0. Samples: 489622100. 
Policy #0 lag: (min: 3.0, avg: 49.8, max: 113.0) [2024-03-21 01:39:55,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 01:39:56,620][04017] Updated weights for policy 0, policy_version 14905 (0.0015) [2024-03-21 01:40:00,521][03784] Fps is (10 sec: 32767.4, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 488505344. Throughput: 0: 46582.1. Samples: 489765200. Policy #0 lag: (min: 0.0, avg: 46.8, max: 97.0) [2024-03-21 01:40:00,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 01:40:00,771][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014909_488538112.pth... [2024-03-21 01:40:00,878][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014565_477265920.pth [2024-03-21 01:40:01,535][03995] Signal inference workers to stop experience collection... (9800 times) [2024-03-21 01:40:01,536][03995] Signal inference workers to resume experience collection... (9800 times) [2024-03-21 01:40:01,593][04017] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-03-21 01:40:01,593][04017] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-03-21 01:40:03,158][04017] Updated weights for policy 0, policy_version 14915 (0.0012) [2024-03-21 01:40:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.1, 300 sec: 46430.6). Total num frames: 488833024. Throughput: 0: 46495.6. Samples: 490052300. Policy #0 lag: (min: 0.0, avg: 46.8, max: 97.0) [2024-03-21 01:40:05,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 01:40:09,451][04017] Updated weights for policy 0, policy_version 14925 (0.0011) [2024-03-21 01:40:10,521][03784] Fps is (10 sec: 58983.7, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 489095168. Throughput: 0: 46217.9. Samples: 490309000. 
Policy #0 lag: (min: 2.0, avg: 37.3, max: 69.0) [2024-03-21 01:40:10,522][03784] Avg episode reward: [(0, '0.943')] [2024-03-21 01:40:15,521][03784] Fps is (10 sec: 45875.6, 60 sec: 47513.7, 300 sec: 46763.8). Total num frames: 489291776. Throughput: 0: 46273.4. Samples: 490441800. Policy #0 lag: (min: 2.0, avg: 37.3, max: 69.0) [2024-03-21 01:40:15,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-21 01:40:19,104][04017] Updated weights for policy 0, policy_version 14935 (0.0010) [2024-03-21 01:40:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.7, 300 sec: 46319.5). Total num frames: 489488384. Throughput: 0: 46100.1. Samples: 490718800. Policy #0 lag: (min: 0.0, avg: 31.2, max: 72.0) [2024-03-21 01:40:20,522][03784] Avg episode reward: [(0, '0.947')] [2024-03-21 01:40:24,044][04017] Updated weights for policy 0, policy_version 14945 (0.0020) [2024-03-21 01:40:25,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 46319.5). Total num frames: 489816064. Throughput: 0: 45377.8. Samples: 490969400. Policy #0 lag: (min: 0.0, avg: 31.2, max: 72.0) [2024-03-21 01:40:25,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 01:40:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 489881600. Throughput: 0: 45446.7. Samples: 491113500. Policy #0 lag: (min: 0.0, avg: 31.2, max: 72.0) [2024-03-21 01:40:30,522][03784] Avg episode reward: [(0, '0.701')] [2024-03-21 01:40:33,976][04017] Updated weights for policy 0, policy_version 14955 (0.0011) [2024-03-21 01:40:35,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 490078208. Throughput: 0: 45764.5. Samples: 491403800. Policy #0 lag: (min: 0.0, avg: 30.0, max: 71.0) [2024-03-21 01:40:35,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-21 01:40:40,521][03784] Fps is (10 sec: 36044.4, 60 sec: 44782.8, 300 sec: 46097.4). Total num frames: 490242048. Throughput: 0: 46073.3. Samples: 491695400. 
Policy #0 lag: (min: 0.0, avg: 30.0, max: 71.0) [2024-03-21 01:40:40,522][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 01:40:41,902][04017] Updated weights for policy 0, policy_version 14965 (0.0011) [2024-03-21 01:40:45,521][03784] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 46319.5). Total num frames: 490536960. Throughput: 0: 45797.8. Samples: 491826100. Policy #0 lag: (min: 1.0, avg: 46.4, max: 111.0) [2024-03-21 01:40:45,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 01:40:47,473][04017] Updated weights for policy 0, policy_version 14975 (0.0011) [2024-03-21 01:40:50,521][03784] Fps is (10 sec: 49152.2, 60 sec: 42598.4, 300 sec: 45986.3). Total num frames: 490733568. Throughput: 0: 45866.7. Samples: 492116300. Policy #0 lag: (min: 1.0, avg: 46.4, max: 111.0) [2024-03-21 01:40:50,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 01:40:55,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 46319.5). Total num frames: 490930176. Throughput: 0: 46275.4. Samples: 492391400. Policy #0 lag: (min: 0.0, avg: 50.7, max: 113.0) [2024-03-21 01:40:55,522][03784] Avg episode reward: [(0, '0.711')] [2024-03-21 01:40:56,556][04017] Updated weights for policy 0, policy_version 14985 (0.0013) [2024-03-21 01:40:56,921][03995] Signal inference workers to stop experience collection... (9850 times) [2024-03-21 01:40:56,947][04017] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-03-21 01:40:57,141][03995] Signal inference workers to resume experience collection... (9850 times) [2024-03-21 01:40:57,142][04017] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-03-21 01:40:59,840][04017] Updated weights for policy 0, policy_version 14995 (0.0022) [2024-03-21 01:41:00,521][03784] Fps is (10 sec: 65535.6, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 491388928. Throughput: 0: 46039.9. Samples: 492513600. 
Policy #0 lag: (min: 0.0, avg: 50.7, max: 113.0) [2024-03-21 01:41:00,522][03784] Avg episode reward: [(0, '1.212')] [2024-03-21 01:41:05,521][03784] Fps is (10 sec: 72088.0, 60 sec: 46967.3, 300 sec: 47541.3). Total num frames: 491651072. Throughput: 0: 46008.5. Samples: 492789200. Policy #0 lag: (min: 0.0, avg: 50.7, max: 113.0) [2024-03-21 01:41:05,523][03784] Avg episode reward: [(0, '1.441')] [2024-03-21 01:41:05,645][04017] Updated weights for policy 0, policy_version 15005 (0.0017) [2024-03-21 01:41:10,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46967.3, 300 sec: 47319.2). Total num frames: 491913216. Throughput: 0: 46586.4. Samples: 493065800. Policy #0 lag: (min: 3.0, avg: 44.3, max: 89.0) [2024-03-21 01:41:10,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 01:41:13,029][04017] Updated weights for policy 0, policy_version 15015 (0.0011) [2024-03-21 01:41:15,521][03784] Fps is (10 sec: 49153.1, 60 sec: 47513.5, 300 sec: 46319.5). Total num frames: 492142592. Throughput: 0: 46593.2. Samples: 493210200. Policy #0 lag: (min: 3.0, avg: 44.3, max: 89.0) [2024-03-21 01:41:15,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 01:41:19,907][04017] Updated weights for policy 0, policy_version 15025 (0.0023) [2024-03-21 01:41:20,521][03784] Fps is (10 sec: 42599.2, 60 sec: 47513.6, 300 sec: 46208.4). Total num frames: 492339200. Throughput: 0: 46415.5. Samples: 493492500. Policy #0 lag: (min: 0.0, avg: 48.6, max: 123.0) [2024-03-21 01:41:20,522][03784] Avg episode reward: [(0, '1.380')] [2024-03-21 01:41:25,521][03784] Fps is (10 sec: 32768.3, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 492470272. Throughput: 0: 45824.6. Samples: 493757500. 
Policy #0 lag: (min: 0.0, avg: 48.6, max: 123.0) [2024-03-21 01:41:25,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 01:41:29,048][04017] Updated weights for policy 0, policy_version 15035 (0.0022) [2024-03-21 01:41:30,521][03784] Fps is (10 sec: 39321.9, 60 sec: 47513.6, 300 sec: 46430.6). Total num frames: 492732416. Throughput: 0: 45995.7. Samples: 493895900. Policy #0 lag: (min: 3.0, avg: 37.4, max: 85.0) [2024-03-21 01:41:30,522][03784] Avg episode reward: [(0, '0.422')] [2024-03-21 01:41:35,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.4, 300 sec: 46430.6). Total num frames: 492896256. Throughput: 0: 45189.0. Samples: 494149800. Policy #0 lag: (min: 3.0, avg: 37.4, max: 85.0) [2024-03-21 01:41:35,522][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 01:41:37,758][04017] Updated weights for policy 0, policy_version 15045 (0.0020) [2024-03-21 01:41:40,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49152.1, 300 sec: 46430.6). Total num frames: 493191168. Throughput: 0: 44826.7. Samples: 494408600. Policy #0 lag: (min: 3.0, avg: 37.4, max: 85.0) [2024-03-21 01:41:40,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 01:41:43,781][04017] Updated weights for policy 0, policy_version 15055 (0.0015) [2024-03-21 01:41:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.6, 300 sec: 46541.7). Total num frames: 493355008. Throughput: 0: 45271.3. Samples: 494550800. Policy #0 lag: (min: 0.0, avg: 33.6, max: 80.0) [2024-03-21 01:41:45,522][03784] Avg episode reward: [(0, '1.149')] [2024-03-21 01:41:50,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 493486080. Throughput: 0: 45871.4. Samples: 494853400. Policy #0 lag: (min: 0.0, avg: 33.6, max: 80.0) [2024-03-21 01:41:50,522][03784] Avg episode reward: [(0, '0.639')] [2024-03-21 01:41:54,688][03995] Signal inference workers to stop experience collection... 
(9900 times) [2024-03-21 01:41:54,753][04017] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-03-21 01:41:54,762][03995] Signal inference workers to resume experience collection... (9900 times) [2024-03-21 01:41:54,804][04017] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-03-21 01:41:55,419][04017] Updated weights for policy 0, policy_version 15065 (0.0014) [2024-03-21 01:41:55,521][03784] Fps is (10 sec: 29490.3, 60 sec: 45328.9, 300 sec: 46319.5). Total num frames: 493649920. Throughput: 0: 46411.1. Samples: 495154300. Policy #0 lag: (min: 0.0, avg: 35.5, max: 84.0) [2024-03-21 01:41:55,523][03784] Avg episode reward: [(0, '0.966')] [2024-03-21 01:42:00,521][03784] Fps is (10 sec: 39320.6, 60 sec: 41506.0, 300 sec: 46208.4). Total num frames: 493879296. Throughput: 0: 46466.5. Samples: 495301200. Policy #0 lag: (min: 0.0, avg: 35.5, max: 84.0) [2024-03-21 01:42:00,524][03784] Avg episode reward: [(0, '0.365')] [2024-03-21 01:42:00,765][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015073_493912064.pth... [2024-03-21 01:42:00,894][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014739_482967552.pth [2024-03-21 01:42:01,523][04017] Updated weights for policy 0, policy_version 15075 (0.0019) [2024-03-21 01:42:05,521][03784] Fps is (10 sec: 62260.5, 60 sec: 43690.9, 300 sec: 46652.7). Total num frames: 494272512. Throughput: 0: 46322.2. Samples: 495577000. Policy #0 lag: (min: 1.0, avg: 35.6, max: 80.0) [2024-03-21 01:42:05,522][03784] Avg episode reward: [(0, '0.679')] [2024-03-21 01:42:06,708][04017] Updated weights for policy 0, policy_version 15085 (0.0010) [2024-03-21 01:42:10,521][03784] Fps is (10 sec: 68812.8, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 494567424. Throughput: 0: 46466.4. Samples: 495848500. 
Policy #0 lag: (min: 1.0, avg: 35.6, max: 80.0) [2024-03-21 01:42:10,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 01:42:11,123][04017] Updated weights for policy 0, policy_version 15095 (0.0016) [2024-03-21 01:42:15,521][03784] Fps is (10 sec: 65536.0, 60 sec: 46421.4, 300 sec: 47319.2). Total num frames: 494927872. Throughput: 0: 45855.5. Samples: 495959400. Policy #0 lag: (min: 1.0, avg: 35.6, max: 80.0) [2024-03-21 01:42:15,522][03784] Avg episode reward: [(0, '0.753')] [2024-03-21 01:42:15,703][04017] Updated weights for policy 0, policy_version 15105 (0.0012) [2024-03-21 01:42:20,521][03784] Fps is (10 sec: 55707.0, 60 sec: 46421.3, 300 sec: 47430.3). Total num frames: 495124480. Throughput: 0: 46722.2. Samples: 496252300. Policy #0 lag: (min: 0.0, avg: 41.5, max: 81.0) [2024-03-21 01:42:20,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 01:42:21,840][04017] Updated weights for policy 0, policy_version 15115 (0.0014) [2024-03-21 01:42:25,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48059.7, 300 sec: 46541.7). Total num frames: 495353856. Throughput: 0: 47382.2. Samples: 496540800. Policy #0 lag: (min: 0.0, avg: 41.5, max: 81.0) [2024-03-21 01:42:25,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 01:42:29,508][04017] Updated weights for policy 0, policy_version 15125 (0.0010) [2024-03-21 01:42:30,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 495648768. Throughput: 0: 47553.3. Samples: 496690700. Policy #0 lag: (min: 0.0, avg: 38.7, max: 78.0) [2024-03-21 01:42:30,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 01:42:35,521][03784] Fps is (10 sec: 49151.8, 60 sec: 49152.0, 300 sec: 46319.5). Total num frames: 495845376. Throughput: 0: 47335.5. Samples: 496983500. 
Policy #0 lag: (min: 0.0, avg: 38.7, max: 78.0) [2024-03-21 01:42:35,522][03784] Avg episode reward: [(0, '0.536')] [2024-03-21 01:42:39,465][04017] Updated weights for policy 0, policy_version 15135 (0.0016) [2024-03-21 01:42:40,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 495943680. Throughput: 0: 46642.4. Samples: 497253200. Policy #0 lag: (min: 0.0, avg: 38.7, max: 78.0) [2024-03-21 01:42:40,522][03784] Avg episode reward: [(0, '1.239')] [2024-03-21 01:42:41,741][03995] Signal inference workers to stop experience collection... (9950 times) [2024-03-21 01:42:41,742][03995] Signal inference workers to resume experience collection... (9950 times) [2024-03-21 01:42:41,825][04017] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-03-21 01:42:41,826][04017] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-03-21 01:42:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 496205824. Throughput: 0: 45951.4. Samples: 497369000. Policy #0 lag: (min: 0.0, avg: 29.7, max: 82.0) [2024-03-21 01:42:45,522][03784] Avg episode reward: [(0, '0.603')] [2024-03-21 01:42:46,305][04017] Updated weights for policy 0, policy_version 15145 (0.0012) [2024-03-21 01:42:50,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 496304128. Throughput: 0: 46620.0. Samples: 497674900. Policy #0 lag: (min: 0.0, avg: 29.7, max: 82.0) [2024-03-21 01:42:50,522][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 01:42:55,521][03784] Fps is (10 sec: 32768.1, 60 sec: 48059.9, 300 sec: 46097.4). Total num frames: 496533504. Throughput: 0: 46891.4. Samples: 497958600. 
Policy #0 lag: (min: 1.0, avg: 34.4, max: 85.0) [2024-03-21 01:42:55,522][03784] Avg episode reward: [(0, '0.468')] [2024-03-21 01:42:56,397][04017] Updated weights for policy 0, policy_version 15155 (0.0015) [2024-03-21 01:43:00,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48060.0, 300 sec: 45986.3). Total num frames: 496762880. Throughput: 0: 47340.1. Samples: 498089700. Policy #0 lag: (min: 1.0, avg: 34.4, max: 85.0) [2024-03-21 01:43:00,522][03784] Avg episode reward: [(0, '0.847')] [2024-03-21 01:43:03,751][04017] Updated weights for policy 0, policy_version 15165 (0.0012) [2024-03-21 01:43:05,521][03784] Fps is (10 sec: 49151.9, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 497025024. Throughput: 0: 46875.6. Samples: 498361700. Policy #0 lag: (min: 2.0, avg: 32.2, max: 82.0) [2024-03-21 01:43:05,522][03784] Avg episode reward: [(0, '0.928')] [2024-03-21 01:43:09,496][04017] Updated weights for policy 0, policy_version 15175 (0.0032) [2024-03-21 01:43:10,521][03784] Fps is (10 sec: 52429.0, 60 sec: 45329.3, 300 sec: 46319.5). Total num frames: 497287168. Throughput: 0: 46977.8. Samples: 498654800. Policy #0 lag: (min: 2.0, avg: 32.2, max: 82.0) [2024-03-21 01:43:10,530][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 01:43:15,272][04017] Updated weights for policy 0, policy_version 15185 (0.0016) [2024-03-21 01:43:15,521][03784] Fps is (10 sec: 55705.6, 60 sec: 44236.8, 300 sec: 46652.7). Total num frames: 497582080. Throughput: 0: 46933.3. Samples: 498802700. Policy #0 lag: (min: 2.0, avg: 32.2, max: 82.0) [2024-03-21 01:43:15,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 01:43:19,441][04017] Updated weights for policy 0, policy_version 15195 (0.0031) [2024-03-21 01:43:20,521][03784] Fps is (10 sec: 72089.0, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 498008064. Throughput: 0: 46731.1. Samples: 499086400. 
Policy #0 lag: (min: 3.0, avg: 40.6, max: 94.0) [2024-03-21 01:43:20,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 01:43:25,521][03784] Fps is (10 sec: 62259.0, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 498204672. Throughput: 0: 47082.3. Samples: 499371900. Policy #0 lag: (min: 3.0, avg: 40.6, max: 94.0) [2024-03-21 01:43:25,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 01:43:26,057][04017] Updated weights for policy 0, policy_version 15205 (0.0020) [2024-03-21 01:43:29,847][03995] Signal inference workers to stop experience collection... (10000 times) [2024-03-21 01:43:29,847][03995] Signal inference workers to resume experience collection... (10000 times) [2024-03-21 01:43:29,889][04017] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-03-21 01:43:29,889][04017] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-03-21 01:43:30,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 498368512. Throughput: 0: 47928.9. Samples: 499525800. Policy #0 lag: (min: 0.0, avg: 41.4, max: 77.0) [2024-03-21 01:43:30,522][03784] Avg episode reward: [(0, '0.545')] [2024-03-21 01:43:33,559][04017] Updated weights for policy 0, policy_version 15215 (0.0012) [2024-03-21 01:43:35,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 498565120. Throughput: 0: 47008.9. Samples: 499790300. Policy #0 lag: (min: 0.0, avg: 41.4, max: 77.0) [2024-03-21 01:43:35,522][03784] Avg episode reward: [(0, '1.507')] [2024-03-21 01:43:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 498761728. Throughput: 0: 47186.7. Samples: 500082000. 
Policy #0 lag: (min: 0.0, avg: 41.4, max: 77.0) [2024-03-21 01:43:40,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 01:43:43,975][04017] Updated weights for policy 0, policy_version 15225 (0.0019) [2024-03-21 01:43:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 498958336. Throughput: 0: 47560.0. Samples: 500229900. Policy #0 lag: (min: 0.0, avg: 36.7, max: 76.0) [2024-03-21 01:43:45,522][03784] Avg episode reward: [(0, '0.924')] [2024-03-21 01:43:48,730][04017] Updated weights for policy 0, policy_version 15235 (0.0016) [2024-03-21 01:43:50,521][03784] Fps is (10 sec: 58982.0, 60 sec: 50790.4, 300 sec: 46652.7). Total num frames: 499351552. Throughput: 0: 47666.6. Samples: 500506700. Policy #0 lag: (min: 0.0, avg: 36.7, max: 76.0) [2024-03-21 01:43:50,522][03784] Avg episode reward: [(0, '1.405')] [2024-03-21 01:43:53,069][04017] Updated weights for policy 0, policy_version 15245 (0.0015) [2024-03-21 01:43:55,521][03784] Fps is (10 sec: 62259.2, 60 sec: 50790.4, 300 sec: 46652.8). Total num frames: 499580928. Throughput: 0: 47277.7. Samples: 500782300. Policy #0 lag: (min: 0.0, avg: 36.5, max: 97.0) [2024-03-21 01:43:55,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 01:44:00,521][03784] Fps is (10 sec: 36045.0, 60 sec: 49152.0, 300 sec: 46208.4). Total num frames: 499712000. Throughput: 0: 47282.2. Samples: 500930400. Policy #0 lag: (min: 0.0, avg: 36.5, max: 97.0) [2024-03-21 01:44:00,522][03784] Avg episode reward: [(0, '0.734')] [2024-03-21 01:44:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015250_499712000.pth... [2024-03-21 01:44:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000014909_488538112.pth [2024-03-21 01:44:04,861][04017] Updated weights for policy 0, policy_version 15255 (0.0011) [2024-03-21 01:44:05,521][03784] Fps is (10 sec: 32767.9, 60 sec: 48059.7, 300 sec: 46097.4). 
Total num frames: 499908608. Throughput: 0: 46886.7. Samples: 501196300. Policy #0 lag: (min: 0.0, avg: 36.5, max: 97.0) [2024-03-21 01:44:05,522][03784] Avg episode reward: [(0, '1.037')] [2024-03-21 01:44:10,527][03784] Fps is (10 sec: 36024.6, 60 sec: 46417.0, 300 sec: 46207.6). Total num frames: 500072448. Throughput: 0: 46969.8. Samples: 501485800. Policy #0 lag: (min: 0.0, avg: 42.2, max: 93.0) [2024-03-21 01:44:10,527][03784] Avg episode reward: [(0, '1.277')] [2024-03-21 01:44:11,641][04017] Updated weights for policy 0, policy_version 15265 (0.0010) [2024-03-21 01:44:15,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 500400128. Throughput: 0: 46551.2. Samples: 501620600. Policy #0 lag: (min: 0.0, avg: 42.2, max: 93.0) [2024-03-21 01:44:15,522][03784] Avg episode reward: [(0, '0.532')] [2024-03-21 01:44:16,458][04017] Updated weights for policy 0, policy_version 15275 (0.0012) [2024-03-21 01:44:20,521][03784] Fps is (10 sec: 68850.7, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 500760576. Throughput: 0: 46648.8. Samples: 501889500. Policy #0 lag: (min: 3.0, avg: 40.8, max: 85.0) [2024-03-21 01:44:20,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 01:44:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 46652.8). Total num frames: 500826112. Throughput: 0: 47124.4. Samples: 502202600. Policy #0 lag: (min: 3.0, avg: 40.8, max: 85.0) [2024-03-21 01:44:25,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 01:44:25,950][04017] Updated weights for policy 0, policy_version 15285 (0.0018) [2024-03-21 01:44:25,980][03995] Signal inference workers to stop experience collection... (10050 times) [2024-03-21 01:44:26,090][04017] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-03-21 01:44:26,198][03995] Signal inference workers to resume experience collection... 
(10050 times) [2024-03-21 01:44:26,198][04017] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-03-21 01:44:30,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44236.8, 300 sec: 46652.7). Total num frames: 501022720. Throughput: 0: 46862.2. Samples: 502338700. Policy #0 lag: (min: 3.0, avg: 40.8, max: 85.0) [2024-03-21 01:44:30,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 01:44:32,532][04017] Updated weights for policy 0, policy_version 15295 (0.0012) [2024-03-21 01:44:35,521][03784] Fps is (10 sec: 58981.3, 60 sec: 47513.4, 300 sec: 46985.9). Total num frames: 501415936. Throughput: 0: 47139.8. Samples: 502628000. Policy #0 lag: (min: 0.0, avg: 27.8, max: 59.0) [2024-03-21 01:44:35,522][03784] Avg episode reward: [(0, '1.218')] [2024-03-21 01:44:37,249][04017] Updated weights for policy 0, policy_version 15305 (0.0016) [2024-03-21 01:44:40,521][03784] Fps is (10 sec: 55704.7, 60 sec: 46967.3, 300 sec: 46208.4). Total num frames: 501579776. Throughput: 0: 46915.3. Samples: 502893500. Policy #0 lag: (min: 0.0, avg: 27.8, max: 59.0) [2024-03-21 01:44:40,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 01:44:45,387][04017] Updated weights for policy 0, policy_version 15315 (0.0014) [2024-03-21 01:44:45,521][03784] Fps is (10 sec: 42599.3, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 501841920. Throughput: 0: 46342.2. Samples: 503015800. Policy #0 lag: (min: 0.0, avg: 32.6, max: 67.0) [2024-03-21 01:44:45,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 01:44:50,521][03784] Fps is (10 sec: 49153.2, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 502071296. Throughput: 0: 45855.6. Samples: 503259800. Policy #0 lag: (min: 0.0, avg: 32.6, max: 67.0) [2024-03-21 01:44:50,522][03784] Avg episode reward: [(0, '1.065')] [2024-03-21 01:44:51,478][04017] Updated weights for policy 0, policy_version 15325 (0.0018) [2024-03-21 01:44:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 46763.8). 
Total num frames: 502300672. Throughput: 0: 45345.6. Samples: 503526100. Policy #0 lag: (min: 0.0, avg: 40.4, max: 113.0) [2024-03-21 01:44:55,522][03784] Avg episode reward: [(0, '1.159')] [2024-03-21 01:44:58,307][04017] Updated weights for policy 0, policy_version 15335 (0.0011) [2024-03-21 01:45:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 502595584. Throughput: 0: 45415.5. Samples: 503664300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 113.0) [2024-03-21 01:45:00,522][03784] Avg episode reward: [(0, '1.120')] [2024-03-21 01:45:05,521][03784] Fps is (10 sec: 49151.4, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 502792192. Throughput: 0: 46157.7. Samples: 503966600. Policy #0 lag: (min: 0.0, avg: 40.4, max: 113.0) [2024-03-21 01:45:05,522][03784] Avg episode reward: [(0, '1.120')] [2024-03-21 01:45:05,602][04017] Updated weights for policy 0, policy_version 15345 (0.0011) [2024-03-21 01:45:10,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48610.4, 300 sec: 46430.6). Total num frames: 502988800. Throughput: 0: 45411.1. Samples: 504246100. Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 01:45:10,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 01:45:15,521][03784] Fps is (10 sec: 22937.8, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 503021568. Throughput: 0: 45835.5. Samples: 504401300. Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 01:45:15,522][03784] Avg episode reward: [(0, '0.959')] [2024-03-21 01:45:18,413][04017] Updated weights for policy 0, policy_version 15355 (0.0010) [2024-03-21 01:45:20,521][03784] Fps is (10 sec: 26214.1, 60 sec: 41506.1, 300 sec: 45541.9). Total num frames: 503250944. Throughput: 0: 45784.5. Samples: 504688300. 
Policy #0 lag: (min: 0.0, avg: 20.7, max: 75.0) [2024-03-21 01:45:20,522][03784] Avg episode reward: [(0, '1.194')] [2024-03-21 01:45:25,487][04017] Updated weights for policy 0, policy_version 15365 (0.0017) [2024-03-21 01:45:25,521][03784] Fps is (10 sec: 45874.6, 60 sec: 44236.7, 300 sec: 46097.3). Total num frames: 503480320. Throughput: 0: 45697.8. Samples: 504949900. Policy #0 lag: (min: 0.0, avg: 20.7, max: 75.0) [2024-03-21 01:45:25,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 01:45:26,199][03995] Signal inference workers to stop experience collection... (10100 times) [2024-03-21 01:45:26,237][04017] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-03-21 01:45:26,519][03995] Signal inference workers to resume experience collection... (10100 times) [2024-03-21 01:45:26,519][04017] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-03-21 01:45:29,138][04017] Updated weights for policy 0, policy_version 15375 (0.0013) [2024-03-21 01:45:30,521][03784] Fps is (10 sec: 68813.8, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 503939072. Throughput: 0: 45448.9. Samples: 505061000. Policy #0 lag: (min: 3.0, avg: 38.5, max: 75.0) [2024-03-21 01:45:30,522][03784] Avg episode reward: [(0, '0.762')] [2024-03-21 01:45:33,216][04017] Updated weights for policy 0, policy_version 15385 (0.0020) [2024-03-21 01:45:35,521][03784] Fps is (10 sec: 78644.1, 60 sec: 47513.7, 300 sec: 47541.4). Total num frames: 504266752. Throughput: 0: 46133.2. Samples: 505335800. Policy #0 lag: (min: 3.0, avg: 38.5, max: 75.0) [2024-03-21 01:45:35,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 01:45:40,521][03784] Fps is (10 sec: 45874.7, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 504397824. Throughput: 0: 46646.6. Samples: 505625200. 
Policy #0 lag: (min: 3.0, avg: 38.5, max: 75.0) [2024-03-21 01:45:40,522][03784] Avg episode reward: [(0, '0.834')] [2024-03-21 01:45:41,137][04017] Updated weights for policy 0, policy_version 15395 (0.0013) [2024-03-21 01:45:45,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 504561664. Throughput: 0: 47037.8. Samples: 505781000. Policy #0 lag: (min: 1.0, avg: 35.3, max: 62.0) [2024-03-21 01:45:45,522][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 01:45:49,611][04017] Updated weights for policy 0, policy_version 15405 (0.0009) [2024-03-21 01:45:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 504856576. Throughput: 0: 46635.7. Samples: 506065200. Policy #0 lag: (min: 1.0, avg: 35.3, max: 62.0) [2024-03-21 01:45:50,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 01:45:55,521][03784] Fps is (10 sec: 52429.1, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 505085952. Throughput: 0: 46520.0. Samples: 506339500. Policy #0 lag: (min: 0.0, avg: 50.1, max: 120.0) [2024-03-21 01:45:55,523][03784] Avg episode reward: [(0, '1.127')] [2024-03-21 01:45:55,970][04017] Updated weights for policy 0, policy_version 15415 (0.0013) [2024-03-21 01:46:00,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 46208.5). Total num frames: 505282560. Throughput: 0: 45562.2. Samples: 506451600. Policy #0 lag: (min: 0.0, avg: 50.1, max: 120.0) [2024-03-21 01:46:00,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 01:46:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015420_505282560.pth... [2024-03-21 01:46:00,653][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015073_493912064.pth [2024-03-21 01:46:05,520][04017] Updated weights for policy 0, policy_version 15425 (0.0018) [2024-03-21 01:46:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.9, 300 sec: 45875.2). 
Total num frames: 505446400. Throughput: 0: 44946.8. Samples: 506710900. Policy #0 lag: (min: 1.0, avg: 47.7, max: 100.0) [2024-03-21 01:46:05,522][03784] Avg episode reward: [(0, '0.447')] [2024-03-21 01:46:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 505708544. Throughput: 0: 44413.5. Samples: 506948500. Policy #0 lag: (min: 1.0, avg: 47.7, max: 100.0) [2024-03-21 01:46:10,522][03784] Avg episode reward: [(0, '0.887')] [2024-03-21 01:46:12,604][04017] Updated weights for policy 0, policy_version 15435 (0.0014) [2024-03-21 01:46:14,455][03995] Signal inference workers to stop experience collection... (10150 times) [2024-03-21 01:46:14,456][03995] Signal inference workers to resume experience collection... (10150 times) [2024-03-21 01:46:14,543][04017] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-03-21 01:46:14,543][04017] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-03-21 01:46:15,521][03784] Fps is (10 sec: 55705.8, 60 sec: 49698.2, 300 sec: 46319.5). Total num frames: 506003456. Throughput: 0: 45115.6. Samples: 507091200. Policy #0 lag: (min: 1.0, avg: 47.7, max: 100.0) [2024-03-21 01:46:15,522][03784] Avg episode reward: [(0, '0.712')] [2024-03-21 01:46:19,016][04017] Updated weights for policy 0, policy_version 15445 (0.0010) [2024-03-21 01:46:20,521][03784] Fps is (10 sec: 39321.3, 60 sec: 47513.7, 300 sec: 46208.4). Total num frames: 506101760. Throughput: 0: 45357.8. Samples: 507376900. Policy #0 lag: (min: 0.0, avg: 51.6, max: 107.0) [2024-03-21 01:46:20,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 01:46:24,990][04017] Updated weights for policy 0, policy_version 15455 (0.0013) [2024-03-21 01:46:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 49698.3, 300 sec: 46541.7). Total num frames: 506462208. Throughput: 0: 45086.8. Samples: 507654100. 
Policy #0 lag: (min: 0.0, avg: 51.6, max: 107.0) [2024-03-21 01:46:25,522][03784] Avg episode reward: [(0, '0.813')] [2024-03-21 01:46:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 46319.5). Total num frames: 506560512. Throughput: 0: 45224.4. Samples: 507816100. Policy #0 lag: (min: 0.0, avg: 40.4, max: 76.0) [2024-03-21 01:46:30,522][03784] Avg episode reward: [(0, '0.813')] [2024-03-21 01:46:35,146][04017] Updated weights for policy 0, policy_version 15465 (0.0011) [2024-03-21 01:46:35,521][03784] Fps is (10 sec: 32767.5, 60 sec: 42052.2, 300 sec: 46097.3). Total num frames: 506789888. Throughput: 0: 45782.1. Samples: 508125400. Policy #0 lag: (min: 0.0, avg: 40.4, max: 76.0) [2024-03-21 01:46:35,522][03784] Avg episode reward: [(0, '1.563')] [2024-03-21 01:46:40,521][03784] Fps is (10 sec: 29491.5, 60 sec: 40960.1, 300 sec: 45764.1). Total num frames: 506855424. Throughput: 0: 46200.0. Samples: 508418500. Policy #0 lag: (min: 0.0, avg: 40.4, max: 76.0) [2024-03-21 01:46:40,522][03784] Avg episode reward: [(0, '1.563')] [2024-03-21 01:46:44,551][04017] Updated weights for policy 0, policy_version 15475 (0.0022) [2024-03-21 01:46:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 46430.6). Total num frames: 507183104. Throughput: 0: 46975.5. Samples: 508565500. Policy #0 lag: (min: 0.0, avg: 36.8, max: 103.0) [2024-03-21 01:46:45,522][03784] Avg episode reward: [(0, '0.443')] [2024-03-21 01:46:48,382][04017] Updated weights for policy 0, policy_version 15485 (0.0016) [2024-03-21 01:46:50,521][03784] Fps is (10 sec: 62258.7, 60 sec: 43690.6, 300 sec: 46874.9). Total num frames: 507478016. Throughput: 0: 47044.4. Samples: 508827900. Policy #0 lag: (min: 0.0, avg: 36.8, max: 103.0) [2024-03-21 01:46:50,522][03784] Avg episode reward: [(0, '0.827')] [2024-03-21 01:46:54,450][04017] Updated weights for policy 0, policy_version 15495 (0.0009) [2024-03-21 01:46:55,521][03784] Fps is (10 sec: 58983.1, 60 sec: 44782.9, 300 sec: 47097.1). 
Total num frames: 507772928. Throughput: 0: 47933.3. Samples: 509105500. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 01:46:55,522][03784] Avg episode reward: [(0, '0.827')] [2024-03-21 01:46:58,811][04017] Updated weights for policy 0, policy_version 15505 (0.0011) [2024-03-21 01:47:00,123][03995] Signal inference workers to stop experience collection... (10200 times) [2024-03-21 01:47:00,183][03995] Signal inference workers to resume experience collection... (10200 times) [2024-03-21 01:47:00,203][04017] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-03-21 01:47:00,253][04017] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-03-21 01:47:00,521][03784] Fps is (10 sec: 75367.2, 60 sec: 49152.1, 300 sec: 47319.2). Total num frames: 508231680. Throughput: 0: 47562.2. Samples: 509231500. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 01:47:00,522][03784] Avg episode reward: [(0, '1.037')] [2024-03-21 01:47:04,882][04017] Updated weights for policy 0, policy_version 15515 (0.0011) [2024-03-21 01:47:05,521][03784] Fps is (10 sec: 68812.6, 60 sec: 50244.2, 300 sec: 47097.1). Total num frames: 508461056. Throughput: 0: 47742.3. Samples: 509525300. Policy #0 lag: (min: 0.0, avg: 43.0, max: 74.0) [2024-03-21 01:47:05,522][03784] Avg episode reward: [(0, '0.662')] [2024-03-21 01:47:09,949][04017] Updated weights for policy 0, policy_version 15525 (0.0017) [2024-03-21 01:47:10,521][03784] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 46874.9). Total num frames: 508755968. Throughput: 0: 47413.2. Samples: 509787700. Policy #0 lag: (min: 0.0, avg: 43.0, max: 74.0) [2024-03-21 01:47:10,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 01:47:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 508887040. Throughput: 0: 47122.3. Samples: 509936600. 
Policy #0 lag: (min: 0.0, avg: 43.0, max: 74.0) [2024-03-21 01:47:15,522][03784] Avg episode reward: [(0, '0.950')] [2024-03-21 01:47:19,676][04017] Updated weights for policy 0, policy_version 15535 (0.0011) [2024-03-21 01:47:20,521][03784] Fps is (10 sec: 29491.4, 60 sec: 49152.1, 300 sec: 46430.6). Total num frames: 509050880. Throughput: 0: 46435.7. Samples: 510215000. Policy #0 lag: (min: 0.0, avg: 49.0, max: 84.0) [2024-03-21 01:47:20,522][03784] Avg episode reward: [(0, '1.194')] [2024-03-21 01:47:25,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 509247488. Throughput: 0: 45860.0. Samples: 510482200. Policy #0 lag: (min: 0.0, avg: 49.0, max: 84.0) [2024-03-21 01:47:25,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 01:47:30,521][03784] Fps is (10 sec: 22937.5, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 509280256. Throughput: 0: 45791.2. Samples: 510626100. Policy #0 lag: (min: 0.0, avg: 49.0, max: 84.0) [2024-03-21 01:47:30,522][03784] Avg episode reward: [(0, '0.587')] [2024-03-21 01:47:34,273][04017] Updated weights for policy 0, policy_version 15545 (0.0012) [2024-03-21 01:47:35,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 509476864. Throughput: 0: 46628.9. Samples: 510926200. Policy #0 lag: (min: 0.0, avg: 41.5, max: 85.0) [2024-03-21 01:47:35,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 01:47:40,285][04017] Updated weights for policy 0, policy_version 15555 (0.0011) [2024-03-21 01:47:40,521][03784] Fps is (10 sec: 42598.5, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 509706240. Throughput: 0: 46915.5. Samples: 511216700. Policy #0 lag: (min: 0.0, avg: 41.5, max: 85.0) [2024-03-21 01:47:40,522][03784] Avg episode reward: [(0, '1.333')] [2024-03-21 01:47:44,620][04017] Updated weights for policy 0, policy_version 15565 (0.0024) [2024-03-21 01:47:45,521][03784] Fps is (10 sec: 62258.9, 60 sec: 48605.9, 300 sec: 46763.8). 
Total num frames: 510099456. Throughput: 0: 47211.0. Samples: 511356000. Policy #0 lag: (min: 2.0, avg: 26.7, max: 66.0) [2024-03-21 01:47:45,522][03784] Avg episode reward: [(0, '0.892')] [2024-03-21 01:47:50,521][03784] Fps is (10 sec: 62259.5, 60 sec: 47513.7, 300 sec: 46763.8). Total num frames: 510328832. Throughput: 0: 47046.7. Samples: 511642400. Policy #0 lag: (min: 2.0, avg: 26.7, max: 66.0) [2024-03-21 01:47:50,522][03784] Avg episode reward: [(0, '1.336')] [2024-03-21 01:47:50,703][04017] Updated weights for policy 0, policy_version 15575 (0.0011) [2024-03-21 01:47:55,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 510590976. Throughput: 0: 47682.3. Samples: 511933400. Policy #0 lag: (min: 2.0, avg: 26.7, max: 66.0) [2024-03-21 01:47:55,522][03784] Avg episode reward: [(0, '1.336')] [2024-03-21 01:47:56,790][04017] Updated weights for policy 0, policy_version 15585 (0.0011) [2024-03-21 01:48:00,521][03784] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 46763.8). Total num frames: 510820352. Throughput: 0: 47544.4. Samples: 512076100. Policy #0 lag: (min: 0.0, avg: 36.7, max: 68.0) [2024-03-21 01:48:00,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 01:48:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015589_510820352.pth... [2024-03-21 01:48:00,666][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015250_499712000.pth [2024-03-21 01:48:02,648][03995] Signal inference workers to stop experience collection... (10250 times) [2024-03-21 01:48:02,649][03995] Signal inference workers to resume experience collection... 
(10250 times) [2024-03-21 01:48:02,706][04017] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-03-21 01:48:02,706][04017] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-03-21 01:48:03,647][04017] Updated weights for policy 0, policy_version 15595 (0.0020) [2024-03-21 01:48:05,521][03784] Fps is (10 sec: 55705.2, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 511148032. Throughput: 0: 47426.6. Samples: 512349200. Policy #0 lag: (min: 0.0, avg: 36.7, max: 68.0) [2024-03-21 01:48:05,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 01:48:08,413][04017] Updated weights for policy 0, policy_version 15605 (0.0011) [2024-03-21 01:48:10,521][03784] Fps is (10 sec: 65535.4, 60 sec: 45329.0, 300 sec: 47097.0). Total num frames: 511475712. Throughput: 0: 47224.3. Samples: 512607300. Policy #0 lag: (min: 0.0, avg: 46.1, max: 82.0) [2024-03-21 01:48:10,522][03784] Avg episode reward: [(0, '0.860')] [2024-03-21 01:48:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 511574016. Throughput: 0: 47271.1. Samples: 512753300. Policy #0 lag: (min: 0.0, avg: 46.1, max: 82.0) [2024-03-21 01:48:15,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 01:48:16,454][04017] Updated weights for policy 0, policy_version 15615 (0.0010) [2024-03-21 01:48:20,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 511770624. Throughput: 0: 47253.3. Samples: 513052600. Policy #0 lag: (min: 0.0, avg: 46.1, max: 82.0) [2024-03-21 01:48:20,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 01:48:24,340][04017] Updated weights for policy 0, policy_version 15625 (0.0010) [2024-03-21 01:48:25,521][03784] Fps is (10 sec: 52429.4, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 512098304. Throughput: 0: 47257.8. Samples: 513343300. 
Policy #0 lag: (min: 0.0, avg: 51.0, max: 117.0) [2024-03-21 01:48:25,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 01:48:29,348][04017] Updated weights for policy 0, policy_version 15635 (0.0013) [2024-03-21 01:48:30,521][03784] Fps is (10 sec: 55706.0, 60 sec: 50790.4, 300 sec: 46652.7). Total num frames: 512327680. Throughput: 0: 47144.5. Samples: 513477500. Policy #0 lag: (min: 0.0, avg: 51.0, max: 117.0) [2024-03-21 01:48:30,522][03784] Avg episode reward: [(0, '0.975')] [2024-03-21 01:48:35,521][03784] Fps is (10 sec: 39321.1, 60 sec: 50244.2, 300 sec: 46541.7). Total num frames: 512491520. Throughput: 0: 46757.7. Samples: 513746500. Policy #0 lag: (min: 0.0, avg: 41.8, max: 85.0) [2024-03-21 01:48:35,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 01:48:39,069][04017] Updated weights for policy 0, policy_version 15645 (0.0011) [2024-03-21 01:48:40,521][03784] Fps is (10 sec: 36044.7, 60 sec: 49698.1, 300 sec: 46541.7). Total num frames: 512688128. Throughput: 0: 46600.0. Samples: 514030400. Policy #0 lag: (min: 0.0, avg: 41.8, max: 85.0) [2024-03-21 01:48:40,522][03784] Avg episode reward: [(0, '0.675')] [2024-03-21 01:48:45,371][04017] Updated weights for policy 0, policy_version 15655 (0.0015) [2024-03-21 01:48:45,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48059.8, 300 sec: 46208.5). Total num frames: 512983040. Throughput: 0: 46582.3. Samples: 514172300. Policy #0 lag: (min: 0.0, avg: 25.7, max: 69.0) [2024-03-21 01:48:45,521][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 01:48:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 513114112. Throughput: 0: 46917.8. Samples: 514460500. Policy #0 lag: (min: 0.0, avg: 25.7, max: 69.0) [2024-03-21 01:48:50,522][03784] Avg episode reward: [(0, '0.764')] [2024-03-21 01:48:53,798][04017] Updated weights for policy 0, policy_version 15665 (0.0015) [2024-03-21 01:48:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 46967.4, 300 sec: 46430.6). 
Total num frames: 513409024. Throughput: 0: 46960.1. Samples: 514720500. Policy #0 lag: (min: 0.0, avg: 25.7, max: 69.0) [2024-03-21 01:48:55,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 01:48:55,916][03995] Signal inference workers to stop experience collection... (10300 times) [2024-03-21 01:48:55,978][04017] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-03-21 01:48:55,981][03995] Signal inference workers to resume experience collection... (10300 times) [2024-03-21 01:48:56,023][04017] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-03-21 01:48:58,927][04017] Updated weights for policy 0, policy_version 15675 (0.0011) [2024-03-21 01:49:00,521][03784] Fps is (10 sec: 55705.4, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 513671168. Throughput: 0: 46855.6. Samples: 514861800. Policy #0 lag: (min: 1.0, avg: 49.7, max: 118.0) [2024-03-21 01:49:00,522][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 01:49:05,521][03784] Fps is (10 sec: 49151.3, 60 sec: 45875.1, 300 sec: 46875.8). Total num frames: 513900544. Throughput: 0: 46668.8. Samples: 515152700. Policy #0 lag: (min: 1.0, avg: 49.7, max: 118.0) [2024-03-21 01:49:05,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 01:49:08,047][04017] Updated weights for policy 0, policy_version 15685 (0.0015) [2024-03-21 01:49:10,521][03784] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 46319.5). Total num frames: 514064384. Throughput: 0: 46968.7. Samples: 515456900. Policy #0 lag: (min: 0.0, avg: 41.2, max: 86.0) [2024-03-21 01:49:10,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 01:49:13,694][04017] Updated weights for policy 0, policy_version 15695 (0.0012) [2024-03-21 01:49:15,521][03784] Fps is (10 sec: 49152.5, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 514392064. Throughput: 0: 46911.1. Samples: 515588500. 
Policy #0 lag: (min: 0.0, avg: 41.2, max: 86.0) [2024-03-21 01:49:15,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 01:49:19,502][04017] Updated weights for policy 0, policy_version 15705 (0.0012) [2024-03-21 01:49:20,521][03784] Fps is (10 sec: 58983.2, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 514654208. Throughput: 0: 46795.7. Samples: 515852300. Policy #0 lag: (min: 0.0, avg: 41.2, max: 86.0) [2024-03-21 01:49:20,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 01:49:25,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 514916352. Throughput: 0: 45788.9. Samples: 516090900. Policy #0 lag: (min: 0.0, avg: 47.6, max: 102.0) [2024-03-21 01:49:25,522][03784] Avg episode reward: [(0, '1.160')] [2024-03-21 01:49:26,257][04017] Updated weights for policy 0, policy_version 15715 (0.0017) [2024-03-21 01:49:30,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46967.4, 300 sec: 46541.7). Total num frames: 515145728. Throughput: 0: 45873.2. Samples: 516236600. Policy #0 lag: (min: 0.0, avg: 47.6, max: 102.0) [2024-03-21 01:49:30,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 01:49:32,260][04017] Updated weights for policy 0, policy_version 15725 (0.0012) [2024-03-21 01:49:35,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.6, 300 sec: 46541.7). Total num frames: 515309568. Throughput: 0: 45668.9. Samples: 516515600. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 01:49:35,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 01:49:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 515506176. Throughput: 0: 46211.1. Samples: 516800000. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 01:49:40,522][03784] Avg episode reward: [(0, '1.036')] [2024-03-21 01:49:41,671][04017] Updated weights for policy 0, policy_version 15735 (0.0016) [2024-03-21 01:49:45,521][03784] Fps is (10 sec: 52428.4, 60 sec: 47513.5, 300 sec: 46652.7). 
Total num frames: 515833856. Throughput: 0: 45980.0. Samples: 516930900. Policy #0 lag: (min: 1.0, avg: 42.1, max: 118.0) [2024-03-21 01:49:45,522][03784] Avg episode reward: [(0, '0.995')] [2024-03-21 01:49:48,447][04017] Updated weights for policy 0, policy_version 15745 (0.0019) [2024-03-21 01:49:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 515964928. Throughput: 0: 45782.4. Samples: 517212900. Policy #0 lag: (min: 1.0, avg: 42.1, max: 118.0) [2024-03-21 01:49:50,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 01:49:53,442][03995] Signal inference workers to stop experience collection... (10350 times) [2024-03-21 01:49:53,442][03995] Signal inference workers to resume experience collection... (10350 times) [2024-03-21 01:49:53,501][04017] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-03-21 01:49:53,501][04017] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-03-21 01:49:55,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 516161536. Throughput: 0: 45377.8. Samples: 517498900. Policy #0 lag: (min: 1.0, avg: 42.1, max: 118.0) [2024-03-21 01:49:55,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 01:49:59,645][04017] Updated weights for policy 0, policy_version 15755 (0.0010) [2024-03-21 01:50:00,521][03784] Fps is (10 sec: 32767.7, 60 sec: 43690.7, 300 sec: 45764.1). Total num frames: 516292608. Throughput: 0: 45877.8. Samples: 517653000. Policy #0 lag: (min: 0.0, avg: 30.8, max: 73.0) [2024-03-21 01:50:00,522][03784] Avg episode reward: [(0, '1.318')] [2024-03-21 01:50:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015756_516292608.pth... [2024-03-21 01:50:00,662][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015420_505282560.pth [2024-03-21 01:50:05,521][03784] Fps is (10 sec: 32768.4, 60 sec: 43144.6, 300 sec: 45764.1). 
Total num frames: 516489216. Throughput: 0: 46133.3. Samples: 517928300. Policy #0 lag: (min: 0.0, avg: 30.8, max: 73.0) [2024-03-21 01:50:05,522][03784] Avg episode reward: [(0, '0.939')] [2024-03-21 01:50:06,657][04017] Updated weights for policy 0, policy_version 15765 (0.0011) [2024-03-21 01:50:10,136][04017] Updated weights for policy 0, policy_version 15775 (0.0010) [2024-03-21 01:50:10,521][03784] Fps is (10 sec: 62258.0, 60 sec: 47513.5, 300 sec: 47097.0). Total num frames: 516915200. Throughput: 0: 46490.9. Samples: 518183000. Policy #0 lag: (min: 5.0, avg: 35.0, max: 70.0) [2024-03-21 01:50:10,522][03784] Avg episode reward: [(0, '0.775')] [2024-03-21 01:50:15,521][03784] Fps is (10 sec: 72089.6, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 517210112. Throughput: 0: 46075.6. Samples: 518310000. Policy #0 lag: (min: 5.0, avg: 35.0, max: 70.0) [2024-03-21 01:50:15,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 01:50:15,610][04017] Updated weights for policy 0, policy_version 15785 (0.0018) [2024-03-21 01:50:20,521][03784] Fps is (10 sec: 55706.4, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 517472256. Throughput: 0: 46362.1. Samples: 518601900. Policy #0 lag: (min: 5.0, avg: 35.0, max: 70.0) [2024-03-21 01:50:20,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 01:50:25,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43144.5, 300 sec: 45986.3). Total num frames: 517505024. Throughput: 0: 46642.2. Samples: 518898900. Policy #0 lag: (min: 0.0, avg: 35.0, max: 77.0) [2024-03-21 01:50:25,522][03784] Avg episode reward: [(0, '0.514')] [2024-03-21 01:50:27,552][04017] Updated weights for policy 0, policy_version 15795 (0.0011) [2024-03-21 01:50:30,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43690.7, 300 sec: 45764.1). Total num frames: 517767168. Throughput: 0: 46788.9. Samples: 519036400. 
Policy #0 lag: (min: 0.0, avg: 35.0, max: 77.0) [2024-03-21 01:50:30,522][03784] Avg episode reward: [(0, '1.348')] [2024-03-21 01:50:34,718][04017] Updated weights for policy 0, policy_version 15805 (0.0012) [2024-03-21 01:50:35,521][03784] Fps is (10 sec: 45875.6, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 517963776. Throughput: 0: 46364.5. Samples: 519299300. Policy #0 lag: (min: 0.0, avg: 35.0, max: 77.0) [2024-03-21 01:50:35,522][03784] Avg episode reward: [(0, '0.658')] [2024-03-21 01:50:40,521][03784] Fps is (10 sec: 32768.2, 60 sec: 43144.6, 300 sec: 45875.2). Total num frames: 518094848. Throughput: 0: 45346.8. Samples: 519539500. Policy #0 lag: (min: 0.0, avg: 49.3, max: 114.0) [2024-03-21 01:50:40,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 01:50:41,889][04017] Updated weights for policy 0, policy_version 15815 (0.0014) [2024-03-21 01:50:43,453][03995] Signal inference workers to stop experience collection... (10400 times) [2024-03-21 01:50:43,454][03995] Signal inference workers to resume experience collection... (10400 times) [2024-03-21 01:50:43,512][04017] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-03-21 01:50:43,512][04017] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-03-21 01:50:45,521][03784] Fps is (10 sec: 55705.4, 60 sec: 44783.0, 300 sec: 46319.5). Total num frames: 518520832. Throughput: 0: 44226.7. Samples: 519643200. Policy #0 lag: (min: 0.0, avg: 49.3, max: 114.0) [2024-03-21 01:50:45,522][03784] Avg episode reward: [(0, '1.160')] [2024-03-21 01:50:45,571][04017] Updated weights for policy 0, policy_version 15825 (0.0021) [2024-03-21 01:50:49,812][04017] Updated weights for policy 0, policy_version 15835 (0.0015) [2024-03-21 01:50:50,521][03784] Fps is (10 sec: 85197.1, 60 sec: 49698.1, 300 sec: 46986.0). Total num frames: 518946816. Throughput: 0: 43791.2. Samples: 519898900. 
Policy #0 lag: (min: 0.0, avg: 53.8, max: 115.0) [2024-03-21 01:50:50,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 01:50:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48059.9, 300 sec: 46652.8). Total num frames: 519045120. Throughput: 0: 44669.2. Samples: 520193100. Policy #0 lag: (min: 0.0, avg: 53.8, max: 115.0) [2024-03-21 01:50:55,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 01:50:58,783][04017] Updated weights for policy 0, policy_version 15845 (0.0009) [2024-03-21 01:51:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 50244.3, 300 sec: 46986.0). Total num frames: 519307264. Throughput: 0: 45235.6. Samples: 520345600. Policy #0 lag: (min: 0.0, avg: 53.8, max: 115.0) [2024-03-21 01:51:00,522][03784] Avg episode reward: [(0, '1.085')] [2024-03-21 01:51:05,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 519405568. Throughput: 0: 45482.3. Samples: 520648600. Policy #0 lag: (min: 0.0, avg: 34.6, max: 79.0) [2024-03-21 01:51:05,522][03784] Avg episode reward: [(0, '1.256')] [2024-03-21 01:51:10,521][03784] Fps is (10 sec: 16384.0, 60 sec: 42598.6, 300 sec: 45653.0). Total num frames: 519471104. Throughput: 0: 45797.8. Samples: 520959800. Policy #0 lag: (min: 0.0, avg: 34.6, max: 79.0) [2024-03-21 01:51:10,522][03784] Avg episode reward: [(0, '1.256')] [2024-03-21 01:51:11,700][04017] Updated weights for policy 0, policy_version 15855 (0.0010) [2024-03-21 01:51:15,521][03784] Fps is (10 sec: 26214.5, 60 sec: 40960.0, 300 sec: 45986.3). Total num frames: 519667712. Throughput: 0: 45733.4. Samples: 521094400. Policy #0 lag: (min: 0.0, avg: 34.6, max: 79.0) [2024-03-21 01:51:15,522][03784] Avg episode reward: [(0, '0.615')] [2024-03-21 01:51:20,521][03784] Fps is (10 sec: 29491.1, 60 sec: 38229.4, 300 sec: 45097.6). Total num frames: 519766016. Throughput: 0: 46097.7. Samples: 521373700. 
Policy #0 lag: (min: 0.0, avg: 20.3, max: 65.0) [2024-03-21 01:51:20,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 01:51:21,931][04017] Updated weights for policy 0, policy_version 15865 (0.0010) [2024-03-21 01:51:25,521][03784] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 45875.2). Total num frames: 520093696. Throughput: 0: 47277.9. Samples: 521667000. Policy #0 lag: (min: 0.0, avg: 20.3, max: 65.0) [2024-03-21 01:51:25,522][03784] Avg episode reward: [(0, '0.551')] [2024-03-21 01:51:26,119][04017] Updated weights for policy 0, policy_version 15875 (0.0025) [2024-03-21 01:51:30,521][03784] Fps is (10 sec: 72090.1, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 520486912. Throughput: 0: 47820.0. Samples: 521795100. Policy #0 lag: (min: 4.0, avg: 49.9, max: 115.0) [2024-03-21 01:51:30,522][03784] Avg episode reward: [(0, '1.096')] [2024-03-21 01:51:30,699][04017] Updated weights for policy 0, policy_version 15885 (0.0016) [2024-03-21 01:51:32,839][03995] Signal inference workers to stop experience collection... (10450 times) [2024-03-21 01:51:32,892][04017] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-03-21 01:51:32,907][03995] Signal inference workers to resume experience collection... (10450 times) [2024-03-21 01:51:32,938][04017] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-03-21 01:51:34,562][04017] Updated weights for policy 0, policy_version 15895 (0.0024) [2024-03-21 01:51:35,521][03784] Fps is (10 sec: 85195.9, 60 sec: 49698.1, 300 sec: 47763.5). Total num frames: 520945664. Throughput: 0: 48173.3. Samples: 522066700. Policy #0 lag: (min: 4.0, avg: 49.9, max: 115.0) [2024-03-21 01:51:35,522][03784] Avg episode reward: [(0, '0.879')] [2024-03-21 01:51:40,521][03784] Fps is (10 sec: 62258.7, 60 sec: 50244.2, 300 sec: 47208.1). Total num frames: 521109504. Throughput: 0: 47644.4. Samples: 522337100. 
Policy #0 lag: (min: 4.0, avg: 49.9, max: 115.0) [2024-03-21 01:51:40,522][03784] Avg episode reward: [(0, '1.177')] [2024-03-21 01:51:42,280][04017] Updated weights for policy 0, policy_version 15905 (0.0010) [2024-03-21 01:51:45,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48059.6, 300 sec: 47208.1). Total num frames: 521404416. Throughput: 0: 47479.9. Samples: 522482200. Policy #0 lag: (min: 0.0, avg: 40.3, max: 66.0) [2024-03-21 01:51:45,522][03784] Avg episode reward: [(0, '0.597')] [2024-03-21 01:51:46,538][04017] Updated weights for policy 0, policy_version 15915 (0.0018) [2024-03-21 01:51:50,521][03784] Fps is (10 sec: 58982.9, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 521699328. Throughput: 0: 46644.5. Samples: 522747600. Policy #0 lag: (min: 0.0, avg: 40.3, max: 66.0) [2024-03-21 01:51:50,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 01:51:53,454][04017] Updated weights for policy 0, policy_version 15925 (0.0011) [2024-03-21 01:51:55,521][03784] Fps is (10 sec: 45875.8, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 521863168. Throughput: 0: 45926.7. Samples: 523026500. Policy #0 lag: (min: 0.0, avg: 48.0, max: 93.0) [2024-03-21 01:51:55,522][03784] Avg episode reward: [(0, '0.941')] [2024-03-21 01:52:00,521][03784] Fps is (10 sec: 36044.3, 60 sec: 45875.1, 300 sec: 46097.3). Total num frames: 522059776. Throughput: 0: 46142.1. Samples: 523170800. Policy #0 lag: (min: 0.0, avg: 48.0, max: 93.0) [2024-03-21 01:52:00,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 01:52:00,542][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015932_522059776.pth... [2024-03-21 01:52:00,675][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015589_510820352.pth [2024-03-21 01:52:05,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 522125312. Throughput: 0: 46508.9. Samples: 523466600. 
Policy #0 lag: (min: 0.0, avg: 48.0, max: 93.0) [2024-03-21 01:52:05,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 01:52:06,136][04017] Updated weights for policy 0, policy_version 15935 (0.0012) [2024-03-21 01:52:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 48059.7, 300 sec: 45653.0). Total num frames: 522354688. Throughput: 0: 45908.7. Samples: 523732900. Policy #0 lag: (min: 0.0, avg: 28.4, max: 80.0) [2024-03-21 01:52:10,522][03784] Avg episode reward: [(0, '0.925')] [2024-03-21 01:52:12,188][04017] Updated weights for policy 0, policy_version 15945 (0.0017) [2024-03-21 01:52:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 46097.4). Total num frames: 522649600. Throughput: 0: 46131.1. Samples: 523871000. Policy #0 lag: (min: 0.0, avg: 28.4, max: 80.0) [2024-03-21 01:52:15,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 01:52:18,265][04017] Updated weights for policy 0, policy_version 15955 (0.0014) [2024-03-21 01:52:20,521][03784] Fps is (10 sec: 52429.1, 60 sec: 51882.7, 300 sec: 46208.4). Total num frames: 522878976. Throughput: 0: 46491.1. Samples: 524158800. Policy #0 lag: (min: 0.0, avg: 32.1, max: 76.0) [2024-03-21 01:52:20,522][03784] Avg episode reward: [(0, '0.747')] [2024-03-21 01:52:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 49152.0, 300 sec: 46652.8). Total num frames: 523042816. Throughput: 0: 46811.2. Samples: 524443600. Policy #0 lag: (min: 0.0, avg: 32.1, max: 76.0) [2024-03-21 01:52:25,522][03784] Avg episode reward: [(0, '0.472')] [2024-03-21 01:52:27,549][03995] Signal inference workers to stop experience collection... (10500 times) [2024-03-21 01:52:27,613][04017] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-03-21 01:52:27,811][03995] Signal inference workers to resume experience collection... 
(10500 times) [2024-03-21 01:52:27,811][04017] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-03-21 01:52:28,117][04017] Updated weights for policy 0, policy_version 15965 (0.0010) [2024-03-21 01:52:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 523272192. Throughput: 0: 46640.1. Samples: 524581000. Policy #0 lag: (min: 0.0, avg: 30.5, max: 77.0) [2024-03-21 01:52:30,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 01:52:32,366][04017] Updated weights for policy 0, policy_version 15975 (0.0014) [2024-03-21 01:52:35,521][03784] Fps is (10 sec: 62259.0, 60 sec: 45329.1, 300 sec: 47319.2). Total num frames: 523665408. Throughput: 0: 47384.4. Samples: 524879900. Policy #0 lag: (min: 0.0, avg: 30.5, max: 77.0) [2024-03-21 01:52:35,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 01:52:38,372][04017] Updated weights for policy 0, policy_version 15985 (0.0014) [2024-03-21 01:52:40,521][03784] Fps is (10 sec: 58982.4, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 523862016. Throughput: 0: 47126.7. Samples: 525147200. Policy #0 lag: (min: 0.0, avg: 30.5, max: 77.0) [2024-03-21 01:52:40,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 01:52:45,521][03784] Fps is (10 sec: 36044.2, 60 sec: 43690.6, 300 sec: 46430.6). Total num frames: 524025856. Throughput: 0: 47542.1. Samples: 525310200. Policy #0 lag: (min: 0.0, avg: 49.9, max: 118.0) [2024-03-21 01:52:45,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 01:52:46,374][04017] Updated weights for policy 0, policy_version 15995 (0.0017) [2024-03-21 01:52:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 46541.7). Total num frames: 524320768. Throughput: 0: 47053.3. Samples: 525584000. 
Policy #0 lag: (min: 0.0, avg: 49.9, max: 118.0) [2024-03-21 01:52:50,522][03784] Avg episode reward: [(0, '1.252')] [2024-03-21 01:52:54,863][04017] Updated weights for policy 0, policy_version 16005 (0.0011) [2024-03-21 01:52:55,521][03784] Fps is (10 sec: 49152.5, 60 sec: 44236.7, 300 sec: 46430.6). Total num frames: 524517376. Throughput: 0: 47073.3. Samples: 525851200. Policy #0 lag: (min: 0.0, avg: 49.9, max: 118.0) [2024-03-21 01:52:55,522][03784] Avg episode reward: [(0, '0.524')] [2024-03-21 01:52:58,533][04017] Updated weights for policy 0, policy_version 16015 (0.0020) [2024-03-21 01:53:00,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 524845056. Throughput: 0: 46293.3. Samples: 525954200. Policy #0 lag: (min: 0.0, avg: 47.2, max: 92.0) [2024-03-21 01:53:00,522][03784] Avg episode reward: [(0, '0.995')] [2024-03-21 01:53:04,894][04017] Updated weights for policy 0, policy_version 16025 (0.0015) [2024-03-21 01:53:05,521][03784] Fps is (10 sec: 62259.2, 60 sec: 50244.2, 300 sec: 46319.5). Total num frames: 525139968. Throughput: 0: 46431.1. Samples: 526248200. Policy #0 lag: (min: 0.0, avg: 47.2, max: 92.0) [2024-03-21 01:53:05,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 01:53:10,521][03784] Fps is (10 sec: 55705.5, 60 sec: 50790.4, 300 sec: 46874.9). Total num frames: 525402112. Throughput: 0: 46593.2. Samples: 526540300. Policy #0 lag: (min: 1.0, avg: 38.0, max: 84.0) [2024-03-21 01:53:10,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 01:53:11,299][04017] Updated weights for policy 0, policy_version 16035 (0.0014) [2024-03-21 01:53:14,618][03995] Signal inference workers to stop experience collection... (10550 times) [2024-03-21 01:53:14,685][03995] Signal inference workers to resume experience collection... 
(10550 times) [2024-03-21 01:53:14,820][04017] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-03-21 01:53:14,912][04017] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-03-21 01:53:15,521][03784] Fps is (10 sec: 45875.3, 60 sec: 49152.0, 300 sec: 46874.9). Total num frames: 525598720. Throughput: 0: 46886.6. Samples: 526690900. Policy #0 lag: (min: 1.0, avg: 38.0, max: 84.0) [2024-03-21 01:53:15,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 01:53:17,080][04017] Updated weights for policy 0, policy_version 16045 (0.0012) [2024-03-21 01:53:20,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49698.1, 300 sec: 46652.7). Total num frames: 525860864. Throughput: 0: 45913.3. Samples: 526946000. Policy #0 lag: (min: 1.0, avg: 38.0, max: 84.0) [2024-03-21 01:53:20,522][03784] Avg episode reward: [(0, '0.427')] [2024-03-21 01:53:25,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.8, 300 sec: 46208.4). Total num frames: 525959168. Throughput: 0: 46320.0. Samples: 527231600. Policy #0 lag: (min: 0.0, avg: 37.1, max: 78.0) [2024-03-21 01:53:25,522][03784] Avg episode reward: [(0, '0.535')] [2024-03-21 01:53:30,521][03784] Fps is (10 sec: 19660.7, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 526057472. Throughput: 0: 45886.8. Samples: 527375100. Policy #0 lag: (min: 0.0, avg: 37.1, max: 78.0) [2024-03-21 01:53:30,522][03784] Avg episode reward: [(0, '0.975')] [2024-03-21 01:53:33,485][04017] Updated weights for policy 0, policy_version 16055 (0.0010) [2024-03-21 01:53:35,521][03784] Fps is (10 sec: 22937.4, 60 sec: 42052.2, 300 sec: 45764.1). Total num frames: 526188544. Throughput: 0: 46795.5. Samples: 527689800. 
Policy #0 lag: (min: 0.0, avg: 35.8, max: 88.0) [2024-03-21 01:53:35,522][03784] Avg episode reward: [(0, '1.318')] [2024-03-21 01:53:39,312][04017] Updated weights for policy 0, policy_version 16065 (0.0016) [2024-03-21 01:53:40,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 526548992. Throughput: 0: 46728.9. Samples: 527954000. Policy #0 lag: (min: 0.0, avg: 35.8, max: 88.0) [2024-03-21 01:53:40,522][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 01:53:42,341][04017] Updated weights for policy 0, policy_version 16075 (0.0015) [2024-03-21 01:53:45,521][03784] Fps is (10 sec: 58982.9, 60 sec: 45875.3, 300 sec: 46319.5). Total num frames: 526778368. Throughput: 0: 47266.7. Samples: 528081200. Policy #0 lag: (min: 0.0, avg: 35.8, max: 88.0) [2024-03-21 01:53:45,522][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 01:53:49,206][04017] Updated weights for policy 0, policy_version 16085 (0.0016) [2024-03-21 01:53:50,521][03784] Fps is (10 sec: 62258.9, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 527171584. Throughput: 0: 46244.4. Samples: 528329200. Policy #0 lag: (min: 2.0, avg: 43.8, max: 105.0) [2024-03-21 01:53:50,522][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 01:53:55,521][03784] Fps is (10 sec: 55705.4, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 527335424. Throughput: 0: 45526.7. Samples: 528589000. Policy #0 lag: (min: 2.0, avg: 43.8, max: 105.0) [2024-03-21 01:53:55,522][03784] Avg episode reward: [(0, '0.885')] [2024-03-21 01:53:55,968][04017] Updated weights for policy 0, policy_version 16095 (0.0017) [2024-03-21 01:54:00,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 527630336. Throughput: 0: 44973.4. Samples: 528714700. 
Policy #0 lag: (min: 2.0, avg: 43.9, max: 78.0) [2024-03-21 01:54:00,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 01:54:00,537][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016103_527663104.pth... [2024-03-21 01:54:00,652][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015756_516292608.pth [2024-03-21 01:54:01,021][03995] Signal inference workers to stop experience collection... (10600 times) [2024-03-21 01:54:01,087][03995] Signal inference workers to resume experience collection... (10600 times) [2024-03-21 01:54:01,122][04017] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-03-21 01:54:01,174][04017] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-03-21 01:54:01,459][04017] Updated weights for policy 0, policy_version 16105 (0.0012) [2024-03-21 01:54:05,521][03784] Fps is (10 sec: 52429.3, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 527859712. Throughput: 0: 45051.2. Samples: 528973300. Policy #0 lag: (min: 2.0, avg: 43.9, max: 78.0) [2024-03-21 01:54:05,522][03784] Avg episode reward: [(0, '1.603')] [2024-03-21 01:54:10,521][03784] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 528023552. Throughput: 0: 44657.7. Samples: 529241200. Policy #0 lag: (min: 2.0, avg: 43.9, max: 78.0) [2024-03-21 01:54:10,522][03784] Avg episode reward: [(0, '1.201')] [2024-03-21 01:54:10,775][04017] Updated weights for policy 0, policy_version 16115 (0.0016) [2024-03-21 01:54:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 528220160. Throughput: 0: 44204.5. Samples: 529364300. 
Policy #0 lag: (min: 1.0, avg: 45.1, max: 111.0) [2024-03-21 01:54:15,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 01:54:17,677][04017] Updated weights for policy 0, policy_version 16125 (0.0011) [2024-03-21 01:54:20,521][03784] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 528482304. Throughput: 0: 43586.8. Samples: 529651200. Policy #0 lag: (min: 1.0, avg: 45.1, max: 111.0) [2024-03-21 01:54:20,522][03784] Avg episode reward: [(0, '0.843')] [2024-03-21 01:54:23,270][04017] Updated weights for policy 0, policy_version 16135 (0.0012) [2024-03-21 01:54:25,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 528777216. Throughput: 0: 43835.5. Samples: 529926600. Policy #0 lag: (min: 0.0, avg: 46.9, max: 116.0) [2024-03-21 01:54:25,522][03784] Avg episode reward: [(0, '1.441')] [2024-03-21 01:54:30,521][03784] Fps is (10 sec: 36044.0, 60 sec: 46421.2, 300 sec: 45875.2). Total num frames: 528842752. Throughput: 0: 44222.0. Samples: 530071200. Policy #0 lag: (min: 0.0, avg: 46.9, max: 116.0) [2024-03-21 01:54:30,522][03784] Avg episode reward: [(0, '1.267')] [2024-03-21 01:54:34,571][04017] Updated weights for policy 0, policy_version 16145 (0.0015) [2024-03-21 01:54:35,521][03784] Fps is (10 sec: 29491.3, 60 sec: 48059.8, 300 sec: 45986.3). Total num frames: 529072128. Throughput: 0: 45291.2. Samples: 530367300. Policy #0 lag: (min: 2.0, avg: 39.5, max: 88.0) [2024-03-21 01:54:35,522][03784] Avg episode reward: [(0, '0.537')] [2024-03-21 01:54:40,521][03784] Fps is (10 sec: 36045.4, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 529203200. Throughput: 0: 45380.0. Samples: 530631100. 
Policy #0 lag: (min: 2.0, avg: 39.5, max: 88.0) [2024-03-21 01:54:40,522][03784] Avg episode reward: [(0, '1.176')] [2024-03-21 01:54:44,841][04017] Updated weights for policy 0, policy_version 16155 (0.0011) [2024-03-21 01:54:45,521][03784] Fps is (10 sec: 36045.1, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 529432576. Throughput: 0: 45684.5. Samples: 530770500. Policy #0 lag: (min: 2.0, avg: 39.5, max: 88.0) [2024-03-21 01:54:45,522][03784] Avg episode reward: [(0, '1.445')] [2024-03-21 01:54:49,526][04017] Updated weights for policy 0, policy_version 16165 (0.0016) [2024-03-21 01:54:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 42598.4, 300 sec: 45986.3). Total num frames: 529727488. Throughput: 0: 45904.3. Samples: 531039000. Policy #0 lag: (min: 0.0, avg: 29.5, max: 63.0) [2024-03-21 01:54:50,522][03784] Avg episode reward: [(0, '1.502')] [2024-03-21 01:54:55,446][04017] Updated weights for policy 0, policy_version 16175 (0.0011) [2024-03-21 01:54:55,521][03784] Fps is (10 sec: 58982.0, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 530022400. Throughput: 0: 46046.7. Samples: 531313300. Policy #0 lag: (min: 0.0, avg: 29.5, max: 63.0) [2024-03-21 01:54:55,522][03784] Avg episode reward: [(0, '0.911')] [2024-03-21 01:55:00,521][03784] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 46652.7). Total num frames: 530251776. Throughput: 0: 46475.5. Samples: 531455700. Policy #0 lag: (min: 0.0, avg: 31.7, max: 66.0) [2024-03-21 01:55:00,522][03784] Avg episode reward: [(0, '0.570')] [2024-03-21 01:55:01,428][03995] Signal inference workers to stop experience collection... (10650 times) [2024-03-21 01:55:01,535][04017] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-03-21 01:55:01,622][03995] Signal inference workers to resume experience collection... 
(10650 times) [2024-03-21 01:55:01,622][04017] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-03-21 01:55:01,624][04017] Updated weights for policy 0, policy_version 16185 (0.0014) [2024-03-21 01:55:05,521][03784] Fps is (10 sec: 55706.0, 60 sec: 45329.1, 300 sec: 46319.6). Total num frames: 530579456. Throughput: 0: 45982.3. Samples: 531720400. Policy #0 lag: (min: 0.0, avg: 31.7, max: 66.0) [2024-03-21 01:55:05,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 01:55:06,997][04017] Updated weights for policy 0, policy_version 16195 (0.0016) [2024-03-21 01:55:10,521][03784] Fps is (10 sec: 58982.5, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 530841600. Throughput: 0: 46260.0. Samples: 532008300. Policy #0 lag: (min: 0.0, avg: 31.7, max: 66.0) [2024-03-21 01:55:10,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 01:55:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 530939904. Throughput: 0: 46391.3. Samples: 532158800. Policy #0 lag: (min: 0.0, avg: 32.5, max: 61.0) [2024-03-21 01:55:15,522][03784] Avg episode reward: [(0, '0.494')] [2024-03-21 01:55:17,545][04017] Updated weights for policy 0, policy_version 16205 (0.0011) [2024-03-21 01:55:20,521][03784] Fps is (10 sec: 32767.7, 60 sec: 44782.8, 300 sec: 46319.5). Total num frames: 531169280. Throughput: 0: 45808.8. Samples: 532428700. Policy #0 lag: (min: 0.0, avg: 32.5, max: 61.0) [2024-03-21 01:55:20,522][03784] Avg episode reward: [(0, '0.847')] [2024-03-21 01:55:21,993][04017] Updated weights for policy 0, policy_version 16215 (0.0021) [2024-03-21 01:55:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 531398656. Throughput: 0: 45168.9. Samples: 532663700. Policy #0 lag: (min: 0.0, avg: 32.5, max: 61.0) [2024-03-21 01:55:25,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 01:55:30,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.3, 300 sec: 46208.4). 
Total num frames: 531595264. Throughput: 0: 45493.2. Samples: 532817700. Policy #0 lag: (min: 0.0, avg: 41.8, max: 91.0) [2024-03-21 01:55:30,522][03784] Avg episode reward: [(0, '1.026')] [2024-03-21 01:55:31,183][04017] Updated weights for policy 0, policy_version 16225 (0.0010) [2024-03-21 01:55:35,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48059.8, 300 sec: 46986.0). Total num frames: 531955712. Throughput: 0: 45473.4. Samples: 533085300. Policy #0 lag: (min: 0.0, avg: 41.8, max: 91.0) [2024-03-21 01:55:35,522][03784] Avg episode reward: [(0, '0.470')] [2024-03-21 01:55:35,558][04017] Updated weights for policy 0, policy_version 16235 (0.0011) [2024-03-21 01:55:40,521][03784] Fps is (10 sec: 62259.7, 60 sec: 50244.3, 300 sec: 46430.6). Total num frames: 532217856. Throughput: 0: 45637.8. Samples: 533367000. Policy #0 lag: (min: 1.0, avg: 37.6, max: 79.0) [2024-03-21 01:55:40,522][03784] Avg episode reward: [(0, '0.719')] [2024-03-21 01:55:44,432][04017] Updated weights for policy 0, policy_version 16245 (0.0011) [2024-03-21 01:55:45,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 532316160. Throughput: 0: 45966.7. Samples: 533524200. Policy #0 lag: (min: 1.0, avg: 37.6, max: 79.0) [2024-03-21 01:55:45,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 01:55:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 532578304. Throughput: 0: 46731.1. Samples: 533823300. Policy #0 lag: (min: 2.0, avg: 28.4, max: 56.0) [2024-03-21 01:55:50,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 01:55:51,231][04017] Updated weights for policy 0, policy_version 16255 (0.0015) [2024-03-21 01:55:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 532774912. Throughput: 0: 46733.4. Samples: 534111300. 
Policy #0 lag: (min: 2.0, avg: 28.4, max: 56.0) [2024-03-21 01:55:55,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 01:56:00,002][03995] Signal inference workers to stop experience collection... (10700 times) [2024-03-21 01:56:00,002][03995] Signal inference workers to resume experience collection... (10700 times) [2024-03-21 01:56:00,075][04017] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-03-21 01:56:00,075][04017] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-03-21 01:56:00,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 532938752. Throughput: 0: 46908.9. Samples: 534269700. Policy #0 lag: (min: 2.0, avg: 28.4, max: 56.0) [2024-03-21 01:56:00,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 01:56:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016264_532938752.pth... [2024-03-21 01:56:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000015932_522059776.pth [2024-03-21 01:56:01,275][04017] Updated weights for policy 0, policy_version 16265 (0.0011) [2024-03-21 01:56:05,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 46319.5). Total num frames: 533135360. Throughput: 0: 47455.7. Samples: 534564200. Policy #0 lag: (min: 0.0, avg: 26.1, max: 62.0) [2024-03-21 01:56:05,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 01:56:08,882][04017] Updated weights for policy 0, policy_version 16275 (0.0023) [2024-03-21 01:56:10,521][03784] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 46541.7). Total num frames: 533397504. Throughput: 0: 48377.8. Samples: 534840700. 
Policy #0 lag: (min: 0.0, avg: 26.1, max: 62.0) [2024-03-21 01:56:10,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 01:56:13,018][04017] Updated weights for policy 0, policy_version 16285 (0.0011) [2024-03-21 01:56:15,521][03784] Fps is (10 sec: 62258.8, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 533757952. Throughput: 0: 47808.9. Samples: 534969100. Policy #0 lag: (min: 1.0, avg: 31.1, max: 63.0) [2024-03-21 01:56:15,522][03784] Avg episode reward: [(0, '0.509')] [2024-03-21 01:56:18,711][04017] Updated weights for policy 0, policy_version 16295 (0.0015) [2024-03-21 01:56:20,521][03784] Fps is (10 sec: 62258.9, 60 sec: 47513.7, 300 sec: 47208.1). Total num frames: 534020096. Throughput: 0: 48326.7. Samples: 535260000. Policy #0 lag: (min: 1.0, avg: 31.1, max: 63.0) [2024-03-21 01:56:20,522][03784] Avg episode reward: [(0, '1.067')] [2024-03-21 01:56:24,134][04017] Updated weights for policy 0, policy_version 16305 (0.0020) [2024-03-21 01:56:25,521][03784] Fps is (10 sec: 58982.4, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 534347776. Throughput: 0: 48502.2. Samples: 535549600. Policy #0 lag: (min: 1.0, avg: 31.1, max: 63.0) [2024-03-21 01:56:25,522][03784] Avg episode reward: [(0, '0.629')] [2024-03-21 01:56:30,403][04017] Updated weights for policy 0, policy_version 16315 (0.0016) [2024-03-21 01:56:30,521][03784] Fps is (10 sec: 58981.8, 60 sec: 50244.2, 300 sec: 46319.5). Total num frames: 534609920. Throughput: 0: 48197.6. Samples: 535693100. Policy #0 lag: (min: 0.0, avg: 46.4, max: 89.0) [2024-03-21 01:56:30,531][03784] Avg episode reward: [(0, '0.427')] [2024-03-21 01:56:35,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46430.6). Total num frames: 534806528. Throughput: 0: 47642.2. Samples: 535967200. 
Policy #0 lag: (min: 0.0, avg: 46.4, max: 89.0) [2024-03-21 01:56:35,522][03784] Avg episode reward: [(0, '0.427')] [2024-03-21 01:56:38,515][04017] Updated weights for policy 0, policy_version 16325 (0.0016) [2024-03-21 01:56:40,521][03784] Fps is (10 sec: 36045.3, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 534970368. Throughput: 0: 47417.8. Samples: 536245100. Policy #0 lag: (min: 0.0, avg: 45.2, max: 86.0) [2024-03-21 01:56:40,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 01:56:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 535232512. Throughput: 0: 46831.1. Samples: 536377100. Policy #0 lag: (min: 0.0, avg: 45.2, max: 86.0) [2024-03-21 01:56:45,522][03784] Avg episode reward: [(0, '1.034')] [2024-03-21 01:56:45,630][04017] Updated weights for policy 0, policy_version 16335 (0.0010) [2024-03-21 01:56:47,703][03995] Signal inference workers to stop experience collection... (10750 times) [2024-03-21 01:56:47,779][03995] Signal inference workers to resume experience collection... (10750 times) [2024-03-21 01:56:47,805][04017] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-03-21 01:56:47,857][04017] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-03-21 01:56:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 535494656. Throughput: 0: 46582.2. Samples: 536660400. Policy #0 lag: (min: 0.0, avg: 45.2, max: 86.0) [2024-03-21 01:56:50,522][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 01:56:52,654][04017] Updated weights for policy 0, policy_version 16345 (0.0012) [2024-03-21 01:56:55,521][03784] Fps is (10 sec: 39321.5, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 535625728. Throughput: 0: 47828.8. Samples: 536993000. 
Policy #0 lag: (min: 0.0, avg: 39.7, max: 81.0) [2024-03-21 01:56:55,522][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 01:57:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 49152.0, 300 sec: 46652.7). Total num frames: 535887872. Throughput: 0: 48328.9. Samples: 537143900. Policy #0 lag: (min: 0.0, avg: 39.7, max: 81.0) [2024-03-21 01:57:00,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 01:57:00,741][04017] Updated weights for policy 0, policy_version 16355 (0.0019) [2024-03-21 01:57:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 49151.9, 300 sec: 46541.7). Total num frames: 536084480. Throughput: 0: 47846.6. Samples: 537413100. Policy #0 lag: (min: 0.0, avg: 38.1, max: 100.0) [2024-03-21 01:57:05,522][03784] Avg episode reward: [(0, '1.143')] [2024-03-21 01:57:07,300][04017] Updated weights for policy 0, policy_version 16365 (0.0010) [2024-03-21 01:57:10,521][03784] Fps is (10 sec: 42598.8, 60 sec: 48605.9, 300 sec: 46319.5). Total num frames: 536313856. Throughput: 0: 47431.2. Samples: 537684000. Policy #0 lag: (min: 0.0, avg: 38.1, max: 100.0) [2024-03-21 01:57:10,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 01:57:15,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 536510464. Throughput: 0: 47453.5. Samples: 537828500. Policy #0 lag: (min: 0.0, avg: 38.1, max: 100.0) [2024-03-21 01:57:15,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-21 01:57:17,616][04017] Updated weights for policy 0, policy_version 16375 (0.0011) [2024-03-21 01:57:20,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44236.8, 300 sec: 46208.4). Total num frames: 536674304. Throughput: 0: 47424.4. Samples: 538101300. Policy #0 lag: (min: 0.0, avg: 39.3, max: 92.0) [2024-03-21 01:57:20,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 01:57:25,397][04017] Updated weights for policy 0, policy_version 16385 (0.0015) [2024-03-21 01:57:25,521][03784] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 46208.4). 
Total num frames: 536903680. Throughput: 0: 47908.9. Samples: 538401000. Policy #0 lag: (min: 0.0, avg: 39.3, max: 92.0) [2024-03-21 01:57:25,522][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 01:57:30,482][04017] Updated weights for policy 0, policy_version 16395 (0.0018) [2024-03-21 01:57:30,521][03784] Fps is (10 sec: 55705.4, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 537231360. Throughput: 0: 47851.0. Samples: 538530400. Policy #0 lag: (min: 0.0, avg: 46.9, max: 118.0) [2024-03-21 01:57:30,522][03784] Avg episode reward: [(0, '1.184')] [2024-03-21 01:57:34,692][04017] Updated weights for policy 0, policy_version 16405 (0.0015) [2024-03-21 01:57:35,521][03784] Fps is (10 sec: 68812.8, 60 sec: 46421.3, 300 sec: 46541.7). Total num frames: 537591808. Throughput: 0: 47384.5. Samples: 538792700. Policy #0 lag: (min: 0.0, avg: 46.9, max: 118.0) [2024-03-21 01:57:35,522][03784] Avg episode reward: [(0, '0.664')] [2024-03-21 01:57:37,859][03995] Signal inference workers to stop experience collection... (10800 times) [2024-03-21 01:57:37,973][04017] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-03-21 01:57:38,054][03995] Signal inference workers to resume experience collection... (10800 times) [2024-03-21 01:57:38,059][04017] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-03-21 01:57:39,206][04017] Updated weights for policy 0, policy_version 16415 (0.0025) [2024-03-21 01:57:40,521][03784] Fps is (10 sec: 78643.9, 60 sec: 50790.4, 300 sec: 47430.3). Total num frames: 538017792. Throughput: 0: 46140.1. Samples: 539069300. Policy #0 lag: (min: 0.0, avg: 46.9, max: 118.0) [2024-03-21 01:57:40,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 01:57:45,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 538148864. Throughput: 0: 45995.6. Samples: 539213700. 
Policy #0 lag: (min: 0.0, avg: 49.0, max: 87.0) [2024-03-21 01:57:45,522][03784] Avg episode reward: [(0, '0.442')] [2024-03-21 01:57:47,014][04017] Updated weights for policy 0, policy_version 16425 (0.0010) [2024-03-21 01:57:50,521][03784] Fps is (10 sec: 42598.4, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 538443776. Throughput: 0: 45949.0. Samples: 539480800. Policy #0 lag: (min: 0.0, avg: 49.0, max: 87.0) [2024-03-21 01:57:50,522][03784] Avg episode reward: [(0, '0.562')] [2024-03-21 01:57:53,904][04017] Updated weights for policy 0, policy_version 16435 (0.0011) [2024-03-21 01:57:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 46652.7). Total num frames: 538607616. Throughput: 0: 46373.2. Samples: 539770800. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 01:57:55,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-21 01:58:00,521][03784] Fps is (10 sec: 26214.0, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 538705920. Throughput: 0: 46453.2. Samples: 539918900. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 01:58:00,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-21 01:58:00,721][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016441_538738688.pth... [2024-03-21 01:58:00,838][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016103_527663104.pth [2024-03-21 01:58:03,382][04017] Updated weights for policy 0, policy_version 16445 (0.0010) [2024-03-21 01:58:05,521][03784] Fps is (10 sec: 29491.2, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 538902528. Throughput: 0: 46435.5. Samples: 540190900. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 01:58:05,522][03784] Avg episode reward: [(0, '0.915')] [2024-03-21 01:58:10,521][03784] Fps is (10 sec: 42599.0, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 539131904. Throughput: 0: 46195.5. Samples: 540479800. 
Policy #0 lag: (min: 0.0, avg: 27.5, max: 81.0) [2024-03-21 01:58:10,522][03784] Avg episode reward: [(0, '0.762')] [2024-03-21 01:58:12,767][04017] Updated weights for policy 0, policy_version 16455 (0.0010) [2024-03-21 01:58:15,521][03784] Fps is (10 sec: 36045.2, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 539262976. Throughput: 0: 46837.9. Samples: 540638100. Policy #0 lag: (min: 0.0, avg: 27.5, max: 81.0) [2024-03-21 01:58:15,522][03784] Avg episode reward: [(0, '0.586')] [2024-03-21 01:58:19,132][04017] Updated weights for policy 0, policy_version 16465 (0.0010) [2024-03-21 01:58:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 539590656. Throughput: 0: 47222.2. Samples: 540917700. Policy #0 lag: (min: 1.0, avg: 43.6, max: 103.0) [2024-03-21 01:58:20,522][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 01:58:24,298][04017] Updated weights for policy 0, policy_version 16475 (0.0016) [2024-03-21 01:58:25,521][03784] Fps is (10 sec: 62259.1, 60 sec: 49698.2, 300 sec: 46874.9). Total num frames: 539885568. Throughput: 0: 47117.8. Samples: 541189600. Policy #0 lag: (min: 1.0, avg: 43.6, max: 103.0) [2024-03-21 01:58:25,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 01:58:28,970][04017] Updated weights for policy 0, policy_version 16485 (0.0013) [2024-03-21 01:58:30,521][03784] Fps is (10 sec: 58981.8, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 540180480. Throughput: 0: 46908.8. Samples: 541324600. Policy #0 lag: (min: 1.0, avg: 43.6, max: 103.0) [2024-03-21 01:58:30,522][03784] Avg episode reward: [(0, '1.119')] [2024-03-21 01:58:32,884][03995] Signal inference workers to stop experience collection... (10850 times) [2024-03-21 01:58:33,031][04017] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-03-21 01:58:33,080][03995] Signal inference workers to resume experience collection... 
(10850 times) [2024-03-21 01:58:33,094][04017] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-03-21 01:58:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 540409856. Throughput: 0: 46537.8. Samples: 541575000. Policy #0 lag: (min: 0.0, avg: 33.9, max: 64.0) [2024-03-21 01:58:35,522][03784] Avg episode reward: [(0, '0.928')] [2024-03-21 01:58:36,612][04017] Updated weights for policy 0, policy_version 16495 (0.0020) [2024-03-21 01:58:40,521][03784] Fps is (10 sec: 49152.6, 60 sec: 44236.8, 300 sec: 47097.1). Total num frames: 540672000. Throughput: 0: 46137.8. Samples: 541847000. Policy #0 lag: (min: 0.0, avg: 33.9, max: 64.0) [2024-03-21 01:58:40,522][03784] Avg episode reward: [(0, '1.370')] [2024-03-21 01:58:42,274][04017] Updated weights for policy 0, policy_version 16505 (0.0017) [2024-03-21 01:58:45,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45875.3, 300 sec: 46541.7). Total num frames: 540901376. Throughput: 0: 45849.1. Samples: 541982100. Policy #0 lag: (min: 0.0, avg: 42.8, max: 99.0) [2024-03-21 01:58:45,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 01:58:50,521][03784] Fps is (10 sec: 32767.9, 60 sec: 42598.4, 300 sec: 46319.5). Total num frames: 540999680. Throughput: 0: 46691.2. Samples: 542292000. Policy #0 lag: (min: 0.0, avg: 42.8, max: 99.0) [2024-03-21 01:58:50,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 01:58:53,164][04017] Updated weights for policy 0, policy_version 16515 (0.0015) [2024-03-21 01:58:55,521][03784] Fps is (10 sec: 32767.9, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 541229056. Throughput: 0: 46648.9. Samples: 542579000. Policy #0 lag: (min: 0.0, avg: 42.8, max: 99.0) [2024-03-21 01:58:55,522][03784] Avg episode reward: [(0, '0.852')] [2024-03-21 01:58:59,670][04017] Updated weights for policy 0, policy_version 16525 (0.0019) [2024-03-21 01:59:00,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48059.8, 300 sec: 46541.7). 
Total num frames: 541589504. Throughput: 0: 46111.0. Samples: 542713100. Policy #0 lag: (min: 1.0, avg: 44.3, max: 109.0) [2024-03-21 01:59:00,522][03784] Avg episode reward: [(0, '0.768')] [2024-03-21 01:59:05,476][04017] Updated weights for policy 0, policy_version 16535 (0.0013) [2024-03-21 01:59:05,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 541818880. Throughput: 0: 45866.7. Samples: 542981700. Policy #0 lag: (min: 1.0, avg: 44.3, max: 109.0) [2024-03-21 01:59:05,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 01:59:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 542015488. Throughput: 0: 46146.7. Samples: 543266200. Policy #0 lag: (min: 0.0, avg: 31.5, max: 113.0) [2024-03-21 01:59:10,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 01:59:13,172][04017] Updated weights for policy 0, policy_version 16545 (0.0025) [2024-03-21 01:59:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 49698.0, 300 sec: 46652.7). Total num frames: 542244864. Throughput: 0: 46326.7. Samples: 543409300. Policy #0 lag: (min: 0.0, avg: 31.5, max: 113.0) [2024-03-21 01:59:15,522][03784] Avg episode reward: [(0, '1.467')] [2024-03-21 01:59:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 542408704. Throughput: 0: 47066.7. Samples: 543693000. Policy #0 lag: (min: 0.0, avg: 31.5, max: 113.0) [2024-03-21 01:59:20,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 01:59:21,640][04017] Updated weights for policy 0, policy_version 16555 (0.0010) [2024-03-21 01:59:25,521][03784] Fps is (10 sec: 39322.0, 60 sec: 45875.2, 300 sec: 46763.9). Total num frames: 542638080. Throughput: 0: 47502.3. Samples: 543984600. 
Policy #0 lag: (min: 1.0, avg: 35.7, max: 108.0) [2024-03-21 01:59:25,522][03784] Avg episode reward: [(0, '1.102')] [2024-03-21 01:59:29,390][04017] Updated weights for policy 0, policy_version 16565 (0.0015) [2024-03-21 01:59:30,521][03784] Fps is (10 sec: 39321.0, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 542801920. Throughput: 0: 47608.7. Samples: 544124500. Policy #0 lag: (min: 1.0, avg: 35.7, max: 108.0) [2024-03-21 01:59:30,522][03784] Avg episode reward: [(0, '1.202')] [2024-03-21 01:59:35,521][03784] Fps is (10 sec: 42597.7, 60 sec: 44236.7, 300 sec: 46986.0). Total num frames: 543064064. Throughput: 0: 46535.4. Samples: 544386100. Policy #0 lag: (min: 1.0, avg: 43.9, max: 109.0) [2024-03-21 01:59:35,522][03784] Avg episode reward: [(0, '0.878')] [2024-03-21 01:59:36,242][04017] Updated weights for policy 0, policy_version 16575 (0.0015) [2024-03-21 01:59:37,760][03995] Signal inference workers to stop experience collection... (10900 times) [2024-03-21 01:59:37,827][03995] Signal inference workers to resume experience collection... (10900 times) [2024-03-21 01:59:37,841][04017] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-03-21 01:59:37,881][04017] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-03-21 01:59:40,521][03784] Fps is (10 sec: 52429.3, 60 sec: 44236.8, 300 sec: 47097.0). Total num frames: 543326208. Throughput: 0: 46522.1. Samples: 544672500. Policy #0 lag: (min: 1.0, avg: 43.9, max: 109.0) [2024-03-21 01:59:40,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 01:59:42,659][04017] Updated weights for policy 0, policy_version 16585 (0.0012) [2024-03-21 01:59:45,521][03784] Fps is (10 sec: 45875.6, 60 sec: 43690.6, 300 sec: 46763.8). Total num frames: 543522816. Throughput: 0: 46380.0. Samples: 544800200. 
Policy #0 lag: (min: 1.0, avg: 43.9, max: 109.0) [2024-03-21 01:59:45,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 01:59:49,196][04017] Updated weights for policy 0, policy_version 16595 (0.0012) [2024-03-21 01:59:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 543817728. Throughput: 0: 46182.2. Samples: 545059900. Policy #0 lag: (min: 2.0, avg: 38.7, max: 86.0) [2024-03-21 01:59:50,522][03784] Avg episode reward: [(0, '0.795')] [2024-03-21 01:59:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 544047104. Throughput: 0: 45744.4. Samples: 545324700. Policy #0 lag: (min: 2.0, avg: 38.7, max: 86.0) [2024-03-21 01:59:55,522][03784] Avg episode reward: [(0, '1.059')] [2024-03-21 01:59:56,238][04017] Updated weights for policy 0, policy_version 16605 (0.0011) [2024-03-21 02:00:00,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 544276480. Throughput: 0: 45646.6. Samples: 545463400. Policy #0 lag: (min: 0.0, avg: 37.5, max: 107.0) [2024-03-21 02:00:00,522][03784] Avg episode reward: [(0, '0.899')] [2024-03-21 02:00:00,871][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016612_544342016.pth... [2024-03-21 02:00:00,998][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016264_532938752.pth [2024-03-21 02:00:01,861][04017] Updated weights for policy 0, policy_version 16615 (0.0014) [2024-03-21 02:00:04,912][04017] Updated weights for policy 0, policy_version 16625 (0.0017) [2024-03-21 02:00:05,521][03784] Fps is (10 sec: 72089.7, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 544768000. Throughput: 0: 44644.4. Samples: 545702000. Policy #0 lag: (min: 0.0, avg: 37.5, max: 107.0) [2024-03-21 02:00:05,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 02:00:10,521][03784] Fps is (10 sec: 62259.4, 60 sec: 48059.7, 300 sec: 47319.2). 
Total num frames: 544899072. Throughput: 0: 44997.7. Samples: 546009500. Policy #0 lag: (min: 0.0, avg: 38.6, max: 82.0) [2024-03-21 02:00:10,522][03784] Avg episode reward: [(0, '0.834')] [2024-03-21 02:00:14,846][04017] Updated weights for policy 0, policy_version 16635 (0.0010) [2024-03-21 02:00:15,521][03784] Fps is (10 sec: 39321.4, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 545161216. Throughput: 0: 45404.5. Samples: 546167700. Policy #0 lag: (min: 0.0, avg: 38.6, max: 82.0) [2024-03-21 02:00:15,522][03784] Avg episode reward: [(0, '0.834')] [2024-03-21 02:00:20,521][03784] Fps is (10 sec: 39321.4, 60 sec: 48059.6, 300 sec: 47097.0). Total num frames: 545292288. Throughput: 0: 46115.6. Samples: 546461300. Policy #0 lag: (min: 0.0, avg: 38.6, max: 82.0) [2024-03-21 02:00:20,522][03784] Avg episode reward: [(0, '1.352')] [2024-03-21 02:00:24,135][04017] Updated weights for policy 0, policy_version 16645 (0.0011) [2024-03-21 02:00:25,521][03784] Fps is (10 sec: 26214.7, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 545423360. Throughput: 0: 45715.6. Samples: 546729700. Policy #0 lag: (min: 0.0, avg: 32.2, max: 114.0) [2024-03-21 02:00:25,522][03784] Avg episode reward: [(0, '0.787')] [2024-03-21 02:00:30,427][03995] Signal inference workers to stop experience collection... (10950 times) [2024-03-21 02:00:30,504][04017] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-03-21 02:00:30,521][03784] Fps is (10 sec: 22937.4, 60 sec: 45329.0, 300 sec: 45986.2). Total num frames: 545521664. Throughput: 0: 45868.7. Samples: 546864300. Policy #0 lag: (min: 0.0, avg: 32.2, max: 114.0) [2024-03-21 02:00:30,522][03784] Avg episode reward: [(0, '1.046')] [2024-03-21 02:00:30,701][03995] Signal inference workers to resume experience collection... 
(10950 times) [2024-03-21 02:00:30,701][04017] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-03-21 02:00:35,521][03784] Fps is (10 sec: 29490.8, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 545718272. Throughput: 0: 46326.6. Samples: 547144600. Policy #0 lag: (min: 0.0, avg: 32.2, max: 114.0) [2024-03-21 02:00:35,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 02:00:36,940][04017] Updated weights for policy 0, policy_version 16655 (0.0017) [2024-03-21 02:00:40,521][03784] Fps is (10 sec: 42599.2, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 545947648. Throughput: 0: 46853.3. Samples: 547433100. Policy #0 lag: (min: 0.0, avg: 33.7, max: 87.0) [2024-03-21 02:00:40,522][03784] Avg episode reward: [(0, '1.014')] [2024-03-21 02:00:44,551][04017] Updated weights for policy 0, policy_version 16665 (0.0016) [2024-03-21 02:00:45,521][03784] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 546144256. Throughput: 0: 46849.0. Samples: 547571600. Policy #0 lag: (min: 0.0, avg: 33.7, max: 87.0) [2024-03-21 02:00:45,522][03784] Avg episode reward: [(0, '1.298')] [2024-03-21 02:00:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 45986.3). Total num frames: 546340864. Throughput: 0: 47635.5. Samples: 547845600. Policy #0 lag: (min: 0.0, avg: 33.7, max: 87.0) [2024-03-21 02:00:50,522][03784] Avg episode reward: [(0, '0.791')] [2024-03-21 02:00:50,864][04017] Updated weights for policy 0, policy_version 16675 (0.0018) [2024-03-21 02:00:54,420][04017] Updated weights for policy 0, policy_version 16685 (0.0015) [2024-03-21 02:00:55,521][03784] Fps is (10 sec: 65535.8, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 546799616. Throughput: 0: 46726.7. Samples: 548112200. 
Policy #0 lag: (min: 0.0, avg: 28.3, max: 67.0) [2024-03-21 02:00:55,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 02:00:57,876][04017] Updated weights for policy 0, policy_version 16695 (0.0010) [2024-03-21 02:01:00,521][03784] Fps is (10 sec: 95027.6, 60 sec: 50244.3, 300 sec: 47985.7). Total num frames: 547291136. Throughput: 0: 45922.3. Samples: 548234200. Policy #0 lag: (min: 0.0, avg: 28.3, max: 67.0) [2024-03-21 02:01:00,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 02:01:01,420][04017] Updated weights for policy 0, policy_version 16705 (0.0018) [2024-03-21 02:01:05,521][03784] Fps is (10 sec: 72089.2, 60 sec: 45875.1, 300 sec: 47874.6). Total num frames: 547520512. Throughput: 0: 45346.7. Samples: 548501900. Policy #0 lag: (min: 0.0, avg: 54.1, max: 103.0) [2024-03-21 02:01:05,522][03784] Avg episode reward: [(0, '0.957')] [2024-03-21 02:01:10,521][03784] Fps is (10 sec: 39321.0, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 547684352. Throughput: 0: 45877.6. Samples: 548794200. Policy #0 lag: (min: 0.0, avg: 54.1, max: 103.0) [2024-03-21 02:01:10,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 02:01:10,914][04017] Updated weights for policy 0, policy_version 16715 (0.0015) [2024-03-21 02:01:15,521][03784] Fps is (10 sec: 29491.9, 60 sec: 44236.9, 300 sec: 46763.8). Total num frames: 547815424. Throughput: 0: 46073.7. Samples: 548937600. Policy #0 lag: (min: 0.0, avg: 54.1, max: 103.0) [2024-03-21 02:01:15,522][03784] Avg episode reward: [(0, '1.158')] [2024-03-21 02:01:20,179][03995] Signal inference workers to stop experience collection... (11000 times) [2024-03-21 02:01:20,265][04017] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-03-21 02:01:20,383][03995] Signal inference workers to resume experience collection... 
(11000 times) [2024-03-21 02:01:20,383][04017] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-03-21 02:01:20,521][03784] Fps is (10 sec: 19660.9, 60 sec: 43144.5, 300 sec: 45875.2). Total num frames: 547880960. Throughput: 0: 46213.4. Samples: 549224200. Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 02:01:20,522][03784] Avg episode reward: [(0, '0.767')] [2024-03-21 02:01:24,070][04017] Updated weights for policy 0, policy_version 16725 (0.0010) [2024-03-21 02:01:25,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 548175872. Throughput: 0: 45417.8. Samples: 549476900. Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 02:01:25,522][03784] Avg episode reward: [(0, '0.817')] [2024-03-21 02:01:30,521][03784] Fps is (10 sec: 39322.1, 60 sec: 45875.4, 300 sec: 45653.0). Total num frames: 548274176. Throughput: 0: 46008.9. Samples: 549642000. Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 02:01:30,522][03784] Avg episode reward: [(0, '0.817')] [2024-03-21 02:01:32,067][04017] Updated weights for policy 0, policy_version 16735 (0.0011) [2024-03-21 02:01:35,521][03784] Fps is (10 sec: 45874.7, 60 sec: 48605.9, 300 sec: 46319.5). Total num frames: 548634624. Throughput: 0: 46664.4. Samples: 549945500. Policy #0 lag: (min: 0.0, avg: 31.9, max: 86.0) [2024-03-21 02:01:35,522][03784] Avg episode reward: [(0, '0.755')] [2024-03-21 02:01:36,399][04017] Updated weights for policy 0, policy_version 16745 (0.0015) [2024-03-21 02:01:40,521][03784] Fps is (10 sec: 68812.1, 60 sec: 50244.2, 300 sec: 46541.7). Total num frames: 548962304. Throughput: 0: 47017.7. Samples: 550228000. Policy #0 lag: (min: 0.0, avg: 31.9, max: 86.0) [2024-03-21 02:01:40,522][03784] Avg episode reward: [(0, '0.755')] [2024-03-21 02:01:42,079][04017] Updated weights for policy 0, policy_version 16755 (0.0010) [2024-03-21 02:01:45,521][03784] Fps is (10 sec: 65535.8, 60 sec: 52428.7, 300 sec: 46763.8). 
Total num frames: 549289984. Throughput: 0: 47342.1. Samples: 550364600. Policy #0 lag: (min: 3.0, avg: 50.4, max: 104.0) [2024-03-21 02:01:45,522][03784] Avg episode reward: [(0, '0.451')] [2024-03-21 02:01:46,202][04017] Updated weights for policy 0, policy_version 16765 (0.0011) [2024-03-21 02:01:50,521][03784] Fps is (10 sec: 55706.0, 60 sec: 52975.0, 300 sec: 47097.1). Total num frames: 549519360. Throughput: 0: 47593.4. Samples: 550643600. Policy #0 lag: (min: 3.0, avg: 50.4, max: 104.0) [2024-03-21 02:01:50,522][03784] Avg episode reward: [(0, '0.451')] [2024-03-21 02:01:55,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.4, 300 sec: 46541.7). Total num frames: 549617664. Throughput: 0: 47380.1. Samples: 550926300. Policy #0 lag: (min: 3.0, avg: 50.4, max: 104.0) [2024-03-21 02:01:55,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 02:01:55,981][04017] Updated weights for policy 0, policy_version 16775 (0.0013) [2024-03-21 02:02:00,521][03784] Fps is (10 sec: 36044.4, 60 sec: 43144.5, 300 sec: 46763.8). Total num frames: 549879808. Throughput: 0: 47370.9. Samples: 551069300. Policy #0 lag: (min: 0.0, avg: 46.4, max: 106.0) [2024-03-21 02:02:00,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 02:02:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016781_549879808.pth... [2024-03-21 02:02:00,667][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016441_538738688.pth [2024-03-21 02:02:02,627][04017] Updated weights for policy 0, policy_version 16785 (0.0015) [2024-03-21 02:02:05,521][03784] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 46652.7). Total num frames: 550076416. Throughput: 0: 47751.2. Samples: 551373000. Policy #0 lag: (min: 0.0, avg: 46.4, max: 106.0) [2024-03-21 02:02:05,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 02:02:10,521][03784] Fps is (10 sec: 36045.2, 60 sec: 42598.5, 300 sec: 46541.7). Total num frames: 550240256. 
Throughput: 0: 49015.5. Samples: 551682600. Policy #0 lag: (min: 0.0, avg: 46.4, max: 106.0) [2024-03-21 02:02:10,522][03784] Avg episode reward: [(0, '1.204')] [2024-03-21 02:02:12,107][04017] Updated weights for policy 0, policy_version 16795 (0.0019) [2024-03-21 02:02:13,047][03995] Signal inference workers to stop experience collection... (11050 times) [2024-03-21 02:02:13,111][04017] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-03-21 02:02:13,295][03995] Signal inference workers to resume experience collection... (11050 times) [2024-03-21 02:02:13,295][04017] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-03-21 02:02:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45328.9, 300 sec: 46986.0). Total num frames: 550535168. Throughput: 0: 48315.5. Samples: 551816200. Policy #0 lag: (min: 0.0, avg: 34.2, max: 74.0) [2024-03-21 02:02:15,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 02:02:16,777][04017] Updated weights for policy 0, policy_version 16805 (0.0010) [2024-03-21 02:02:20,521][03784] Fps is (10 sec: 65536.3, 60 sec: 50244.4, 300 sec: 47430.3). Total num frames: 550895616. Throughput: 0: 47931.2. Samples: 552102400. Policy #0 lag: (min: 0.0, avg: 34.2, max: 74.0) [2024-03-21 02:02:20,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 02:02:21,781][04017] Updated weights for policy 0, policy_version 16815 (0.0011) [2024-03-21 02:02:25,521][03784] Fps is (10 sec: 65535.9, 60 sec: 50244.2, 300 sec: 47319.2). Total num frames: 551190528. Throughput: 0: 48064.5. Samples: 552390900. Policy #0 lag: (min: 1.0, avg: 54.7, max: 112.0) [2024-03-21 02:02:25,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 02:02:27,797][04017] Updated weights for policy 0, policy_version 16825 (0.0015) [2024-03-21 02:02:30,521][03784] Fps is (10 sec: 55705.3, 60 sec: 52974.9, 300 sec: 46986.0). Total num frames: 551452672. Throughput: 0: 48026.8. Samples: 552525800. 
Policy #0 lag: (min: 1.0, avg: 54.7, max: 112.0) [2024-03-21 02:02:30,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 02:02:34,735][04017] Updated weights for policy 0, policy_version 16835 (0.0019) [2024-03-21 02:02:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 50244.3, 300 sec: 46208.4). Total num frames: 551649280. Throughput: 0: 48424.3. Samples: 552822700. Policy #0 lag: (min: 1.0, avg: 54.7, max: 112.0) [2024-03-21 02:02:35,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 02:02:40,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48059.8, 300 sec: 46430.6). Total num frames: 551845888. Throughput: 0: 48146.8. Samples: 553092900. Policy #0 lag: (min: 0.0, avg: 45.1, max: 90.0) [2024-03-21 02:02:40,522][03784] Avg episode reward: [(0, '1.073')] [2024-03-21 02:02:41,670][04017] Updated weights for policy 0, policy_version 16845 (0.0010) [2024-03-21 02:02:45,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 552108032. Throughput: 0: 48046.7. Samples: 553231400. Policy #0 lag: (min: 0.0, avg: 45.1, max: 90.0) [2024-03-21 02:02:45,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 02:02:50,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 552173568. Throughput: 0: 47606.7. Samples: 553515300. Policy #0 lag: (min: 0.0, avg: 45.1, max: 90.0) [2024-03-21 02:02:50,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 02:02:55,242][04017] Updated weights for policy 0, policy_version 16855 (0.0011) [2024-03-21 02:02:55,522][03784] Fps is (10 sec: 22936.0, 60 sec: 45328.5, 300 sec: 46208.3). Total num frames: 552337408. Throughput: 0: 47383.6. Samples: 553814900. Policy #0 lag: (min: 0.0, avg: 32.2, max: 70.0) [2024-03-21 02:02:55,523][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 02:02:58,467][04017] Updated weights for policy 0, policy_version 16865 (0.0034) [2024-03-21 02:03:00,164][03995] Signal inference workers to stop experience collection... 
(11100 times) [2024-03-21 02:03:00,271][04017] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-03-21 02:03:00,371][03995] Signal inference workers to resume experience collection... (11100 times) [2024-03-21 02:03:00,372][04017] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-03-21 02:03:00,521][03784] Fps is (10 sec: 65534.6, 60 sec: 49151.9, 300 sec: 47208.1). Total num frames: 552828928. Throughput: 0: 46839.8. Samples: 553924000. Policy #0 lag: (min: 0.0, avg: 32.2, max: 70.0) [2024-03-21 02:03:00,522][03784] Avg episode reward: [(0, '1.107')] [2024-03-21 02:03:01,641][04017] Updated weights for policy 0, policy_version 16875 (0.0012) [2024-03-21 02:03:05,521][03784] Fps is (10 sec: 72094.8, 60 sec: 49698.1, 300 sec: 47208.1). Total num frames: 553058304. Throughput: 0: 46228.8. Samples: 554182700. Policy #0 lag: (min: 0.0, avg: 45.4, max: 85.0) [2024-03-21 02:03:05,522][03784] Avg episode reward: [(0, '1.295')] [2024-03-21 02:03:10,521][03784] Fps is (10 sec: 36045.5, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 553189376. Throughput: 0: 45988.9. Samples: 554460400. Policy #0 lag: (min: 0.0, avg: 45.4, max: 85.0) [2024-03-21 02:03:10,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 02:03:12,252][04017] Updated weights for policy 0, policy_version 16885 (0.0012) [2024-03-21 02:03:15,521][03784] Fps is (10 sec: 36045.2, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 553418752. Throughput: 0: 46217.9. Samples: 554605600. Policy #0 lag: (min: 0.0, avg: 45.4, max: 85.0) [2024-03-21 02:03:15,522][03784] Avg episode reward: [(0, '0.484')] [2024-03-21 02:03:18,562][04017] Updated weights for policy 0, policy_version 16895 (0.0015) [2024-03-21 02:03:20,521][03784] Fps is (10 sec: 52428.2, 60 sec: 46967.3, 300 sec: 46874.9). Total num frames: 553713664. Throughput: 0: 45697.7. Samples: 554879100. 
Policy #0 lag: (min: 0.0, avg: 40.2, max: 89.0) [2024-03-21 02:03:20,522][03784] Avg episode reward: [(0, '0.484')] [2024-03-21 02:03:23,498][04017] Updated weights for policy 0, policy_version 16905 (0.0011) [2024-03-21 02:03:25,521][03784] Fps is (10 sec: 65536.0, 60 sec: 48059.8, 300 sec: 47097.1). Total num frames: 554074112. Throughput: 0: 46055.6. Samples: 555165400. Policy #0 lag: (min: 0.0, avg: 40.2, max: 89.0) [2024-03-21 02:03:25,522][03784] Avg episode reward: [(0, '1.014')] [2024-03-21 02:03:30,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 554172416. Throughput: 0: 46277.7. Samples: 555313900. Policy #0 lag: (min: 0.0, avg: 41.8, max: 75.0) [2024-03-21 02:03:30,522][03784] Avg episode reward: [(0, '1.308')] [2024-03-21 02:03:32,503][04017] Updated weights for policy 0, policy_version 16915 (0.0015) [2024-03-21 02:03:35,521][03784] Fps is (10 sec: 19660.6, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 554270720. Throughput: 0: 46404.4. Samples: 555603500. Policy #0 lag: (min: 0.0, avg: 41.8, max: 75.0) [2024-03-21 02:03:35,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 02:03:39,574][04017] Updated weights for policy 0, policy_version 16925 (0.0012) [2024-03-21 02:03:40,521][03784] Fps is (10 sec: 42599.0, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 554598400. Throughput: 0: 45654.1. Samples: 555869300. Policy #0 lag: (min: 0.0, avg: 41.8, max: 75.0) [2024-03-21 02:03:40,522][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 02:03:45,521][03784] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 46430.6). Total num frames: 554696704. Throughput: 0: 46522.3. Samples: 556017500. Policy #0 lag: (min: 0.0, avg: 38.8, max: 80.0) [2024-03-21 02:03:45,522][03784] Avg episode reward: [(0, '1.200')] [2024-03-21 02:03:48,129][04017] Updated weights for policy 0, policy_version 16935 (0.0015) [2024-03-21 02:03:50,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48059.7, 300 sec: 46874.9). 
Total num frames: 555057152. Throughput: 0: 46293.4. Samples: 556265900. Policy #0 lag: (min: 0.0, avg: 38.8, max: 80.0) [2024-03-21 02:03:50,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 02:03:55,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46968.0, 300 sec: 45986.3). Total num frames: 555155456. Throughput: 0: 46693.3. Samples: 556561600. Policy #0 lag: (min: 0.0, avg: 38.8, max: 80.0) [2024-03-21 02:03:55,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 02:03:56,921][04017] Updated weights for policy 0, policy_version 16945 (0.0011) [2024-03-21 02:03:59,386][03995] Signal inference workers to stop experience collection... (11150 times) [2024-03-21 02:03:59,451][04017] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-03-21 02:03:59,621][03995] Signal inference workers to resume experience collection... (11150 times) [2024-03-21 02:03:59,622][04017] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-03-21 02:04:00,521][03784] Fps is (10 sec: 45875.4, 60 sec: 44783.1, 300 sec: 46430.6). Total num frames: 555515904. Throughput: 0: 46499.9. Samples: 556698100. Policy #0 lag: (min: 0.0, avg: 33.2, max: 111.0) [2024-03-21 02:04:00,522][03784] Avg episode reward: [(0, '1.030')] [2024-03-21 02:04:00,586][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016954_555548672.pth... [2024-03-21 02:04:00,699][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016612_544342016.pth [2024-03-21 02:04:03,109][04017] Updated weights for policy 0, policy_version 16955 (0.0016) [2024-03-21 02:04:05,521][03784] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 46208.4). Total num frames: 555646976. Throughput: 0: 46184.6. Samples: 556957400. 
Policy #0 lag: (min: 0.0, avg: 33.2, max: 111.0) [2024-03-21 02:04:05,522][03784] Avg episode reward: [(0, '0.650')] [2024-03-21 02:04:09,997][04017] Updated weights for policy 0, policy_version 16965 (0.0017) [2024-03-21 02:04:10,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 555909120. Throughput: 0: 45373.3. Samples: 557207200. Policy #0 lag: (min: 3.0, avg: 27.0, max: 107.0) [2024-03-21 02:04:10,522][03784] Avg episode reward: [(0, '1.467')] [2024-03-21 02:04:15,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 556204032. Throughput: 0: 45351.3. Samples: 557354700. Policy #0 lag: (min: 3.0, avg: 27.0, max: 107.0) [2024-03-21 02:04:15,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 02:04:15,927][04017] Updated weights for policy 0, policy_version 16976 (0.0015) [2024-03-21 02:04:20,521][03784] Fps is (10 sec: 62259.1, 60 sec: 46967.6, 300 sec: 47097.0). Total num frames: 556531712. Throughput: 0: 44957.8. Samples: 557626600. Policy #0 lag: (min: 3.0, avg: 27.0, max: 107.0) [2024-03-21 02:04:20,522][03784] Avg episode reward: [(0, '0.993')] [2024-03-21 02:04:21,608][04017] Updated weights for policy 0, policy_version 16986 (0.0011) [2024-03-21 02:04:25,521][03784] Fps is (10 sec: 55705.1, 60 sec: 44782.9, 300 sec: 47319.2). Total num frames: 556761088. Throughput: 0: 44795.5. Samples: 557885100. Policy #0 lag: (min: 0.0, avg: 40.0, max: 88.0) [2024-03-21 02:04:25,522][03784] Avg episode reward: [(0, '0.622')] [2024-03-21 02:04:29,373][04017] Updated weights for policy 0, policy_version 16996 (0.0011) [2024-03-21 02:04:30,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 47208.2). Total num frames: 556990464. Throughput: 0: 44617.9. Samples: 558025300. 
Policy #0 lag: (min: 0.0, avg: 40.0, max: 88.0) [2024-03-21 02:04:30,522][03784] Avg episode reward: [(0, '0.525')] [2024-03-21 02:04:35,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 557154304. Throughput: 0: 45266.7. Samples: 558302900. Policy #0 lag: (min: 0.0, avg: 35.6, max: 68.0) [2024-03-21 02:04:35,522][03784] Avg episode reward: [(0, '1.302')] [2024-03-21 02:04:38,182][04017] Updated weights for policy 0, policy_version 17006 (0.0011) [2024-03-21 02:04:40,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46421.3, 300 sec: 46986.0). Total num frames: 557383680. Throughput: 0: 44895.6. Samples: 558581900. Policy #0 lag: (min: 0.0, avg: 35.6, max: 68.0) [2024-03-21 02:04:40,522][03784] Avg episode reward: [(0, '1.226')] [2024-03-21 02:04:44,568][04017] Updated weights for policy 0, policy_version 17016 (0.0012) [2024-03-21 02:04:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 46874.9). Total num frames: 557645824. Throughput: 0: 45053.3. Samples: 558725500. Policy #0 lag: (min: 0.0, avg: 35.6, max: 68.0) [2024-03-21 02:04:45,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 02:04:50,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 557809664. Throughput: 0: 45595.5. Samples: 559009200. Policy #0 lag: (min: 0.0, avg: 38.3, max: 79.0) [2024-03-21 02:04:50,522][03784] Avg episode reward: [(0, '0.771')] [2024-03-21 02:04:51,194][04017] Updated weights for policy 0, policy_version 17026 (0.0012) [2024-03-21 02:04:52,409][03995] Signal inference workers to stop experience collection... (11200 times) [2024-03-21 02:04:52,410][03995] Signal inference workers to resume experience collection... 
(11200 times) [2024-03-21 02:04:52,478][04017] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-03-21 02:04:52,483][04017] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-03-21 02:04:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 558071808. Throughput: 0: 46044.4. Samples: 559279200. Policy #0 lag: (min: 0.0, avg: 38.3, max: 79.0) [2024-03-21 02:04:55,522][03784] Avg episode reward: [(0, '0.477')] [2024-03-21 02:05:00,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 558202880. Throughput: 0: 46035.5. Samples: 559426300. Policy #0 lag: (min: 0.0, avg: 30.8, max: 76.0) [2024-03-21 02:05:00,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 02:05:01,981][04017] Updated weights for policy 0, policy_version 17036 (0.0016) [2024-03-21 02:05:05,521][03784] Fps is (10 sec: 29491.5, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 558366720. Throughput: 0: 46389.0. Samples: 559714100. Policy #0 lag: (min: 0.0, avg: 30.8, max: 76.0) [2024-03-21 02:05:05,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 02:05:10,314][04017] Updated weights for policy 0, policy_version 17046 (0.0019) [2024-03-21 02:05:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 558563328. Throughput: 0: 47353.4. Samples: 560016000. Policy #0 lag: (min: 0.0, avg: 30.8, max: 76.0) [2024-03-21 02:05:10,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 02:05:14,895][04017] Updated weights for policy 0, policy_version 17056 (0.0011) [2024-03-21 02:05:15,521][03784] Fps is (10 sec: 55704.9, 60 sec: 45329.0, 300 sec: 46208.4). Total num frames: 558923776. Throughput: 0: 47455.5. Samples: 560160800. 
Policy #0 lag: (min: 3.0, avg: 28.2, max: 109.0) [2024-03-21 02:05:15,522][03784] Avg episode reward: [(0, '0.461')] [2024-03-21 02:05:20,521][03784] Fps is (10 sec: 62259.3, 60 sec: 44236.8, 300 sec: 46652.7). Total num frames: 559185920. Throughput: 0: 47195.6. Samples: 560426700. Policy #0 lag: (min: 3.0, avg: 28.2, max: 109.0) [2024-03-21 02:05:20,522][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 02:05:21,252][04017] Updated weights for policy 0, policy_version 17066 (0.0011) [2024-03-21 02:05:25,521][03784] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 46986.0). Total num frames: 559382528. Throughput: 0: 47804.4. Samples: 560733100. Policy #0 lag: (min: 3.0, avg: 28.2, max: 109.0) [2024-03-21 02:05:25,522][03784] Avg episode reward: [(0, '1.252')] [2024-03-21 02:05:28,154][04017] Updated weights for policy 0, policy_version 17076 (0.0014) [2024-03-21 02:05:30,521][03784] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 46986.0). Total num frames: 559579136. Throughput: 0: 47942.1. Samples: 560882900. Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 02:05:30,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 02:05:35,265][04017] Updated weights for policy 0, policy_version 17086 (0.0011) [2024-03-21 02:05:35,521][03784] Fps is (10 sec: 49151.6, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 559874048. Throughput: 0: 48324.4. Samples: 561183800. Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 02:05:35,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 02:05:39,736][04017] Updated weights for policy 0, policy_version 17096 (0.0019) [2024-03-21 02:05:40,521][03784] Fps is (10 sec: 68812.1, 60 sec: 48059.6, 300 sec: 47874.6). Total num frames: 560267264. Throughput: 0: 48219.9. Samples: 561449100. 
Policy #0 lag: (min: 1.0, avg: 37.6, max: 71.0) [2024-03-21 02:05:40,522][03784] Avg episode reward: [(0, '1.193')] [2024-03-21 02:05:44,377][04017] Updated weights for policy 0, policy_version 17106 (0.0012) [2024-03-21 02:05:45,521][03784] Fps is (10 sec: 65536.7, 60 sec: 48059.7, 300 sec: 48096.8). Total num frames: 560529408. Throughput: 0: 47942.3. Samples: 561583700. Policy #0 lag: (min: 1.0, avg: 37.6, max: 71.0) [2024-03-21 02:05:45,522][03784] Avg episode reward: [(0, '1.311')] [2024-03-21 02:05:48,245][03995] Signal inference workers to stop experience collection... (11250 times) [2024-03-21 02:05:48,246][03995] Signal inference workers to resume experience collection... (11250 times) [2024-03-21 02:05:48,292][04017] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-03-21 02:05:48,292][04017] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-03-21 02:05:50,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 560726016. Throughput: 0: 47946.5. Samples: 561871700. Policy #0 lag: (min: 1.0, avg: 37.6, max: 71.0) [2024-03-21 02:05:50,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 02:05:53,297][04017] Updated weights for policy 0, policy_version 17116 (0.0015) [2024-03-21 02:05:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 560988160. Throughput: 0: 47453.4. Samples: 562151400. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 02:05:55,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 02:06:00,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48059.7, 300 sec: 45986.3). Total num frames: 561086464. Throughput: 0: 47764.4. Samples: 562310200. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 02:06:00,522][03784] Avg episode reward: [(0, '1.179')] [2024-03-21 02:06:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017123_561086464.pth... 
[2024-03-21 02:06:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016781_549879808.pth [2024-03-21 02:06:02,880][04017] Updated weights for policy 0, policy_version 17126 (0.0011) [2024-03-21 02:06:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 48605.8, 300 sec: 46097.4). Total num frames: 561283072. Throughput: 0: 48433.3. Samples: 562606200. Policy #0 lag: (min: 0.0, avg: 27.6, max: 75.0) [2024-03-21 02:06:05,522][03784] Avg episode reward: [(0, '0.943')] [2024-03-21 02:06:09,739][04017] Updated weights for policy 0, policy_version 17136 (0.0021) [2024-03-21 02:06:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49698.1, 300 sec: 46541.6). Total num frames: 561545216. Throughput: 0: 48062.2. Samples: 562895900. Policy #0 lag: (min: 0.0, avg: 27.6, max: 75.0) [2024-03-21 02:06:10,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 02:06:15,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 561741824. Throughput: 0: 48060.1. Samples: 563045600. Policy #0 lag: (min: 0.0, avg: 27.6, max: 75.0) [2024-03-21 02:06:15,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 02:06:17,114][04017] Updated weights for policy 0, policy_version 17146 (0.0015) [2024-03-21 02:06:20,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 561971200. Throughput: 0: 47857.8. Samples: 563337400. Policy #0 lag: (min: 1.0, avg: 37.0, max: 97.0) [2024-03-21 02:06:20,522][03784] Avg episode reward: [(0, '0.829')] [2024-03-21 02:06:23,997][04017] Updated weights for policy 0, policy_version 17156 (0.0011) [2024-03-21 02:06:25,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 562200576. Throughput: 0: 48237.9. Samples: 563619800. 
Policy #0 lag: (min: 1.0, avg: 37.0, max: 97.0) [2024-03-21 02:06:25,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 02:06:30,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 562429952. Throughput: 0: 48131.1. Samples: 563749600. Policy #0 lag: (min: 0.0, avg: 46.2, max: 116.0) [2024-03-21 02:06:30,522][03784] Avg episode reward: [(0, '1.383')] [2024-03-21 02:06:31,059][04017] Updated weights for policy 0, policy_version 17166 (0.0010) [2024-03-21 02:06:34,810][04017] Updated weights for policy 0, policy_version 17176 (0.0020) [2024-03-21 02:06:35,521][03784] Fps is (10 sec: 68812.6, 60 sec: 50244.3, 300 sec: 47208.1). Total num frames: 562888704. Throughput: 0: 47928.9. Samples: 564028500. Policy #0 lag: (min: 0.0, avg: 46.2, max: 116.0) [2024-03-21 02:06:35,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 02:06:40,191][04017] Updated weights for policy 0, policy_version 17186 (0.0023) [2024-03-21 02:06:40,521][03784] Fps is (10 sec: 75366.8, 60 sec: 48606.0, 300 sec: 47097.1). Total num frames: 563183616. Throughput: 0: 47940.0. Samples: 564308700. Policy #0 lag: (min: 0.0, avg: 46.2, max: 116.0) [2024-03-21 02:06:40,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 02:06:45,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 563249152. Throughput: 0: 47740.1. Samples: 564458500. Policy #0 lag: (min: 1.0, avg: 45.2, max: 80.0) [2024-03-21 02:06:45,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 02:06:50,521][03784] Fps is (10 sec: 16383.9, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 563347456. Throughput: 0: 48015.6. Samples: 564766900. Policy #0 lag: (min: 1.0, avg: 45.2, max: 80.0) [2024-03-21 02:06:50,522][03784] Avg episode reward: [(0, '0.545')] [2024-03-21 02:06:53,493][04017] Updated weights for policy 0, policy_version 17196 (0.0011) [2024-03-21 02:06:54,446][03995] Signal inference workers to stop experience collection... 
(11300 times) [2024-03-21 02:06:54,447][03995] Signal inference workers to resume experience collection... (11300 times) [2024-03-21 02:06:54,506][04017] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-03-21 02:06:54,506][04017] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-03-21 02:06:55,521][03784] Fps is (10 sec: 36044.5, 60 sec: 43690.6, 300 sec: 46541.7). Total num frames: 563609600. Throughput: 0: 47382.2. Samples: 565028100. Policy #0 lag: (min: 0.0, avg: 27.3, max: 65.0) [2024-03-21 02:06:55,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 02:06:57,626][04017] Updated weights for policy 0, policy_version 17206 (0.0017) [2024-03-21 02:07:00,521][03784] Fps is (10 sec: 58982.0, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 563937280. Throughput: 0: 47077.7. Samples: 565164100. Policy #0 lag: (min: 0.0, avg: 27.3, max: 65.0) [2024-03-21 02:07:00,522][03784] Avg episode reward: [(0, '0.720')] [2024-03-21 02:07:05,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 564101120. Throughput: 0: 47095.6. Samples: 565456700. Policy #0 lag: (min: 0.0, avg: 27.3, max: 65.0) [2024-03-21 02:07:05,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 02:07:05,777][04017] Updated weights for policy 0, policy_version 17216 (0.0015) [2024-03-21 02:07:10,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 564363264. Throughput: 0: 46760.0. Samples: 565724000. Policy #0 lag: (min: 1.0, avg: 39.8, max: 86.0) [2024-03-21 02:07:10,522][03784] Avg episode reward: [(0, '1.172')] [2024-03-21 02:07:12,852][04017] Updated weights for policy 0, policy_version 17226 (0.0019) [2024-03-21 02:07:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 564527104. Throughput: 0: 46913.3. Samples: 565860700. 
Policy #0 lag: (min: 1.0, avg: 39.8, max: 86.0) [2024-03-21 02:07:15,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 02:07:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 564756480. Throughput: 0: 46993.3. Samples: 566143200. Policy #0 lag: (min: 1.0, avg: 38.8, max: 83.0) [2024-03-21 02:07:20,522][03784] Avg episode reward: [(0, '0.893')] [2024-03-21 02:07:21,675][04017] Updated weights for policy 0, policy_version 17236 (0.0010) [2024-03-21 02:07:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 564985856. Throughput: 0: 46575.5. Samples: 566404600. Policy #0 lag: (min: 1.0, avg: 38.8, max: 83.0) [2024-03-21 02:07:25,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 02:07:27,182][04017] Updated weights for policy 0, policy_version 17246 (0.0017) [2024-03-21 02:07:30,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 565248000. Throughput: 0: 45920.0. Samples: 566524900. Policy #0 lag: (min: 1.0, avg: 38.8, max: 83.0) [2024-03-21 02:07:30,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 02:07:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 40960.0, 300 sec: 45764.1). Total num frames: 565346304. Throughput: 0: 45655.5. Samples: 566821400. Policy #0 lag: (min: 0.0, avg: 60.3, max: 108.0) [2024-03-21 02:07:35,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 02:07:36,539][04017] Updated weights for policy 0, policy_version 17256 (0.0017) [2024-03-21 02:07:40,521][03784] Fps is (10 sec: 32768.0, 60 sec: 39867.7, 300 sec: 45653.1). Total num frames: 565575680. Throughput: 0: 45966.7. Samples: 567096600. Policy #0 lag: (min: 0.0, avg: 60.3, max: 108.0) [2024-03-21 02:07:40,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 02:07:43,478][04017] Updated weights for policy 0, policy_version 17266 (0.0019) [2024-03-21 02:07:45,521][03784] Fps is (10 sec: 55706.1, 60 sec: 44236.8, 300 sec: 46541.7). 
Total num frames: 565903360. Throughput: 0: 45517.8. Samples: 567212400. Policy #0 lag: (min: 2.0, avg: 60.3, max: 108.0) [2024-03-21 02:07:45,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 02:07:47,102][03995] Signal inference workers to stop experience collection... (11350 times) [2024-03-21 02:07:47,180][04017] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-03-21 02:07:47,330][03995] Signal inference workers to resume experience collection... (11350 times) [2024-03-21 02:07:47,330][04017] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-03-21 02:07:48,066][04017] Updated weights for policy 0, policy_version 17276 (0.0018) [2024-03-21 02:07:50,521][03784] Fps is (10 sec: 72090.2, 60 sec: 49152.1, 300 sec: 47319.4). Total num frames: 566296576. Throughput: 0: 44635.7. Samples: 567465300. Policy #0 lag: (min: 2.0, avg: 60.3, max: 108.0) [2024-03-21 02:07:50,521][03784] Avg episode reward: [(0, '0.948')] [2024-03-21 02:07:53,632][04017] Updated weights for policy 0, policy_version 17286 (0.0015) [2024-03-21 02:07:55,521][03784] Fps is (10 sec: 55704.6, 60 sec: 47513.5, 300 sec: 46208.4). Total num frames: 566460416. Throughput: 0: 45262.1. Samples: 567760800. Policy #0 lag: (min: 2.0, avg: 60.3, max: 108.0) [2024-03-21 02:07:55,522][03784] Avg episode reward: [(0, '1.192')] [2024-03-21 02:08:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 566657024. Throughput: 0: 45391.2. Samples: 567903300. Policy #0 lag: (min: 0.0, avg: 41.0, max: 90.0) [2024-03-21 02:08:00,522][03784] Avg episode reward: [(0, '0.915')] [2024-03-21 02:08:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017293_566657024.pth... 
[2024-03-21 02:08:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000016954_555548672.pth [2024-03-21 02:08:01,927][04017] Updated weights for policy 0, policy_version 17296 (0.0015) [2024-03-21 02:08:05,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46967.3, 300 sec: 46541.6). Total num frames: 566919168. Throughput: 0: 45453.2. Samples: 568188600. Policy #0 lag: (min: 0.0, avg: 41.0, max: 90.0) [2024-03-21 02:08:05,522][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 02:08:08,769][04017] Updated weights for policy 0, policy_version 17306 (0.0011) [2024-03-21 02:08:10,521][03784] Fps is (10 sec: 55705.3, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 567214080. Throughput: 0: 45328.9. Samples: 568444400. Policy #0 lag: (min: 3.0, avg: 43.0, max: 95.0) [2024-03-21 02:08:10,522][03784] Avg episode reward: [(0, '0.590')] [2024-03-21 02:08:15,306][04017] Updated weights for policy 0, policy_version 17316 (0.0012) [2024-03-21 02:08:15,521][03784] Fps is (10 sec: 49153.0, 60 sec: 48059.8, 300 sec: 46430.6). Total num frames: 567410688. Throughput: 0: 45811.1. Samples: 568586400. Policy #0 lag: (min: 3.0, avg: 43.0, max: 95.0) [2024-03-21 02:08:15,522][03784] Avg episode reward: [(0, '0.587')] [2024-03-21 02:08:20,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 567541760. Throughput: 0: 45962.2. Samples: 568889700. Policy #0 lag: (min: 3.0, avg: 43.0, max: 95.0) [2024-03-21 02:08:20,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 02:08:25,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 567640064. Throughput: 0: 46411.0. Samples: 569185100. 
Policy #0 lag: (min: 3.0, avg: 43.0, max: 95.0) [2024-03-21 02:08:25,522][03784] Avg episode reward: [(0, '1.279')] [2024-03-21 02:08:26,457][04017] Updated weights for policy 0, policy_version 17326 (0.0014) [2024-03-21 02:08:30,236][04017] Updated weights for policy 0, policy_version 17336 (0.0011) [2024-03-21 02:08:30,521][03784] Fps is (10 sec: 55706.0, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 568098816. Throughput: 0: 46380.0. Samples: 569299500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 94.0) [2024-03-21 02:08:30,522][03784] Avg episode reward: [(0, '1.257')] [2024-03-21 02:08:35,521][03784] Fps is (10 sec: 55706.2, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 568197120. Throughput: 0: 47508.8. Samples: 569603200. Policy #0 lag: (min: 0.0, avg: 34.7, max: 94.0) [2024-03-21 02:08:35,522][03784] Avg episode reward: [(0, '1.253')] [2024-03-21 02:08:40,521][03784] Fps is (10 sec: 26214.1, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 568360960. Throughput: 0: 47473.4. Samples: 569897100. Policy #0 lag: (min: 0.0, avg: 34.6, max: 81.0) [2024-03-21 02:08:40,522][03784] Avg episode reward: [(0, '1.253')] [2024-03-21 02:08:41,907][04017] Updated weights for policy 0, policy_version 17346 (0.0011) [2024-03-21 02:08:45,331][03995] Signal inference workers to stop experience collection... (11400 times) [2024-03-21 02:08:45,415][03995] Signal inference workers to resume experience collection... (11400 times) [2024-03-21 02:08:45,422][04017] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-03-21 02:08:45,464][04017] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-03-21 02:08:45,521][03784] Fps is (10 sec: 36044.2, 60 sec: 44236.7, 300 sec: 45764.1). Total num frames: 568557568. Throughput: 0: 47428.7. Samples: 570037600. 
Policy #0 lag: (min: 0.0, avg: 34.6, max: 81.0) [2024-03-21 02:08:45,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 02:08:48,440][04017] Updated weights for policy 0, policy_version 17356 (0.0016) [2024-03-21 02:08:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 43144.4, 300 sec: 46541.7). Total num frames: 568885248. Throughput: 0: 46693.5. Samples: 570289800. Policy #0 lag: (min: 0.0, avg: 34.6, max: 81.0) [2024-03-21 02:08:50,522][03784] Avg episode reward: [(0, '0.781')] [2024-03-21 02:08:52,189][04017] Updated weights for policy 0, policy_version 17366 (0.0012) [2024-03-21 02:08:55,521][03784] Fps is (10 sec: 72090.6, 60 sec: 46967.6, 300 sec: 46652.7). Total num frames: 569278464. Throughput: 0: 46348.9. Samples: 570530100. Policy #0 lag: (min: 2.0, avg: 32.5, max: 66.0) [2024-03-21 02:08:55,522][03784] Avg episode reward: [(0, '0.727')] [2024-03-21 02:08:56,812][04017] Updated weights for policy 0, policy_version 17376 (0.0011) [2024-03-21 02:09:00,521][03784] Fps is (10 sec: 65536.8, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 569540608. Throughput: 0: 46240.1. Samples: 570667200. Policy #0 lag: (min: 2.0, avg: 32.5, max: 66.0) [2024-03-21 02:09:00,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 02:09:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.4, 300 sec: 46652.7). Total num frames: 569671680. Throughput: 0: 45900.1. Samples: 570955200. Policy #0 lag: (min: 0.0, avg: 67.7, max: 118.0) [2024-03-21 02:09:05,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 02:09:07,820][04017] Updated weights for policy 0, policy_version 17386 (0.0016) [2024-03-21 02:09:10,521][03784] Fps is (10 sec: 36044.3, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 569901056. Throughput: 0: 45842.2. Samples: 571248000. 
Policy #0 lag: (min: 0.0, avg: 67.7, max: 118.0) [2024-03-21 02:09:10,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-21 02:09:11,556][04017] Updated weights for policy 0, policy_version 17396 (0.0019) [2024-03-21 02:09:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 570195968. Throughput: 0: 46308.8. Samples: 571383400. Policy #0 lag: (min: 0.0, avg: 67.7, max: 118.0) [2024-03-21 02:09:15,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 02:09:17,829][04017] Updated weights for policy 0, policy_version 17406 (0.0017) [2024-03-21 02:09:20,521][03784] Fps is (10 sec: 58982.0, 60 sec: 49151.9, 300 sec: 46541.6). Total num frames: 570490880. Throughput: 0: 45630.9. Samples: 571656600. Policy #0 lag: (min: 1.0, avg: 35.6, max: 69.0) [2024-03-21 02:09:20,523][03784] Avg episode reward: [(0, '0.745')] [2024-03-21 02:09:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 49152.1, 300 sec: 46097.4). Total num frames: 570589184. Throughput: 0: 45482.3. Samples: 571943800. Policy #0 lag: (min: 1.0, avg: 35.6, max: 69.0) [2024-03-21 02:09:25,522][03784] Avg episode reward: [(0, '0.801')] [2024-03-21 02:09:27,977][04017] Updated weights for policy 0, policy_version 17416 (0.0013) [2024-03-21 02:09:30,521][03784] Fps is (10 sec: 32768.4, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 570818560. Throughput: 0: 45584.5. Samples: 572088900. Policy #0 lag: (min: 2.0, avg: 29.2, max: 65.0) [2024-03-21 02:09:30,522][03784] Avg episode reward: [(0, '1.399')] [2024-03-21 02:09:32,634][03995] Signal inference workers to stop experience collection... (11450 times) [2024-03-21 02:09:32,635][03995] Signal inference workers to resume experience collection... 
(11450 times) [2024-03-21 02:09:32,706][04017] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-03-21 02:09:32,706][04017] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-03-21 02:09:33,919][04017] Updated weights for policy 0, policy_version 17426 (0.0016) [2024-03-21 02:09:35,521][03784] Fps is (10 sec: 58982.0, 60 sec: 49698.1, 300 sec: 46763.8). Total num frames: 571179008. Throughput: 0: 46295.6. Samples: 572373100. Policy #0 lag: (min: 2.0, avg: 29.2, max: 65.0) [2024-03-21 02:09:35,522][03784] Avg episode reward: [(0, '1.399')] [2024-03-21 02:09:39,399][04017] Updated weights for policy 0, policy_version 17436 (0.0009) [2024-03-21 02:09:40,521][03784] Fps is (10 sec: 58982.8, 60 sec: 50790.5, 300 sec: 46652.7). Total num frames: 571408384. Throughput: 0: 47004.5. Samples: 572645300. Policy #0 lag: (min: 2.0, avg: 29.2, max: 65.0) [2024-03-21 02:09:40,522][03784] Avg episode reward: [(0, '1.625')] [2024-03-21 02:09:45,521][03784] Fps is (10 sec: 42598.1, 60 sec: 50790.4, 300 sec: 46763.8). Total num frames: 571604992. Throughput: 0: 47302.1. Samples: 572795800. Policy #0 lag: (min: 0.0, avg: 47.2, max: 103.0) [2024-03-21 02:09:45,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 02:09:48,239][04017] Updated weights for policy 0, policy_version 17446 (0.0015) [2024-03-21 02:09:50,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46421.4, 300 sec: 46097.4). Total num frames: 571670528. Throughput: 0: 47680.0. Samples: 573100800. Policy #0 lag: (min: 0.0, avg: 47.2, max: 103.0) [2024-03-21 02:09:50,522][03784] Avg episode reward: [(0, '0.574')] [2024-03-21 02:09:55,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43690.6, 300 sec: 46430.6). Total num frames: 571899904. Throughput: 0: 47593.4. Samples: 573389700. 
Policy #0 lag: (min: 0.0, avg: 47.2, max: 103.0) [2024-03-21 02:09:55,522][03784] Avg episode reward: [(0, '1.276')] [2024-03-21 02:09:57,601][04017] Updated weights for policy 0, policy_version 17456 (0.0011) [2024-03-21 02:10:00,521][03784] Fps is (10 sec: 55704.6, 60 sec: 44782.8, 300 sec: 46985.9). Total num frames: 572227584. Throughput: 0: 47479.8. Samples: 573520000. Policy #0 lag: (min: 0.0, avg: 27.6, max: 65.0) [2024-03-21 02:10:00,522][03784] Avg episode reward: [(0, '0.837')] [2024-03-21 02:10:00,680][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017464_572260352.pth... [2024-03-21 02:10:00,793][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017123_561086464.pth [2024-03-21 02:10:02,352][04017] Updated weights for policy 0, policy_version 17466 (0.0015) [2024-03-21 02:10:05,521][03784] Fps is (10 sec: 49153.0, 60 sec: 45329.2, 300 sec: 46874.9). Total num frames: 572391424. Throughput: 0: 47502.5. Samples: 573794200. Policy #0 lag: (min: 0.0, avg: 27.6, max: 65.0) [2024-03-21 02:10:05,521][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 02:10:09,620][04017] Updated weights for policy 0, policy_version 17476 (0.0014) [2024-03-21 02:10:10,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 572719104. Throughput: 0: 47137.7. Samples: 574065000. Policy #0 lag: (min: 2.0, avg: 43.3, max: 89.0) [2024-03-21 02:10:10,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 02:10:15,521][03784] Fps is (10 sec: 55704.8, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 572948480. Throughput: 0: 46802.3. Samples: 574195000. Policy #0 lag: (min: 2.0, avg: 43.3, max: 89.0) [2024-03-21 02:10:15,522][03784] Avg episode reward: [(0, '0.845')] [2024-03-21 02:10:15,620][04017] Updated weights for policy 0, policy_version 17486 (0.0016) [2024-03-21 02:10:20,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46421.4, 300 sec: 47097.0). 
Total num frames: 573276160. Throughput: 0: 46593.3. Samples: 574469800. Policy #0 lag: (min: 2.0, avg: 43.3, max: 89.0) [2024-03-21 02:10:20,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 02:10:21,097][04017] Updated weights for policy 0, policy_version 17496 (0.0016) [2024-03-21 02:10:21,386][03995] Signal inference workers to stop experience collection... (11500 times) [2024-03-21 02:10:21,465][03995] Signal inference workers to resume experience collection... (11500 times) [2024-03-21 02:10:21,481][04017] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-03-21 02:10:21,524][04017] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-03-21 02:10:25,521][03784] Fps is (10 sec: 62259.4, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 573571072. Throughput: 0: 47064.5. Samples: 574763200. Policy #0 lag: (min: 1.0, avg: 56.7, max: 106.0) [2024-03-21 02:10:25,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 02:10:27,789][04017] Updated weights for policy 0, policy_version 17506 (0.0015) [2024-03-21 02:10:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 47513.7, 300 sec: 46763.8). Total num frames: 573669376. Throughput: 0: 46891.2. Samples: 574905900. Policy #0 lag: (min: 1.0, avg: 56.7, max: 106.0) [2024-03-21 02:10:30,522][03784] Avg episode reward: [(0, '1.242')] [2024-03-21 02:10:35,521][03784] Fps is (10 sec: 32767.8, 60 sec: 45329.1, 300 sec: 46208.5). Total num frames: 573898752. Throughput: 0: 46328.9. Samples: 575185600. Policy #0 lag: (min: 0.0, avg: 39.5, max: 77.0) [2024-03-21 02:10:35,522][03784] Avg episode reward: [(0, '0.676')] [2024-03-21 02:10:36,186][04017] Updated weights for policy 0, policy_version 17516 (0.0010) [2024-03-21 02:10:40,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 574128128. Throughput: 0: 46006.7. Samples: 575460000. 
Policy #0 lag: (min: 0.0, avg: 39.5, max: 77.0) [2024-03-21 02:10:40,522][03784] Avg episode reward: [(0, '1.115')] [2024-03-21 02:10:45,098][04017] Updated weights for policy 0, policy_version 17526 (0.0010) [2024-03-21 02:10:45,521][03784] Fps is (10 sec: 39320.7, 60 sec: 44782.8, 300 sec: 45986.2). Total num frames: 574291968. Throughput: 0: 46466.6. Samples: 575611000. Policy #0 lag: (min: 0.0, avg: 39.5, max: 77.0) [2024-03-21 02:10:45,522][03784] Avg episode reward: [(0, '1.165')] [2024-03-21 02:10:50,165][04017] Updated weights for policy 0, policy_version 17536 (0.0016) [2024-03-21 02:10:50,521][03784] Fps is (10 sec: 52428.4, 60 sec: 49698.1, 300 sec: 46319.5). Total num frames: 574652416. Throughput: 0: 46444.2. Samples: 575884200. Policy #0 lag: (min: 0.0, avg: 38.9, max: 89.0) [2024-03-21 02:10:50,522][03784] Avg episode reward: [(0, '0.523')] [2024-03-21 02:10:55,521][03784] Fps is (10 sec: 45876.4, 60 sec: 47513.7, 300 sec: 46319.5). Total num frames: 574750720. Throughput: 0: 46882.3. Samples: 576174700. Policy #0 lag: (min: 0.0, avg: 38.9, max: 89.0) [2024-03-21 02:10:55,522][03784] Avg episode reward: [(0, '0.523')] [2024-03-21 02:11:00,521][03784] Fps is (10 sec: 16384.2, 60 sec: 43144.7, 300 sec: 45875.2). Total num frames: 574816256. Throughput: 0: 47331.1. Samples: 576324900. Policy #0 lag: (min: 0.0, avg: 38.9, max: 89.0) [2024-03-21 02:11:00,522][03784] Avg episode reward: [(0, '0.804')] [2024-03-21 02:11:02,077][04017] Updated weights for policy 0, policy_version 17546 (0.0011) [2024-03-21 02:11:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 575111168. Throughput: 0: 47493.4. Samples: 576607000. Policy #0 lag: (min: 0.0, avg: 25.5, max: 71.0) [2024-03-21 02:11:05,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 02:11:07,860][04017] Updated weights for policy 0, policy_version 17556 (0.0011) [2024-03-21 02:11:10,521][03784] Fps is (10 sec: 72088.7, 60 sec: 46967.5, 300 sec: 46763.8). 
Total num frames: 575537152. Throughput: 0: 46962.1. Samples: 576876500. Policy #0 lag: (min: 0.0, avg: 25.5, max: 71.0) [2024-03-21 02:11:10,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 02:11:10,998][04017] Updated weights for policy 0, policy_version 17566 (0.0021) [2024-03-21 02:11:13,100][03995] Signal inference workers to stop experience collection... (11550 times) [2024-03-21 02:11:13,202][04017] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-03-21 02:11:13,343][03995] Signal inference workers to resume experience collection... (11550 times) [2024-03-21 02:11:13,344][04017] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-03-21 02:11:14,178][04017] Updated weights for policy 0, policy_version 17576 (0.0016) [2024-03-21 02:11:15,521][03784] Fps is (10 sec: 95027.1, 60 sec: 51882.7, 300 sec: 47763.5). Total num frames: 576061440. Throughput: 0: 46853.3. Samples: 577014300. Policy #0 lag: (min: 3.0, avg: 44.9, max: 92.0) [2024-03-21 02:11:15,522][03784] Avg episode reward: [(0, '1.510')] [2024-03-21 02:11:20,521][03784] Fps is (10 sec: 58983.1, 60 sec: 47513.7, 300 sec: 47208.1). Total num frames: 576126976. Throughput: 0: 46826.7. Samples: 577292800. Policy #0 lag: (min: 3.0, avg: 44.9, max: 92.0) [2024-03-21 02:11:20,522][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 02:11:25,349][04017] Updated weights for policy 0, policy_version 17586 (0.0014) [2024-03-21 02:11:25,521][03784] Fps is (10 sec: 19660.7, 60 sec: 44782.9, 300 sec: 46874.9). Total num frames: 576258048. Throughput: 0: 47806.6. Samples: 577611300. Policy #0 lag: (min: 3.0, avg: 44.9, max: 92.0) [2024-03-21 02:11:25,522][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 02:11:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 576389120. Throughput: 0: 47504.8. Samples: 577748700. 
Policy #0 lag: (min: 1.0, avg: 36.7, max: 68.0) [2024-03-21 02:11:30,522][03784] Avg episode reward: [(0, '0.988')] [2024-03-21 02:11:33,125][04017] Updated weights for policy 0, policy_version 17596 (0.0011) [2024-03-21 02:11:35,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 576684032. Throughput: 0: 47797.9. Samples: 578035100. Policy #0 lag: (min: 1.0, avg: 36.7, max: 68.0) [2024-03-21 02:11:35,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 02:11:38,654][04017] Updated weights for policy 0, policy_version 17606 (0.0010) [2024-03-21 02:11:40,521][03784] Fps is (10 sec: 62258.7, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 577011712. Throughput: 0: 47175.5. Samples: 578297600. Policy #0 lag: (min: 0.0, avg: 38.3, max: 95.0) [2024-03-21 02:11:40,522][03784] Avg episode reward: [(0, '0.763')] [2024-03-21 02:11:45,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48060.0, 300 sec: 46874.9). Total num frames: 577175552. Throughput: 0: 47388.9. Samples: 578457400. Policy #0 lag: (min: 0.0, avg: 38.3, max: 95.0) [2024-03-21 02:11:45,522][03784] Avg episode reward: [(0, '0.763')] [2024-03-21 02:11:46,125][04017] Updated weights for policy 0, policy_version 17616 (0.0012) [2024-03-21 02:11:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 577536000. Throughput: 0: 47342.1. Samples: 578737400. Policy #0 lag: (min: 0.0, avg: 38.3, max: 95.0) [2024-03-21 02:11:50,522][03784] Avg episode reward: [(0, '0.939')] [2024-03-21 02:11:51,486][04017] Updated weights for policy 0, policy_version 17626 (0.0010) [2024-03-21 02:11:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 577601536. Throughput: 0: 47762.3. Samples: 579025800. Policy #0 lag: (min: 0.0, avg: 39.5, max: 82.0) [2024-03-21 02:11:55,522][03784] Avg episode reward: [(0, '0.432')] [2024-03-21 02:12:00,521][03784] Fps is (10 sec: 29491.5, 60 sec: 50244.3, 300 sec: 46541.7). 
Total num frames: 577830912. Throughput: 0: 47493.4. Samples: 579151500. Policy #0 lag: (min: 0.0, avg: 39.5, max: 82.0) [2024-03-21 02:12:00,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 02:12:00,548][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017635_577863680.pth... [2024-03-21 02:12:00,671][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017293_566657024.pth [2024-03-21 02:12:01,117][04017] Updated weights for policy 0, policy_version 17636 (0.0016) [2024-03-21 02:12:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 578060288. Throughput: 0: 47744.4. Samples: 579441300. Policy #0 lag: (min: 1.0, avg: 67.1, max: 118.0) [2024-03-21 02:12:05,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 02:12:07,700][04017] Updated weights for policy 0, policy_version 17646 (0.0011) [2024-03-21 02:12:10,521][03784] Fps is (10 sec: 52428.5, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 578355200. Throughput: 0: 46000.0. Samples: 579681300. Policy #0 lag: (min: 1.0, avg: 67.1, max: 118.0) [2024-03-21 02:12:10,522][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 02:12:15,521][03784] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 46430.6). Total num frames: 578453504. Throughput: 0: 46091.1. Samples: 579822800. Policy #0 lag: (min: 1.0, avg: 67.1, max: 118.0) [2024-03-21 02:12:15,522][03784] Avg episode reward: [(0, '0.993')] [2024-03-21 02:12:16,954][04017] Updated weights for policy 0, policy_version 17656 (0.0025) [2024-03-21 02:12:19,446][03995] Signal inference workers to stop experience collection... (11600 times) [2024-03-21 02:12:19,447][03995] Signal inference workers to resume experience collection... 
(11600 times) [2024-03-21 02:12:19,512][04017] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-03-21 02:12:19,512][04017] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-03-21 02:12:20,521][03784] Fps is (10 sec: 29491.4, 60 sec: 42052.3, 300 sec: 46319.5). Total num frames: 578650112. Throughput: 0: 45773.4. Samples: 580094900. Policy #0 lag: (min: 1.0, avg: 56.2, max: 111.0) [2024-03-21 02:12:20,522][03784] Avg episode reward: [(0, '0.431')] [2024-03-21 02:12:25,311][04017] Updated weights for policy 0, policy_version 17666 (0.0014) [2024-03-21 02:12:25,521][03784] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 578879488. Throughput: 0: 46104.5. Samples: 580372300. Policy #0 lag: (min: 1.0, avg: 56.2, max: 111.0) [2024-03-21 02:12:25,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 02:12:28,853][04017] Updated weights for policy 0, policy_version 17676 (0.0012) [2024-03-21 02:12:30,521][03784] Fps is (10 sec: 62258.7, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 579272704. Throughput: 0: 45551.0. Samples: 580507200. Policy #0 lag: (min: 0.0, avg: 54.0, max: 125.0) [2024-03-21 02:12:30,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 02:12:34,097][04017] Updated weights for policy 0, policy_version 17686 (0.0020) [2024-03-21 02:12:35,521][03784] Fps is (10 sec: 68812.6, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 579567616. Throughput: 0: 45400.0. Samples: 580780400. Policy #0 lag: (min: 0.0, avg: 54.0, max: 125.0) [2024-03-21 02:12:35,522][03784] Avg episode reward: [(0, '0.576')] [2024-03-21 02:12:40,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 46652.7). Total num frames: 579665920. Throughput: 0: 45684.4. Samples: 581081600. 
Policy #0 lag: (min: 0.0, avg: 54.0, max: 125.0) [2024-03-21 02:12:40,522][03784] Avg episode reward: [(0, '1.101')] [2024-03-21 02:12:43,239][04017] Updated weights for policy 0, policy_version 17696 (0.0012) [2024-03-21 02:12:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 47513.5, 300 sec: 46541.6). Total num frames: 580026368. Throughput: 0: 45615.4. Samples: 581204200. Policy #0 lag: (min: 1.0, avg: 43.1, max: 79.0) [2024-03-21 02:12:45,522][03784] Avg episode reward: [(0, '1.037')] [2024-03-21 02:12:46,907][04017] Updated weights for policy 0, policy_version 17706 (0.0015) [2024-03-21 02:12:50,521][03784] Fps is (10 sec: 58981.9, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 580255744. Throughput: 0: 45193.3. Samples: 581475000. Policy #0 lag: (min: 1.0, avg: 43.1, max: 79.0) [2024-03-21 02:12:50,522][03784] Avg episode reward: [(0, '1.037')] [2024-03-21 02:12:53,866][04017] Updated weights for policy 0, policy_version 17716 (0.0011) [2024-03-21 02:12:55,521][03784] Fps is (10 sec: 55706.2, 60 sec: 49698.2, 300 sec: 47208.1). Total num frames: 580583424. Throughput: 0: 45788.9. Samples: 581741800. Policy #0 lag: (min: 1.0, avg: 43.1, max: 79.0) [2024-03-21 02:12:55,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 02:13:00,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49151.9, 300 sec: 46986.0). Total num frames: 580780032. Throughput: 0: 46208.7. Samples: 581902200. Policy #0 lag: (min: 0.0, avg: 42.0, max: 76.0) [2024-03-21 02:13:00,522][03784] Avg episode reward: [(0, '1.307')] [2024-03-21 02:13:01,176][04017] Updated weights for policy 0, policy_version 17726 (0.0010) [2024-03-21 02:13:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48059.8, 300 sec: 46541.7). Total num frames: 580943872. Throughput: 0: 46662.2. Samples: 582194700. Policy #0 lag: (min: 0.0, avg: 42.0, max: 76.0) [2024-03-21 02:13:05,522][03784] Avg episode reward: [(0, '1.044')] [2024-03-21 02:13:07,759][03995] Signal inference workers to stop experience collection... 
(11650 times) [2024-03-21 02:13:07,760][03995] Signal inference workers to resume experience collection... (11650 times) [2024-03-21 02:13:07,856][04017] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-03-21 02:13:07,857][04017] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-03-21 02:13:10,521][03784] Fps is (10 sec: 36045.5, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 581140480. Throughput: 0: 46711.2. Samples: 582474300. Policy #0 lag: (min: 0.0, avg: 43.6, max: 94.0) [2024-03-21 02:13:10,522][03784] Avg episode reward: [(0, '1.002')] [2024-03-21 02:13:12,440][04017] Updated weights for policy 0, policy_version 17736 (0.0015) [2024-03-21 02:13:15,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 581271552. Throughput: 0: 46602.3. Samples: 582604300. Policy #0 lag: (min: 0.0, avg: 43.6, max: 94.0) [2024-03-21 02:13:15,522][03784] Avg episode reward: [(0, '0.680')] [2024-03-21 02:13:20,521][03784] Fps is (10 sec: 32767.7, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 581468160. Throughput: 0: 47131.1. Samples: 582901300. Policy #0 lag: (min: 0.0, avg: 43.6, max: 94.0) [2024-03-21 02:13:20,522][03784] Avg episode reward: [(0, '0.680')] [2024-03-21 02:13:21,124][04017] Updated weights for policy 0, policy_version 17746 (0.0010) [2024-03-21 02:13:25,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 581664768. Throughput: 0: 46346.7. Samples: 583167200. Policy #0 lag: (min: 2.0, avg: 37.0, max: 88.0) [2024-03-21 02:13:25,522][03784] Avg episode reward: [(0, '0.543')] [2024-03-21 02:13:30,348][04017] Updated weights for policy 0, policy_version 17756 (0.0028) [2024-03-21 02:13:30,525][03784] Fps is (10 sec: 36029.6, 60 sec: 42595.4, 300 sec: 46207.8). Total num frames: 581828608. Throughput: 0: 46911.2. Samples: 583315400. 
Policy #0 lag: (min: 2.0, avg: 37.0, max: 88.0) [2024-03-21 02:13:30,526][03784] Avg episode reward: [(0, '0.878')] [2024-03-21 02:13:34,414][04017] Updated weights for policy 0, policy_version 17766 (0.0022) [2024-03-21 02:13:35,521][03784] Fps is (10 sec: 55705.8, 60 sec: 44236.9, 300 sec: 46986.0). Total num frames: 582221824. Throughput: 0: 46651.3. Samples: 583574300. Policy #0 lag: (min: 4.0, avg: 51.1, max: 112.0) [2024-03-21 02:13:35,522][03784] Avg episode reward: [(0, '0.594')] [2024-03-21 02:13:38,558][04017] Updated weights for policy 0, policy_version 17776 (0.0014) [2024-03-21 02:13:40,521][03784] Fps is (10 sec: 72120.1, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 582549504. Throughput: 0: 46562.2. Samples: 583837100. Policy #0 lag: (min: 4.0, avg: 51.1, max: 112.0) [2024-03-21 02:13:40,522][03784] Avg episode reward: [(0, '0.768')] [2024-03-21 02:13:45,521][03784] Fps is (10 sec: 52428.2, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 582746112. Throughput: 0: 46411.2. Samples: 583990700. Policy #0 lag: (min: 4.0, avg: 51.1, max: 112.0) [2024-03-21 02:13:45,522][03784] Avg episode reward: [(0, '0.409')] [2024-03-21 02:13:47,974][04017] Updated weights for policy 0, policy_version 17786 (0.0012) [2024-03-21 02:13:50,521][03784] Fps is (10 sec: 42597.8, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 582975488. Throughput: 0: 46122.0. Samples: 584270200. Policy #0 lag: (min: 0.0, avg: 61.2, max: 114.0) [2024-03-21 02:13:50,522][03784] Avg episode reward: [(0, '1.108')] [2024-03-21 02:13:52,552][04017] Updated weights for policy 0, policy_version 17796 (0.0012) [2024-03-21 02:13:55,521][03784] Fps is (10 sec: 55706.1, 60 sec: 45329.1, 300 sec: 46652.7). Total num frames: 583303168. Throughput: 0: 45468.8. Samples: 584520400. 
Policy #0 lag: (min: 0.0, avg: 61.2, max: 114.0) [2024-03-21 02:13:55,522][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 02:13:59,208][04017] Updated weights for policy 0, policy_version 17806 (0.0021) [2024-03-21 02:14:00,521][03784] Fps is (10 sec: 55706.0, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 583532544. Throughput: 0: 45897.6. Samples: 584669700. Policy #0 lag: (min: 0.0, avg: 61.2, max: 114.0) [2024-03-21 02:14:00,522][03784] Avg episode reward: [(0, '1.128')] [2024-03-21 02:14:00,846][03995] Signal inference workers to stop experience collection... (11700 times) [2024-03-21 02:14:00,850][03995] Signal inference workers to resume experience collection... (11700 times) [2024-03-21 02:14:00,851][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017809_583565312.pth... [2024-03-21 02:14:00,908][04017] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-03-21 02:14:00,908][04017] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-03-21 02:14:00,982][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017464_572260352.pth [2024-03-21 02:14:03,443][04017] Updated weights for policy 0, policy_version 17816 (0.0017) [2024-03-21 02:14:05,521][03784] Fps is (10 sec: 55705.0, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 583860224. Throughput: 0: 45202.2. Samples: 584935400. Policy #0 lag: (min: 2.0, avg: 42.7, max: 87.0) [2024-03-21 02:14:05,522][03784] Avg episode reward: [(0, '1.046')] [2024-03-21 02:14:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 583958528. Throughput: 0: 45588.8. Samples: 585218700. 
Policy #0 lag: (min: 2.0, avg: 42.7, max: 87.0) [2024-03-21 02:14:10,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 02:14:14,094][04017] Updated weights for policy 0, policy_version 17826 (0.0009) [2024-03-21 02:14:15,521][03784] Fps is (10 sec: 26214.6, 60 sec: 47513.5, 300 sec: 46208.5). Total num frames: 584122368. Throughput: 0: 45444.3. Samples: 585360200. Policy #0 lag: (min: 1.0, avg: 33.7, max: 74.0) [2024-03-21 02:14:15,522][03784] Avg episode reward: [(0, '0.629')] [2024-03-21 02:14:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 49151.9, 300 sec: 46874.9). Total num frames: 584417280. Throughput: 0: 45935.4. Samples: 585641400. Policy #0 lag: (min: 1.0, avg: 33.7, max: 74.0) [2024-03-21 02:14:20,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 02:14:20,988][04017] Updated weights for policy 0, policy_version 17836 (0.0010) [2024-03-21 02:14:25,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 584581120. Throughput: 0: 46966.7. Samples: 585950600. Policy #0 lag: (min: 1.0, avg: 33.7, max: 74.0) [2024-03-21 02:14:25,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 02:14:30,432][04017] Updated weights for policy 0, policy_version 17846 (0.0011) [2024-03-21 02:14:30,521][03784] Fps is (10 sec: 36045.0, 60 sec: 49155.4, 300 sec: 46097.4). Total num frames: 584777728. Throughput: 0: 46984.5. Samples: 586105000. Policy #0 lag: (min: 0.0, avg: 38.8, max: 89.0) [2024-03-21 02:14:30,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 02:14:35,521][03784] Fps is (10 sec: 22937.3, 60 sec: 43144.4, 300 sec: 45430.9). Total num frames: 584810496. Throughput: 0: 47773.4. Samples: 586420000. Policy #0 lag: (min: 0.0, avg: 38.8, max: 89.0) [2024-03-21 02:14:35,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 02:14:37,995][04017] Updated weights for policy 0, policy_version 17856 (0.0021) [2024-03-21 02:14:40,521][03784] Fps is (10 sec: 49151.9, 60 sec: 45329.0, 300 sec: 46319.5). 
Total num frames: 585269248. Throughput: 0: 47966.6. Samples: 586678900. Policy #0 lag: (min: 0.0, avg: 38.8, max: 89.0) [2024-03-21 02:14:40,522][03784] Avg episode reward: [(0, '0.953')] [2024-03-21 02:14:41,906][04017] Updated weights for policy 0, policy_version 17866 (0.0014) [2024-03-21 02:14:45,521][03784] Fps is (10 sec: 75367.4, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 585564160. Throughput: 0: 47764.5. Samples: 586819100. Policy #0 lag: (min: 2.0, avg: 43.9, max: 103.0) [2024-03-21 02:14:45,522][03784] Avg episode reward: [(0, '0.953')] [2024-03-21 02:14:49,559][04017] Updated weights for policy 0, policy_version 17876 (0.0014) [2024-03-21 02:14:50,521][03784] Fps is (10 sec: 58982.5, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 585859072. Throughput: 0: 48046.7. Samples: 587097500. Policy #0 lag: (min: 2.0, avg: 43.9, max: 103.0) [2024-03-21 02:14:50,522][03784] Avg episode reward: [(0, '0.441')] [2024-03-21 02:14:53,450][03995] Signal inference workers to stop experience collection... (11750 times) [2024-03-21 02:14:53,451][03995] Signal inference workers to resume experience collection... (11750 times) [2024-03-21 02:14:53,530][04017] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-03-21 02:14:53,530][04017] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-03-21 02:14:54,731][04017] Updated weights for policy 0, policy_version 17886 (0.0028) [2024-03-21 02:14:55,521][03784] Fps is (10 sec: 55704.4, 60 sec: 46967.3, 300 sec: 47097.1). Total num frames: 586121216. Throughput: 0: 47524.3. Samples: 587357300. Policy #0 lag: (min: 0.0, avg: 46.4, max: 97.0) [2024-03-21 02:14:55,522][03784] Avg episode reward: [(0, '1.045')] [2024-03-21 02:15:00,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 586350592. Throughput: 0: 47182.2. Samples: 587483400. 
Policy #0 lag: (min: 0.0, avg: 46.4, max: 97.0) [2024-03-21 02:15:00,522][03784] Avg episode reward: [(0, '0.775')] [2024-03-21 02:15:00,996][04017] Updated weights for policy 0, policy_version 17896 (0.0019) [2024-03-21 02:15:05,521][03784] Fps is (10 sec: 45876.1, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 586579968. Throughput: 0: 47409.0. Samples: 587774800. Policy #0 lag: (min: 0.0, avg: 46.4, max: 97.0) [2024-03-21 02:15:05,522][03784] Avg episode reward: [(0, '1.447')] [2024-03-21 02:15:08,766][04017] Updated weights for policy 0, policy_version 17906 (0.0010) [2024-03-21 02:15:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 586776576. Throughput: 0: 46784.4. Samples: 588055900. Policy #0 lag: (min: 0.0, avg: 55.4, max: 105.0) [2024-03-21 02:15:10,522][03784] Avg episode reward: [(0, '0.599')] [2024-03-21 02:15:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 586874880. Throughput: 0: 46564.5. Samples: 588200400. Policy #0 lag: (min: 0.0, avg: 55.4, max: 105.0) [2024-03-21 02:15:15,522][03784] Avg episode reward: [(0, '0.761')] [2024-03-21 02:15:19,332][04017] Updated weights for policy 0, policy_version 17916 (0.0010) [2024-03-21 02:15:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.3, 300 sec: 46097.4). Total num frames: 587169792. Throughput: 0: 46013.5. Samples: 588490600. Policy #0 lag: (min: 0.0, avg: 55.4, max: 105.0) [2024-03-21 02:15:20,522][03784] Avg episode reward: [(0, '0.546')] [2024-03-21 02:15:22,676][04017] Updated weights for policy 0, policy_version 17926 (0.0021) [2024-03-21 02:15:25,521][03784] Fps is (10 sec: 62259.3, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 587497472. Throughput: 0: 46511.2. Samples: 588771900. Policy #0 lag: (min: 2.0, avg: 28.8, max: 71.0) [2024-03-21 02:15:25,522][03784] Avg episode reward: [(0, '0.620')] [2024-03-21 02:15:30,521][03784] Fps is (10 sec: 42598.0, 60 sec: 46967.4, 300 sec: 46430.6). 
Total num frames: 587595776. Throughput: 0: 46348.8. Samples: 588904800. Policy #0 lag: (min: 2.0, avg: 28.8, max: 71.0) [2024-03-21 02:15:30,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 02:15:32,122][04017] Updated weights for policy 0, policy_version 17936 (0.0011) [2024-03-21 02:15:35,521][03784] Fps is (10 sec: 42598.0, 60 sec: 51882.7, 300 sec: 46763.8). Total num frames: 587923456. Throughput: 0: 46437.8. Samples: 589187200. Policy #0 lag: (min: 0.0, avg: 38.7, max: 80.0) [2024-03-21 02:15:35,522][03784] Avg episode reward: [(0, '1.044')] [2024-03-21 02:15:38,545][04017] Updated weights for policy 0, policy_version 17946 (0.0015) [2024-03-21 02:15:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.3, 300 sec: 46652.8). Total num frames: 588054528. Throughput: 0: 47317.9. Samples: 589486600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 80.0) [2024-03-21 02:15:40,522][03784] Avg episode reward: [(0, '1.341')] [2024-03-21 02:15:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.2, 300 sec: 46430.6). Total num frames: 588349440. Throughput: 0: 47624.4. Samples: 589626500. Policy #0 lag: (min: 0.0, avg: 38.7, max: 80.0) [2024-03-21 02:15:45,522][03784] Avg episode reward: [(0, '1.018')] [2024-03-21 02:15:45,902][04017] Updated weights for policy 0, policy_version 17956 (0.0010) [2024-03-21 02:15:50,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 46652.7). Total num frames: 588513280. Throughput: 0: 47795.4. Samples: 589925600. Policy #0 lag: (min: 0.0, avg: 43.3, max: 94.0) [2024-03-21 02:15:50,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 02:15:50,812][03995] Signal inference workers to stop experience collection... (11800 times) [2024-03-21 02:15:50,935][04017] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-03-21 02:15:51,046][03995] Signal inference workers to resume experience collection... 
(11800 times) [2024-03-21 02:15:51,046][04017] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-03-21 02:15:52,238][04017] Updated weights for policy 0, policy_version 17966 (0.0018) [2024-03-21 02:15:55,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46967.5, 300 sec: 47874.6). Total num frames: 588939264. Throughput: 0: 47311.0. Samples: 590184900. Policy #0 lag: (min: 0.0, avg: 43.3, max: 94.0) [2024-03-21 02:15:55,522][03784] Avg episode reward: [(0, '1.276')] [2024-03-21 02:15:57,084][04017] Updated weights for policy 0, policy_version 17976 (0.0016) [2024-03-21 02:16:00,521][03784] Fps is (10 sec: 58982.7, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 589103104. Throughput: 0: 47226.6. Samples: 590325600. Policy #0 lag: (min: 0.0, avg: 43.3, max: 94.0) [2024-03-21 02:16:00,522][03784] Avg episode reward: [(0, '0.501')] [2024-03-21 02:16:00,748][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017979_589135872.pth... [2024-03-21 02:16:00,864][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017635_577863680.pth [2024-03-21 02:16:05,521][03784] Fps is (10 sec: 39322.1, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 589332480. Throughput: 0: 47544.5. Samples: 590630100. Policy #0 lag: (min: 0.0, avg: 45.8, max: 93.0) [2024-03-21 02:16:05,522][03784] Avg episode reward: [(0, '1.043')] [2024-03-21 02:16:08,246][04017] Updated weights for policy 0, policy_version 17986 (0.0011) [2024-03-21 02:16:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 589529088. Throughput: 0: 47653.3. Samples: 590916300. Policy #0 lag: (min: 0.0, avg: 45.8, max: 93.0) [2024-03-21 02:16:10,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 02:16:13,877][04017] Updated weights for policy 0, policy_version 17996 (0.0019) [2024-03-21 02:16:15,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 46541.7). 
Total num frames: 589856768. Throughput: 0: 47911.2. Samples: 591060800. Policy #0 lag: (min: 0.0, avg: 45.8, max: 93.0) [2024-03-21 02:16:15,522][03784] Avg episode reward: [(0, '0.554')] [2024-03-21 02:16:17,575][04017] Updated weights for policy 0, policy_version 18006 (0.0031) [2024-03-21 02:16:20,521][03784] Fps is (10 sec: 62258.6, 60 sec: 49698.0, 300 sec: 47097.0). Total num frames: 590151680. Throughput: 0: 47764.4. Samples: 591336600. Policy #0 lag: (min: 3.0, avg: 41.4, max: 86.0) [2024-03-21 02:16:20,522][03784] Avg episode reward: [(0, '0.793')] [2024-03-21 02:16:24,283][04017] Updated weights for policy 0, policy_version 18016 (0.0015) [2024-03-21 02:16:25,521][03784] Fps is (10 sec: 58982.4, 60 sec: 49152.0, 300 sec: 47652.4). Total num frames: 590446592. Throughput: 0: 47789.0. Samples: 591637100. Policy #0 lag: (min: 3.0, avg: 41.4, max: 86.0) [2024-03-21 02:16:25,522][03784] Avg episode reward: [(0, '0.793')] [2024-03-21 02:16:30,291][04017] Updated weights for policy 0, policy_version 18026 (0.0023) [2024-03-21 02:16:30,521][03784] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 47430.3). Total num frames: 590675968. Throughput: 0: 47904.5. Samples: 591782200. Policy #0 lag: (min: 0.0, avg: 68.6, max: 122.0) [2024-03-21 02:16:30,522][03784] Avg episode reward: [(0, '0.473')] [2024-03-21 02:16:35,521][03784] Fps is (10 sec: 42598.1, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 590872576. Throughput: 0: 47449.0. Samples: 592060800. Policy #0 lag: (min: 0.0, avg: 68.6, max: 122.0) [2024-03-21 02:16:35,522][03784] Avg episode reward: [(0, '0.659')] [2024-03-21 02:16:40,521][03784] Fps is (10 sec: 29491.4, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 590970880. Throughput: 0: 48109.0. Samples: 592349800. 
Policy #0 lag: (min: 0.0, avg: 68.6, max: 122.0) [2024-03-21 02:16:40,522][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 02:16:41,251][04017] Updated weights for policy 0, policy_version 18036 (0.0011) [2024-03-21 02:16:45,521][03784] Fps is (10 sec: 22937.7, 60 sec: 45875.3, 300 sec: 45986.3). Total num frames: 591101952. Throughput: 0: 48195.6. Samples: 592494400. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 02:16:45,522][03784] Avg episode reward: [(0, '0.791')] [2024-03-21 02:16:49,600][04017] Updated weights for policy 0, policy_version 18046 (0.0021) [2024-03-21 02:16:50,021][03995] Signal inference workers to stop experience collection... (11850 times) [2024-03-21 02:16:50,093][04017] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-03-21 02:16:50,355][03995] Signal inference workers to resume experience collection... (11850 times) [2024-03-21 02:16:50,356][04017] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-03-21 02:16:50,521][03784] Fps is (10 sec: 39320.9, 60 sec: 47513.5, 300 sec: 46652.7). Total num frames: 591364096. Throughput: 0: 47339.8. Samples: 592760400. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 02:16:50,522][03784] Avg episode reward: [(0, '0.498')] [2024-03-21 02:16:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 42052.3, 300 sec: 46208.4). Total num frames: 591462400. Throughput: 0: 47002.3. Samples: 593031400. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 02:16:55,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 02:17:00,028][04017] Updated weights for policy 0, policy_version 18056 (0.0016) [2024-03-21 02:17:00,521][03784] Fps is (10 sec: 32768.3, 60 sec: 43144.5, 300 sec: 46208.4). Total num frames: 591691776. Throughput: 0: 46662.1. Samples: 593160600. 
Policy #0 lag: (min: 0.0, avg: 28.3, max: 64.0) [2024-03-21 02:17:00,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 02:17:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 45875.2). Total num frames: 591888384. Throughput: 0: 46789.0. Samples: 593442100. Policy #0 lag: (min: 0.0, avg: 28.3, max: 64.0) [2024-03-21 02:17:05,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 02:17:06,235][04017] Updated weights for policy 0, policy_version 18066 (0.0010) [2024-03-21 02:17:09,500][04017] Updated weights for policy 0, policy_version 18076 (0.0020) [2024-03-21 02:17:10,521][03784] Fps is (10 sec: 68813.4, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 592379904. Throughput: 0: 45811.1. Samples: 593698600. Policy #0 lag: (min: 1.0, avg: 36.1, max: 68.0) [2024-03-21 02:17:10,522][03784] Avg episode reward: [(0, '1.007')] [2024-03-21 02:17:13,526][04017] Updated weights for policy 0, policy_version 18086 (0.0017) [2024-03-21 02:17:15,521][03784] Fps is (10 sec: 95027.1, 60 sec: 49698.1, 300 sec: 48096.7). Total num frames: 592838656. Throughput: 0: 45715.6. Samples: 593839400. Policy #0 lag: (min: 1.0, avg: 36.1, max: 68.0) [2024-03-21 02:17:15,522][03784] Avg episode reward: [(0, '0.482')] [2024-03-21 02:17:18,810][04017] Updated weights for policy 0, policy_version 18096 (0.0011) [2024-03-21 02:17:20,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 592969728. Throughput: 0: 45628.9. Samples: 594114100. Policy #0 lag: (min: 1.0, avg: 36.1, max: 68.0) [2024-03-21 02:17:20,522][03784] Avg episode reward: [(0, '0.482')] [2024-03-21 02:17:25,521][03784] Fps is (10 sec: 39321.9, 60 sec: 46421.3, 300 sec: 47319.2). Total num frames: 593231872. Throughput: 0: 45082.3. Samples: 594378500. 
Policy #0 lag: (min: 0.0, avg: 53.7, max: 112.0) [2024-03-21 02:17:25,522][03784] Avg episode reward: [(0, '0.519')] [2024-03-21 02:17:26,012][04017] Updated weights for policy 0, policy_version 18106 (0.0010) [2024-03-21 02:17:30,521][03784] Fps is (10 sec: 52429.1, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 593494016. Throughput: 0: 44840.0. Samples: 594512200. Policy #0 lag: (min: 0.0, avg: 53.7, max: 112.0) [2024-03-21 02:17:30,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 02:17:30,910][03995] Signal inference workers to stop experience collection... (11900 times) [2024-03-21 02:17:30,970][04017] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-03-21 02:17:30,979][03995] Signal inference workers to resume experience collection... (11900 times) [2024-03-21 02:17:31,015][04017] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-03-21 02:17:34,323][04017] Updated weights for policy 0, policy_version 18116 (0.0015) [2024-03-21 02:17:35,521][03784] Fps is (10 sec: 49151.5, 60 sec: 47513.6, 300 sec: 47652.4). Total num frames: 593723392. Throughput: 0: 45533.5. Samples: 594809400. Policy #0 lag: (min: 0.0, avg: 53.7, max: 112.0) [2024-03-21 02:17:35,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 02:17:40,521][03784] Fps is (10 sec: 39320.8, 60 sec: 48605.7, 300 sec: 46986.0). Total num frames: 593887232. Throughput: 0: 45899.8. Samples: 595096900. Policy #0 lag: (min: 0.0, avg: 41.1, max: 81.0) [2024-03-21 02:17:40,522][03784] Avg episode reward: [(0, '1.279')] [2024-03-21 02:17:43,238][04017] Updated weights for policy 0, policy_version 18126 (0.0009) [2024-03-21 02:17:45,521][03784] Fps is (10 sec: 29491.3, 60 sec: 48605.8, 300 sec: 46652.8). Total num frames: 594018304. Throughput: 0: 46086.7. Samples: 595234500. 
Policy #0 lag: (min: 0.0, avg: 41.1, max: 81.0) [2024-03-21 02:17:45,522][03784] Avg episode reward: [(0, '0.948')] [2024-03-21 02:17:50,521][03784] Fps is (10 sec: 22937.8, 60 sec: 45875.3, 300 sec: 45875.2). Total num frames: 594116608. Throughput: 0: 46573.3. Samples: 595537900. Policy #0 lag: (min: 0.0, avg: 41.1, max: 81.0) [2024-03-21 02:17:50,522][03784] Avg episode reward: [(0, '1.152')] [2024-03-21 02:17:55,521][03784] Fps is (10 sec: 22937.5, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 594247680. Throughput: 0: 47564.4. Samples: 595839000. Policy #0 lag: (min: 0.0, avg: 27.6, max: 81.0) [2024-03-21 02:17:55,522][03784] Avg episode reward: [(0, '1.071')] [2024-03-21 02:17:55,814][04017] Updated weights for policy 0, policy_version 18136 (0.0011) [2024-03-21 02:17:59,974][04017] Updated weights for policy 0, policy_version 18146 (0.0011) [2024-03-21 02:18:00,521][03784] Fps is (10 sec: 49151.1, 60 sec: 48605.7, 300 sec: 46319.5). Total num frames: 594608128. Throughput: 0: 46948.7. Samples: 595952100. Policy #0 lag: (min: 0.0, avg: 27.6, max: 81.0) [2024-03-21 02:18:00,522][03784] Avg episode reward: [(0, '0.801')] [2024-03-21 02:18:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018146_594608128.pth... [2024-03-21 02:18:00,675][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017809_583565312.pth [2024-03-21 02:18:04,876][04017] Updated weights for policy 0, policy_version 18156 (0.0015) [2024-03-21 02:18:05,521][03784] Fps is (10 sec: 72089.9, 60 sec: 51336.5, 300 sec: 46874.9). Total num frames: 594968576. Throughput: 0: 46837.8. Samples: 596221800. Policy #0 lag: (min: 0.0, avg: 37.2, max: 107.0) [2024-03-21 02:18:05,522][03784] Avg episode reward: [(0, '0.809')] [2024-03-21 02:18:10,521][03784] Fps is (10 sec: 45875.6, 60 sec: 44782.8, 300 sec: 46763.8). Total num frames: 595066880. Throughput: 0: 47006.4. Samples: 596493800. 
Policy #0 lag: (min: 0.0, avg: 37.2, max: 107.0) [2024-03-21 02:18:10,522][03784] Avg episode reward: [(0, '1.106')] [2024-03-21 02:18:15,521][03784] Fps is (10 sec: 19660.5, 60 sec: 38775.4, 300 sec: 46430.6). Total num frames: 595165184. Throughput: 0: 46748.7. Samples: 596615900. Policy #0 lag: (min: 0.0, avg: 37.2, max: 107.0) [2024-03-21 02:18:15,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 02:18:16,838][04017] Updated weights for policy 0, policy_version 18166 (0.0011) [2024-03-21 02:18:20,521][03784] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 46986.0). Total num frames: 595525632. Throughput: 0: 44504.4. Samples: 596812100. Policy #0 lag: (min: 1.0, avg: 32.4, max: 69.0) [2024-03-21 02:18:20,522][03784] Avg episode reward: [(0, '0.995')] [2024-03-21 02:18:20,950][04017] Updated weights for policy 0, policy_version 18176 (0.0013) [2024-03-21 02:18:25,521][03784] Fps is (10 sec: 55706.5, 60 sec: 41506.1, 300 sec: 47097.7). Total num frames: 595722240. Throughput: 0: 44375.7. Samples: 597093800. Policy #0 lag: (min: 1.0, avg: 32.4, max: 69.0) [2024-03-21 02:18:25,522][03784] Avg episode reward: [(0, '0.732')] [2024-03-21 02:18:30,125][04017] Updated weights for policy 0, policy_version 18186 (0.0010) [2024-03-21 02:18:30,521][03784] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 46541.7). Total num frames: 595951616. Throughput: 0: 44642.2. Samples: 597243400. Policy #0 lag: (min: 1.0, avg: 36.7, max: 103.0) [2024-03-21 02:18:30,522][03784] Avg episode reward: [(0, '0.710')] [2024-03-21 02:18:32,487][03995] Signal inference workers to stop experience collection... (11950 times) [2024-03-21 02:18:32,554][04017] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-03-21 02:18:32,568][03995] Signal inference workers to resume experience collection... 
(11950 times) [2024-03-21 02:18:32,600][04017] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-03-21 02:18:34,487][04017] Updated weights for policy 0, policy_version 18196 (0.0012) [2024-03-21 02:18:35,521][03784] Fps is (10 sec: 62258.8, 60 sec: 43690.6, 300 sec: 46763.8). Total num frames: 596344832. Throughput: 0: 44224.4. Samples: 597528000. Policy #0 lag: (min: 1.0, avg: 36.7, max: 103.0) [2024-03-21 02:18:35,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 02:18:38,623][04017] Updated weights for policy 0, policy_version 18206 (0.0027) [2024-03-21 02:18:40,521][03784] Fps is (10 sec: 72089.7, 60 sec: 46421.5, 300 sec: 47208.1). Total num frames: 596672512. Throughput: 0: 43902.3. Samples: 597814600. Policy #0 lag: (min: 1.0, avg: 36.7, max: 103.0) [2024-03-21 02:18:40,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 02:18:44,092][04017] Updated weights for policy 0, policy_version 18216 (0.0011) [2024-03-21 02:18:45,527][03784] Fps is (10 sec: 65499.5, 60 sec: 49693.5, 300 sec: 47540.5). Total num frames: 597000192. Throughput: 0: 44254.7. Samples: 597943800. Policy #0 lag: (min: 3.0, avg: 37.8, max: 78.0) [2024-03-21 02:18:45,528][03784] Avg episode reward: [(0, '1.129')] [2024-03-21 02:18:50,521][03784] Fps is (10 sec: 45875.3, 60 sec: 50244.4, 300 sec: 46874.9). Total num frames: 597131264. Throughput: 0: 44775.6. Samples: 598236700. Policy #0 lag: (min: 3.0, avg: 37.8, max: 78.0) [2024-03-21 02:18:50,522][03784] Avg episode reward: [(0, '1.129')] [2024-03-21 02:18:52,405][04017] Updated weights for policy 0, policy_version 18226 (0.0011) [2024-03-21 02:18:55,521][03784] Fps is (10 sec: 32786.1, 60 sec: 51336.5, 300 sec: 46763.8). Total num frames: 597327872. Throughput: 0: 45171.2. Samples: 598526500. Policy #0 lag: (min: 3.0, avg: 37.8, max: 78.0) [2024-03-21 02:18:55,522][03784] Avg episode reward: [(0, '0.698')] [2024-03-21 02:19:00,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46967.7, 300 sec: 45986.3). 
Total num frames: 597426176. Throughput: 0: 45633.6. Samples: 598669400. Policy #0 lag: (min: 1.0, avg: 45.0, max: 95.0) [2024-03-21 02:19:00,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 02:19:05,230][04017] Updated weights for policy 0, policy_version 18236 (0.0011) [2024-03-21 02:19:05,521][03784] Fps is (10 sec: 22937.7, 60 sec: 43144.5, 300 sec: 46097.4). Total num frames: 597557248. Throughput: 0: 47851.1. Samples: 598965400. Policy #0 lag: (min: 1.0, avg: 45.0, max: 95.0) [2024-03-21 02:19:05,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 02:19:10,521][03784] Fps is (10 sec: 26214.3, 60 sec: 43690.8, 300 sec: 45986.3). Total num frames: 597688320. Throughput: 0: 48171.2. Samples: 599261500. Policy #0 lag: (min: 0.0, avg: 17.6, max: 80.0) [2024-03-21 02:19:10,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-21 02:19:14,938][04017] Updated weights for policy 0, policy_version 18246 (0.0016) [2024-03-21 02:19:15,521][03784] Fps is (10 sec: 36045.4, 60 sec: 45875.4, 300 sec: 45764.1). Total num frames: 597917696. Throughput: 0: 47971.2. Samples: 599402100. Policy #0 lag: (min: 0.0, avg: 17.6, max: 80.0) [2024-03-21 02:19:15,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 02:19:19,627][04017] Updated weights for policy 0, policy_version 18256 (0.0014) [2024-03-21 02:19:20,521][03784] Fps is (10 sec: 58982.0, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 598278144. Throughput: 0: 47433.4. Samples: 599662500. Policy #0 lag: (min: 0.0, avg: 17.6, max: 80.0) [2024-03-21 02:19:20,522][03784] Avg episode reward: [(0, '0.822')] [2024-03-21 02:19:24,717][04017] Updated weights for policy 0, policy_version 18266 (0.0020) [2024-03-21 02:19:24,757][03995] Signal inference workers to stop experience collection... (12000 times) [2024-03-21 02:19:24,855][04017] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-03-21 02:19:24,956][03995] Signal inference workers to resume experience collection... 
(12000 times) [2024-03-21 02:19:24,956][04017] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-03-21 02:19:25,521][03784] Fps is (10 sec: 68812.0, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 598605824. Throughput: 0: 47513.3. Samples: 599952700. Policy #0 lag: (min: 0.0, avg: 45.0, max: 100.0) [2024-03-21 02:19:25,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 02:19:28,639][04017] Updated weights for policy 0, policy_version 18276 (0.0011) [2024-03-21 02:19:30,521][03784] Fps is (10 sec: 65536.2, 60 sec: 49698.1, 300 sec: 47874.6). Total num frames: 598933504. Throughput: 0: 47874.9. Samples: 600097900. Policy #0 lag: (min: 0.0, avg: 45.0, max: 100.0) [2024-03-21 02:19:30,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 02:19:35,521][03784] Fps is (10 sec: 55706.0, 60 sec: 46967.6, 300 sec: 47097.1). Total num frames: 599162880. Throughput: 0: 48120.0. Samples: 600402100. Policy #0 lag: (min: 2.0, avg: 45.2, max: 100.0) [2024-03-21 02:19:35,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 02:19:35,628][04017] Updated weights for policy 0, policy_version 18286 (0.0026) [2024-03-21 02:19:39,269][04017] Updated weights for policy 0, policy_version 18296 (0.0016) [2024-03-21 02:19:40,521][03784] Fps is (10 sec: 62258.7, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 599556096. Throughput: 0: 47348.9. Samples: 600657200. Policy #0 lag: (min: 2.0, avg: 45.2, max: 100.0) [2024-03-21 02:19:40,522][03784] Avg episode reward: [(0, '1.147')] [2024-03-21 02:19:45,521][03784] Fps is (10 sec: 58982.2, 60 sec: 45879.5, 300 sec: 47097.1). Total num frames: 599752704. Throughput: 0: 47384.4. Samples: 600801700. 
Policy #0 lag: (min: 2.0, avg: 45.2, max: 100.0) [2024-03-21 02:19:45,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 02:19:49,747][04017] Updated weights for policy 0, policy_version 18306 (0.0016) [2024-03-21 02:19:50,521][03784] Fps is (10 sec: 32768.3, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 599883776. Throughput: 0: 47255.6. Samples: 601091900. Policy #0 lag: (min: 0.0, avg: 43.0, max: 83.0) [2024-03-21 02:19:50,522][03784] Avg episode reward: [(0, '0.589')] [2024-03-21 02:19:55,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 600080384. Throughput: 0: 47079.9. Samples: 601380100. Policy #0 lag: (min: 0.0, avg: 43.0, max: 83.0) [2024-03-21 02:19:55,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 02:19:58,037][04017] Updated weights for policy 0, policy_version 18316 (0.0010) [2024-03-21 02:20:00,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 600211456. Throughput: 0: 47353.3. Samples: 601533000. Policy #0 lag: (min: 0.0, avg: 43.0, max: 83.0) [2024-03-21 02:20:00,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 02:20:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018317_600211456.pth... [2024-03-21 02:20:00,694][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000017979_589135872.pth [2024-03-21 02:20:05,521][03784] Fps is (10 sec: 32768.2, 60 sec: 47513.6, 300 sec: 46208.4). Total num frames: 600408064. Throughput: 0: 48148.9. Samples: 601829200. Policy #0 lag: (min: 2.0, avg: 45.5, max: 101.0) [2024-03-21 02:20:05,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 02:20:06,260][04017] Updated weights for policy 0, policy_version 18326 (0.0021) [2024-03-21 02:20:10,264][03995] Signal inference workers to stop experience collection... (12050 times) [2024-03-21 02:20:10,264][03995] Signal inference workers to resume experience collection... 
(12050 times) [2024-03-21 02:20:10,287][04017] Updated weights for policy 0, policy_version 18336 (0.0012) [2024-03-21 02:20:10,331][04017] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-03-21 02:20:10,331][04017] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-03-21 02:20:10,521][03784] Fps is (10 sec: 62259.2, 60 sec: 52428.8, 300 sec: 47319.2). Total num frames: 600834048. Throughput: 0: 47555.6. Samples: 602092700. Policy #0 lag: (min: 2.0, avg: 45.5, max: 101.0) [2024-03-21 02:20:10,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 02:20:15,521][03784] Fps is (10 sec: 65536.4, 60 sec: 52428.7, 300 sec: 47097.1). Total num frames: 601063424. Throughput: 0: 47460.0. Samples: 602233600. Policy #0 lag: (min: 2.0, avg: 45.5, max: 101.0) [2024-03-21 02:20:15,522][03784] Avg episode reward: [(0, '0.933')] [2024-03-21 02:20:19,426][04017] Updated weights for policy 0, policy_version 18346 (0.0010) [2024-03-21 02:20:20,521][03784] Fps is (10 sec: 32768.1, 60 sec: 48059.8, 300 sec: 46319.5). Total num frames: 601161728. Throughput: 0: 47940.0. Samples: 602559400. Policy #0 lag: (min: 0.0, avg: 38.6, max: 89.0) [2024-03-21 02:20:20,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 02:20:25,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 601423872. Throughput: 0: 48526.7. Samples: 602840900. Policy #0 lag: (min: 0.0, avg: 38.6, max: 89.0) [2024-03-21 02:20:25,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 02:20:26,435][04017] Updated weights for policy 0, policy_version 18356 (0.0012) [2024-03-21 02:20:30,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 601718784. Throughput: 0: 48291.1. Samples: 602974800. 
Policy #0 lag: (min: 1.0, avg: 33.6, max: 84.0) [2024-03-21 02:20:30,522][03784] Avg episode reward: [(0, '1.252')] [2024-03-21 02:20:31,299][04017] Updated weights for policy 0, policy_version 18366 (0.0021) [2024-03-21 02:20:35,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 601882624. Throughput: 0: 48413.3. Samples: 603270500. Policy #0 lag: (min: 1.0, avg: 33.6, max: 84.0) [2024-03-21 02:20:35,522][03784] Avg episode reward: [(0, '1.252')] [2024-03-21 02:20:39,402][04017] Updated weights for policy 0, policy_version 18376 (0.0019) [2024-03-21 02:20:40,521][03784] Fps is (10 sec: 52428.3, 60 sec: 44782.9, 300 sec: 47097.1). Total num frames: 602243072. Throughput: 0: 47944.5. Samples: 603537600. Policy #0 lag: (min: 1.0, avg: 33.6, max: 84.0) [2024-03-21 02:20:40,522][03784] Avg episode reward: [(0, '1.102')] [2024-03-21 02:20:43,781][04017] Updated weights for policy 0, policy_version 18386 (0.0025) [2024-03-21 02:20:45,521][03784] Fps is (10 sec: 68813.5, 60 sec: 46967.5, 300 sec: 47652.5). Total num frames: 602570752. Throughput: 0: 47651.2. Samples: 603677300. Policy #0 lag: (min: 0.0, avg: 41.6, max: 81.0) [2024-03-21 02:20:45,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 02:20:50,521][03784] Fps is (10 sec: 52429.2, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 602767360. Throughput: 0: 47713.4. Samples: 603976300. Policy #0 lag: (min: 0.0, avg: 41.6, max: 81.0) [2024-03-21 02:20:50,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 02:20:51,261][04017] Updated weights for policy 0, policy_version 18396 (0.0011) [2024-03-21 02:20:55,521][03784] Fps is (10 sec: 45874.5, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 603029504. Throughput: 0: 48286.6. Samples: 604265600. Policy #0 lag: (min: 0.0, avg: 41.6, max: 81.0) [2024-03-21 02:20:55,522][03784] Avg episode reward: [(0, '1.558')] [2024-03-21 02:20:56,036][03995] Signal inference workers to stop experience collection... 
(12100 times) [2024-03-21 02:20:56,105][04017] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-03-21 02:20:56,263][03995] Signal inference workers to resume experience collection... (12100 times) [2024-03-21 02:20:56,263][04017] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-03-21 02:20:56,265][04017] Updated weights for policy 0, policy_version 18406 (0.0018) [2024-03-21 02:21:00,521][03784] Fps is (10 sec: 65535.7, 60 sec: 53521.0, 300 sec: 47763.5). Total num frames: 603422720. Throughput: 0: 47873.2. Samples: 604387900. Policy #0 lag: (min: 3.0, avg: 48.3, max: 112.0) [2024-03-21 02:21:00,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 02:21:03,607][04017] Updated weights for policy 0, policy_version 18416 (0.0015) [2024-03-21 02:21:05,521][03784] Fps is (10 sec: 52428.5, 60 sec: 52428.7, 300 sec: 47541.4). Total num frames: 603553792. Throughput: 0: 46959.8. Samples: 604672600. Policy #0 lag: (min: 3.0, avg: 48.3, max: 112.0) [2024-03-21 02:21:05,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 02:21:10,521][03784] Fps is (10 sec: 32768.3, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 603750400. Throughput: 0: 46833.4. Samples: 604948400. Policy #0 lag: (min: 3.0, avg: 48.3, max: 112.0) [2024-03-21 02:21:10,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 02:21:12,352][04017] Updated weights for policy 0, policy_version 18426 (0.0011) [2024-03-21 02:21:15,521][03784] Fps is (10 sec: 32768.5, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 603881472. Throughput: 0: 47157.8. Samples: 605096900. Policy #0 lag: (min: 0.0, avg: 35.0, max: 80.0) [2024-03-21 02:21:15,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 02:21:20,521][03784] Fps is (10 sec: 19660.6, 60 sec: 46421.2, 300 sec: 45764.1). Total num frames: 603947008. Throughput: 0: 46311.0. Samples: 605354500. 
Policy #0 lag: (min: 0.0, avg: 35.0, max: 80.0) [2024-03-21 02:21:20,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 02:21:25,521][03784] Fps is (10 sec: 19660.9, 60 sec: 44236.9, 300 sec: 45430.9). Total num frames: 604078080. Throughput: 0: 46055.7. Samples: 605610100. Policy #0 lag: (min: 0.0, avg: 35.0, max: 80.0) [2024-03-21 02:21:25,522][03784] Avg episode reward: [(0, '0.691')] [2024-03-21 02:21:26,439][04017] Updated weights for policy 0, policy_version 18436 (0.0011) [2024-03-21 02:21:30,521][03784] Fps is (10 sec: 19661.0, 60 sec: 40413.9, 300 sec: 44986.6). Total num frames: 604143616. Throughput: 0: 45831.0. Samples: 605739700. Policy #0 lag: (min: 0.0, avg: 26.8, max: 76.0) [2024-03-21 02:21:30,522][03784] Avg episode reward: [(0, '0.964')] [2024-03-21 02:21:34,131][04017] Updated weights for policy 0, policy_version 18446 (0.0017) [2024-03-21 02:21:35,521][03784] Fps is (10 sec: 45874.6, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 604536832. Throughput: 0: 44317.7. Samples: 605970600. Policy #0 lag: (min: 0.0, avg: 26.8, max: 76.0) [2024-03-21 02:21:35,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 02:21:38,133][04017] Updated weights for policy 0, policy_version 18456 (0.0012) [2024-03-21 02:21:40,521][03784] Fps is (10 sec: 72089.5, 60 sec: 43690.7, 300 sec: 46652.7). Total num frames: 604864512. Throughput: 0: 44122.3. Samples: 606251100. Policy #0 lag: (min: 1.0, avg: 52.4, max: 105.0) [2024-03-21 02:21:40,522][03784] Avg episode reward: [(0, '1.307')] [2024-03-21 02:21:43,087][04017] Updated weights for policy 0, policy_version 18466 (0.0013) [2024-03-21 02:21:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 43144.5, 300 sec: 46763.8). Total num frames: 605159424. Throughput: 0: 44168.9. Samples: 606375500. Policy #0 lag: (min: 1.0, avg: 52.4, max: 105.0) [2024-03-21 02:21:45,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 02:21:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 47097.0). 
Total num frames: 605356032. Throughput: 0: 44677.8. Samples: 606683100. Policy #0 lag: (min: 1.0, avg: 52.4, max: 105.0) [2024-03-21 02:21:50,522][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 02:21:50,931][04017] Updated weights for policy 0, policy_version 18476 (0.0015) [2024-03-21 02:21:51,913][03995] Signal inference workers to stop experience collection... (12150 times) [2024-03-21 02:21:51,991][04017] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-03-21 02:21:52,182][03995] Signal inference workers to resume experience collection... (12150 times) [2024-03-21 02:21:52,183][04017] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-03-21 02:21:54,601][04017] Updated weights for policy 0, policy_version 18486 (0.0015) [2024-03-21 02:21:55,521][03784] Fps is (10 sec: 65535.2, 60 sec: 46421.3, 300 sec: 47874.6). Total num frames: 605814784. Throughput: 0: 44326.5. Samples: 606943100. Policy #0 lag: (min: 0.0, avg: 55.3, max: 110.0) [2024-03-21 02:21:55,523][03784] Avg episode reward: [(0, '0.589')] [2024-03-21 02:21:59,368][04017] Updated weights for policy 0, policy_version 18496 (0.0021) [2024-03-21 02:22:00,521][03784] Fps is (10 sec: 81920.9, 60 sec: 45875.3, 300 sec: 48430.0). Total num frames: 606175232. Throughput: 0: 44226.7. Samples: 607087100. Policy #0 lag: (min: 0.0, avg: 55.3, max: 110.0) [2024-03-21 02:22:00,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 02:22:00,630][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018500_606208000.pth... [2024-03-21 02:22:00,743][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018146_594608128.pth [2024-03-21 02:22:05,521][03784] Fps is (10 sec: 49152.6, 60 sec: 45875.3, 300 sec: 47208.1). Total num frames: 606306304. Throughput: 0: 44835.6. Samples: 607372100. 
Policy #0 lag: (min: 0.0, avg: 55.3, max: 110.0) [2024-03-21 02:22:05,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 02:22:09,340][04017] Updated weights for policy 0, policy_version 18506 (0.0016) [2024-03-21 02:22:10,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 606404608. Throughput: 0: 45017.7. Samples: 607635900. Policy #0 lag: (min: 0.0, avg: 41.6, max: 78.0) [2024-03-21 02:22:10,522][03784] Avg episode reward: [(0, '0.576')] [2024-03-21 02:22:15,521][03784] Fps is (10 sec: 22937.4, 60 sec: 44236.7, 300 sec: 45986.3). Total num frames: 606535680. Throughput: 0: 45019.9. Samples: 607765600. Policy #0 lag: (min: 0.0, avg: 41.6, max: 78.0) [2024-03-21 02:22:15,522][03784] Avg episode reward: [(0, '0.915')] [2024-03-21 02:22:17,427][04017] Updated weights for policy 0, policy_version 18516 (0.0013) [2024-03-21 02:22:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 606863360. Throughput: 0: 46404.5. Samples: 608058800. Policy #0 lag: (min: 0.0, avg: 41.6, max: 78.0) [2024-03-21 02:22:20,522][03784] Avg episode reward: [(0, '1.182')] [2024-03-21 02:22:25,521][03784] Fps is (10 sec: 42599.0, 60 sec: 48059.7, 300 sec: 45653.0). Total num frames: 606961664. Throughput: 0: 46686.7. Samples: 608352000. Policy #0 lag: (min: 0.0, avg: 31.7, max: 73.0) [2024-03-21 02:22:25,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 02:22:26,626][04017] Updated weights for policy 0, policy_version 18526 (0.0015) [2024-03-21 02:22:30,521][03784] Fps is (10 sec: 29491.1, 60 sec: 50244.2, 300 sec: 45542.0). Total num frames: 607158272. Throughput: 0: 47208.9. Samples: 608499900. Policy #0 lag: (min: 0.0, avg: 31.7, max: 73.0) [2024-03-21 02:22:30,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 02:22:35,521][03784] Fps is (10 sec: 39321.9, 60 sec: 46967.6, 300 sec: 45653.1). Total num frames: 607354880. Throughput: 0: 46520.2. Samples: 608776500. 
Policy #0 lag: (min: 0.0, avg: 31.7, max: 73.0) [2024-03-21 02:22:35,522][03784] Avg episode reward: [(0, '0.896')] [2024-03-21 02:22:36,220][04017] Updated weights for policy 0, policy_version 18536 (0.0011) [2024-03-21 02:22:40,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 607584256. Throughput: 0: 46842.4. Samples: 609051000. Policy #0 lag: (min: 0.0, avg: 32.3, max: 95.0) [2024-03-21 02:22:40,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 02:22:42,368][04017] Updated weights for policy 0, policy_version 18546 (0.0020) [2024-03-21 02:22:45,521][03784] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 46319.5). Total num frames: 607780864. Throughput: 0: 46560.0. Samples: 609182300. Policy #0 lag: (min: 0.0, avg: 32.3, max: 95.0) [2024-03-21 02:22:45,522][03784] Avg episode reward: [(0, '1.448')] [2024-03-21 02:22:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.6, 300 sec: 46430.6). Total num frames: 607944704. Throughput: 0: 45968.9. Samples: 609440700. Policy #0 lag: (min: 3.0, avg: 30.5, max: 56.0) [2024-03-21 02:22:50,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 02:22:50,735][03995] Signal inference workers to stop experience collection... (12200 times) [2024-03-21 02:22:50,778][04017] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-03-21 02:22:51,049][03995] Signal inference workers to resume experience collection... (12200 times) [2024-03-21 02:22:51,049][04017] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-03-21 02:22:51,362][04017] Updated weights for policy 0, policy_version 18556 (0.0021) [2024-03-21 02:22:55,310][04017] Updated weights for policy 0, policy_version 18566 (0.0012) [2024-03-21 02:22:55,521][03784] Fps is (10 sec: 58982.4, 60 sec: 42598.5, 300 sec: 46652.8). Total num frames: 608370688. Throughput: 0: 45742.3. Samples: 609694300. 
Policy #0 lag: (min: 3.0, avg: 30.5, max: 56.0) [2024-03-21 02:22:55,522][03784] Avg episode reward: [(0, '0.558')] [2024-03-21 02:23:00,521][03784] Fps is (10 sec: 65536.1, 60 sec: 40413.9, 300 sec: 46208.4). Total num frames: 608600064. Throughput: 0: 46191.3. Samples: 609844200. Policy #0 lag: (min: 3.0, avg: 30.5, max: 56.0) [2024-03-21 02:23:00,522][03784] Avg episode reward: [(0, '0.558')] [2024-03-21 02:23:03,966][04017] Updated weights for policy 0, policy_version 18576 (0.0018) [2024-03-21 02:23:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 46652.8). Total num frames: 608829440. Throughput: 0: 45893.4. Samples: 610124000. Policy #0 lag: (min: 0.0, avg: 49.7, max: 116.0) [2024-03-21 02:23:05,522][03784] Avg episode reward: [(0, '1.116')] [2024-03-21 02:23:10,358][04017] Updated weights for policy 0, policy_version 18586 (0.0012) [2024-03-21 02:23:10,521][03784] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 46986.0). Total num frames: 609026048. Throughput: 0: 45693.2. Samples: 610408200. Policy #0 lag: (min: 0.0, avg: 49.7, max: 116.0) [2024-03-21 02:23:10,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 02:23:13,408][04017] Updated weights for policy 0, policy_version 18596 (0.0019) [2024-03-21 02:23:15,521][03784] Fps is (10 sec: 65536.1, 60 sec: 49152.1, 300 sec: 47319.2). Total num frames: 609484800. Throughput: 0: 44937.8. Samples: 610522100. Policy #0 lag: (min: 0.0, avg: 49.7, max: 116.0) [2024-03-21 02:23:15,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 02:23:17,636][04017] Updated weights for policy 0, policy_version 18606 (0.0029) [2024-03-21 02:23:20,521][03784] Fps is (10 sec: 72089.4, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 609746944. Throughput: 0: 45413.1. Samples: 610820100. Policy #0 lag: (min: 1.0, avg: 45.9, max: 84.0) [2024-03-21 02:23:20,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 02:23:25,521][03784] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 47541.4). 
Total num frames: 609976320. Throughput: 0: 46242.2. Samples: 611131900. Policy #0 lag: (min: 1.0, avg: 45.9, max: 84.0) [2024-03-21 02:23:25,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 02:23:26,289][04017] Updated weights for policy 0, policy_version 18616 (0.0014) [2024-03-21 02:23:29,690][03995] Signal inference workers to stop experience collection... (12250 times) [2024-03-21 02:23:29,763][03995] Signal inference workers to resume experience collection... (12250 times) [2024-03-21 02:23:29,784][04017] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-03-21 02:23:29,834][04017] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-03-21 02:23:30,521][03784] Fps is (10 sec: 42599.0, 60 sec: 50244.3, 300 sec: 46874.9). Total num frames: 610172928. Throughput: 0: 46702.2. Samples: 611283900. Policy #0 lag: (min: 0.0, avg: 40.7, max: 80.0) [2024-03-21 02:23:30,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 02:23:34,715][04017] Updated weights for policy 0, policy_version 18626 (0.0015) [2024-03-21 02:23:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 49698.0, 300 sec: 46319.5). Total num frames: 610336768. Throughput: 0: 47851.0. Samples: 611594000. Policy #0 lag: (min: 0.0, avg: 40.7, max: 80.0) [2024-03-21 02:23:35,522][03784] Avg episode reward: [(0, '1.518')] [2024-03-21 02:23:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 49152.0, 300 sec: 45876.1). Total num frames: 610533376. Throughput: 0: 48633.3. Samples: 611882800. Policy #0 lag: (min: 0.0, avg: 40.7, max: 80.0) [2024-03-21 02:23:40,522][03784] Avg episode reward: [(0, '1.518')] [2024-03-21 02:23:42,830][04017] Updated weights for policy 0, policy_version 18636 (0.0011) [2024-03-21 02:23:45,521][03784] Fps is (10 sec: 42598.5, 60 sec: 49698.1, 300 sec: 46208.4). Total num frames: 610762752. Throughput: 0: 48071.0. Samples: 612007400. 
Policy #0 lag: (min: 1.0, avg: 46.7, max: 88.0) [2024-03-21 02:23:45,522][03784] Avg episode reward: [(0, '1.361')] [2024-03-21 02:23:50,521][03784] Fps is (10 sec: 42598.2, 60 sec: 50244.2, 300 sec: 46208.4). Total num frames: 610959360. Throughput: 0: 48893.3. Samples: 612324200. Policy #0 lag: (min: 1.0, avg: 46.7, max: 88.0) [2024-03-21 02:23:50,522][03784] Avg episode reward: [(0, '1.361')] [2024-03-21 02:23:50,655][04017] Updated weights for policy 0, policy_version 18646 (0.0017) [2024-03-21 02:23:55,521][03784] Fps is (10 sec: 36045.2, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 611123200. Throughput: 0: 48764.6. Samples: 612602600. Policy #0 lag: (min: 1.0, avg: 46.7, max: 88.0) [2024-03-21 02:23:55,521][03784] Avg episode reward: [(0, '0.475')] [2024-03-21 02:23:58,078][04017] Updated weights for policy 0, policy_version 18656 (0.0021) [2024-03-21 02:24:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 47513.5, 300 sec: 47097.1). Total num frames: 611450880. Throughput: 0: 48953.2. Samples: 612725000. Policy #0 lag: (min: 0.0, avg: 40.7, max: 91.0) [2024-03-21 02:24:00,522][03784] Avg episode reward: [(0, '1.129')] [2024-03-21 02:24:00,720][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018661_611483648.pth... [2024-03-21 02:24:00,836][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018317_600211456.pth [2024-03-21 02:24:02,742][04017] Updated weights for policy 0, policy_version 18666 (0.0011) [2024-03-21 02:24:05,521][03784] Fps is (10 sec: 65535.5, 60 sec: 49152.0, 300 sec: 47763.5). Total num frames: 611778560. Throughput: 0: 48500.1. Samples: 613002600. Policy #0 lag: (min: 0.0, avg: 40.7, max: 91.0) [2024-03-21 02:24:05,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 02:24:10,071][04017] Updated weights for policy 0, policy_version 18676 (0.0018) [2024-03-21 02:24:10,521][03784] Fps is (10 sec: 55706.1, 60 sec: 49698.2, 300 sec: 47763.5). 
Total num frames: 612007936. Throughput: 0: 48202.3. Samples: 613301000. Policy #0 lag: (min: 0.0, avg: 40.7, max: 91.0) [2024-03-21 02:24:10,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 02:24:15,521][03784] Fps is (10 sec: 32768.4, 60 sec: 43690.7, 300 sec: 46874.9). Total num frames: 612106240. Throughput: 0: 48137.9. Samples: 613450100. Policy #0 lag: (min: 2.0, avg: 49.0, max: 102.0) [2024-03-21 02:24:15,521][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 02:24:18,102][04017] Updated weights for policy 0, policy_version 18686 (0.0017) [2024-03-21 02:24:20,521][03784] Fps is (10 sec: 52428.2, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 612532224. Throughput: 0: 47135.5. Samples: 613715100. Policy #0 lag: (min: 2.0, avg: 49.0, max: 102.0) [2024-03-21 02:24:20,522][03784] Avg episode reward: [(0, '0.599')] [2024-03-21 02:24:23,558][04017] Updated weights for policy 0, policy_version 18696 (0.0021) [2024-03-21 02:24:25,521][03784] Fps is (10 sec: 52428.2, 60 sec: 44236.8, 300 sec: 46430.6). Total num frames: 612630528. Throughput: 0: 47151.1. Samples: 614004600. Policy #0 lag: (min: 2.0, avg: 49.0, max: 102.0) [2024-03-21 02:24:25,522][03784] Avg episode reward: [(0, '1.234')] [2024-03-21 02:24:25,818][03995] Signal inference workers to stop experience collection... (12300 times) [2024-03-21 02:24:25,823][03995] Signal inference workers to resume experience collection... (12300 times) [2024-03-21 02:24:25,889][04017] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-03-21 02:24:25,890][04017] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-03-21 02:24:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 612925440. Throughput: 0: 47368.9. Samples: 614139000. 
Policy #0 lag: (min: 0.0, avg: 43.8, max: 78.0) [2024-03-21 02:24:30,522][03784] Avg episode reward: [(0, '1.507')] [2024-03-21 02:24:30,577][04017] Updated weights for policy 0, policy_version 18706 (0.0010) [2024-03-21 02:24:35,521][03784] Fps is (10 sec: 58982.9, 60 sec: 48059.8, 300 sec: 46319.5). Total num frames: 613220352. Throughput: 0: 46704.6. Samples: 614425900. Policy #0 lag: (min: 0.0, avg: 43.8, max: 78.0) [2024-03-21 02:24:35,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 02:24:36,325][04017] Updated weights for policy 0, policy_version 18716 (0.0011) [2024-03-21 02:24:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 47513.5, 300 sec: 46208.4). Total num frames: 613384192. Throughput: 0: 46619.8. Samples: 614700500. Policy #0 lag: (min: 0.0, avg: 40.0, max: 88.0) [2024-03-21 02:24:40,522][03784] Avg episode reward: [(0, '1.406')] [2024-03-21 02:24:44,087][04017] Updated weights for policy 0, policy_version 18726 (0.0024) [2024-03-21 02:24:45,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 613679104. Throughput: 0: 46793.4. Samples: 614830700. Policy #0 lag: (min: 0.0, avg: 40.0, max: 88.0) [2024-03-21 02:24:45,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 02:24:49,410][04017] Updated weights for policy 0, policy_version 18736 (0.0012) [2024-03-21 02:24:50,521][03784] Fps is (10 sec: 58982.6, 60 sec: 50244.3, 300 sec: 47097.1). Total num frames: 613974016. Throughput: 0: 46606.6. Samples: 615099900. Policy #0 lag: (min: 0.0, avg: 40.0, max: 88.0) [2024-03-21 02:24:50,522][03784] Avg episode reward: [(0, '0.676')] [2024-03-21 02:24:55,521][03784] Fps is (10 sec: 45875.3, 60 sec: 50244.2, 300 sec: 47208.1). Total num frames: 614137856. Throughput: 0: 46097.8. Samples: 615375400. 
Policy #0 lag: (min: 0.0, avg: 32.9, max: 68.0) [2024-03-21 02:24:55,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 02:24:59,814][04017] Updated weights for policy 0, policy_version 18746 (0.0010) [2024-03-21 02:25:00,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 614268928. Throughput: 0: 46077.6. Samples: 615523600. Policy #0 lag: (min: 0.0, avg: 32.9, max: 68.0) [2024-03-21 02:25:00,522][03784] Avg episode reward: [(0, '1.223')] [2024-03-21 02:25:05,521][03784] Fps is (10 sec: 19660.7, 60 sec: 42598.4, 300 sec: 45764.1). Total num frames: 614334464. Throughput: 0: 46669.0. Samples: 615815200. Policy #0 lag: (min: 0.0, avg: 32.9, max: 68.0) [2024-03-21 02:25:05,522][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 02:25:10,521][03784] Fps is (10 sec: 22937.4, 60 sec: 41506.1, 300 sec: 45541.9). Total num frames: 614498304. Throughput: 0: 46782.1. Samples: 616109800. Policy #0 lag: (min: 1.0, avg: 36.8, max: 83.0) [2024-03-21 02:25:10,522][03784] Avg episode reward: [(0, '0.628')] [2024-03-21 02:25:12,532][04017] Updated weights for policy 0, policy_version 18756 (0.0015) [2024-03-21 02:25:15,521][03784] Fps is (10 sec: 52429.2, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 614858752. Throughput: 0: 46971.2. Samples: 616252700. Policy #0 lag: (min: 1.0, avg: 36.8, max: 83.0) [2024-03-21 02:25:15,522][03784] Avg episode reward: [(0, '1.322')] [2024-03-21 02:25:17,148][04017] Updated weights for policy 0, policy_version 18766 (0.0013) [2024-03-21 02:25:19,845][03995] Signal inference workers to stop experience collection... (12350 times) [2024-03-21 02:25:19,895][04017] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-03-21 02:25:19,915][03995] Signal inference workers to resume experience collection... 
(12350 times) [2024-03-21 02:25:19,939][04017] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-03-21 02:25:20,521][03784] Fps is (10 sec: 72090.4, 60 sec: 44783.0, 300 sec: 46763.8). Total num frames: 615219200. Throughput: 0: 46246.6. Samples: 616507000. Policy #0 lag: (min: 1.0, avg: 36.8, max: 83.0) [2024-03-21 02:25:20,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 02:25:21,132][04017] Updated weights for policy 0, policy_version 18776 (0.0026) [2024-03-21 02:25:25,521][03784] Fps is (10 sec: 62258.7, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 615481344. Throughput: 0: 46193.4. Samples: 616779200. Policy #0 lag: (min: 0.0, avg: 43.6, max: 90.0) [2024-03-21 02:25:25,522][03784] Avg episode reward: [(0, '0.933')] [2024-03-21 02:25:29,389][04017] Updated weights for policy 0, policy_version 18786 (0.0009) [2024-03-21 02:25:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 615677952. Throughput: 0: 46613.3. Samples: 616928300. Policy #0 lag: (min: 0.0, avg: 43.6, max: 90.0) [2024-03-21 02:25:30,522][03784] Avg episode reward: [(0, '0.571')] [2024-03-21 02:25:33,196][04017] Updated weights for policy 0, policy_version 18796 (0.0015) [2024-03-21 02:25:35,521][03784] Fps is (10 sec: 65536.1, 60 sec: 48605.8, 300 sec: 47097.1). Total num frames: 616136704. Throughput: 0: 46184.5. Samples: 617178200. Policy #0 lag: (min: 4.0, avg: 51.9, max: 99.0) [2024-03-21 02:25:35,522][03784] Avg episode reward: [(0, '0.454')] [2024-03-21 02:25:36,210][04017] Updated weights for policy 0, policy_version 18806 (0.0014) [2024-03-21 02:25:40,521][03784] Fps is (10 sec: 72089.4, 60 sec: 50244.3, 300 sec: 46874.9). Total num frames: 616398848. Throughput: 0: 45957.7. Samples: 617443500. 
Policy #0 lag: (min: 4.0, avg: 51.9, max: 99.0) [2024-03-21 02:25:40,522][03784] Avg episode reward: [(0, '1.028')] [2024-03-21 02:25:42,148][04017] Updated weights for policy 0, policy_version 18816 (0.0015) [2024-03-21 02:25:45,521][03784] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 47097.1). Total num frames: 616660992. Throughput: 0: 45613.3. Samples: 617576200. Policy #0 lag: (min: 4.0, avg: 51.9, max: 99.0) [2024-03-21 02:25:45,522][03784] Avg episode reward: [(0, '0.513')] [2024-03-21 02:25:50,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 616726528. Throughput: 0: 45908.9. Samples: 617881100. Policy #0 lag: (min: 0.0, avg: 52.4, max: 99.0) [2024-03-21 02:25:50,522][03784] Avg episode reward: [(0, '0.513')] [2024-03-21 02:25:55,521][03784] Fps is (10 sec: 13107.4, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 616792064. Throughput: 0: 46040.2. Samples: 618181600. Policy #0 lag: (min: 0.0, avg: 52.4, max: 99.0) [2024-03-21 02:25:55,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 02:25:58,368][04017] Updated weights for policy 0, policy_version 18826 (0.0011) [2024-03-21 02:26:00,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 616988672. Throughput: 0: 45882.2. Samples: 618317400. Policy #0 lag: (min: 0.0, avg: 52.4, max: 99.0) [2024-03-21 02:26:00,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 02:26:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018829_616988672.pth... [2024-03-21 02:26:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018500_606208000.pth [2024-03-21 02:26:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 617086976. Throughput: 0: 45833.4. Samples: 618569500. 
Policy #0 lag: (min: 0.0, avg: 33.8, max: 80.0) [2024-03-21 02:26:05,522][03784] Avg episode reward: [(0, '0.846')] [2024-03-21 02:26:10,170][04017] Updated weights for policy 0, policy_version 18836 (0.0011) [2024-03-21 02:26:10,521][03784] Fps is (10 sec: 22937.6, 60 sec: 45329.2, 300 sec: 45208.7). Total num frames: 617218048. Throughput: 0: 45733.4. Samples: 618837200. Policy #0 lag: (min: 0.0, avg: 33.8, max: 80.0) [2024-03-21 02:26:10,522][03784] Avg episode reward: [(0, '0.657')] [2024-03-21 02:26:15,521][03784] Fps is (10 sec: 42597.9, 60 sec: 44236.7, 300 sec: 45986.3). Total num frames: 617512960. Throughput: 0: 44966.6. Samples: 618951800. Policy #0 lag: (min: 0.0, avg: 33.8, max: 80.0) [2024-03-21 02:26:15,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 02:26:15,816][04017] Updated weights for policy 0, policy_version 18846 (0.0018) [2024-03-21 02:26:19,367][03995] Signal inference workers to stop experience collection... (12400 times) [2024-03-21 02:26:19,367][03995] Signal inference workers to resume experience collection... (12400 times) [2024-03-21 02:26:19,452][04017] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-03-21 02:26:19,452][04017] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-03-21 02:26:20,521][03784] Fps is (10 sec: 52428.9, 60 sec: 42052.3, 300 sec: 46319.5). Total num frames: 617742336. Throughput: 0: 45891.1. Samples: 619243300. Policy #0 lag: (min: 3.0, avg: 22.6, max: 56.0) [2024-03-21 02:26:20,522][03784] Avg episode reward: [(0, '0.718')] [2024-03-21 02:26:21,606][04017] Updated weights for policy 0, policy_version 18856 (0.0018) [2024-03-21 02:26:25,521][03784] Fps is (10 sec: 65536.7, 60 sec: 44783.0, 300 sec: 47541.4). Total num frames: 618168320. Throughput: 0: 45766.7. Samples: 619503000. 
Policy #0 lag: (min: 3.0, avg: 22.6, max: 56.0) [2024-03-21 02:26:25,522][03784] Avg episode reward: [(0, '0.561')] [2024-03-21 02:26:25,605][04017] Updated weights for policy 0, policy_version 18866 (0.0013) [2024-03-21 02:26:30,521][03784] Fps is (10 sec: 65536.0, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 618397696. Throughput: 0: 46146.8. Samples: 619652800. Policy #0 lag: (min: 2.0, avg: 52.9, max: 119.0) [2024-03-21 02:26:30,522][03784] Avg episode reward: [(0, '1.161')] [2024-03-21 02:26:34,181][04017] Updated weights for policy 0, policy_version 18876 (0.0015) [2024-03-21 02:26:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 46541.7). Total num frames: 618594304. Throughput: 0: 45966.7. Samples: 619949600. Policy #0 lag: (min: 2.0, avg: 52.9, max: 119.0) [2024-03-21 02:26:35,522][03784] Avg episode reward: [(0, '0.510')] [2024-03-21 02:26:37,829][04017] Updated weights for policy 0, policy_version 18886 (0.0013) [2024-03-21 02:26:40,521][03784] Fps is (10 sec: 68812.7, 60 sec: 44783.0, 300 sec: 47208.1). Total num frames: 619085824. Throughput: 0: 44473.3. Samples: 620182900. Policy #0 lag: (min: 2.0, avg: 52.9, max: 119.0) [2024-03-21 02:26:40,522][03784] Avg episode reward: [(0, '0.956')] [2024-03-21 02:26:42,507][04017] Updated weights for policy 0, policy_version 18896 (0.0024) [2024-03-21 02:26:45,521][03784] Fps is (10 sec: 72089.5, 60 sec: 44236.9, 300 sec: 47319.2). Total num frames: 619315200. Throughput: 0: 44817.8. Samples: 620334200. Policy #0 lag: (min: 1.0, avg: 48.7, max: 86.0) [2024-03-21 02:26:45,522][03784] Avg episode reward: [(0, '0.956')] [2024-03-21 02:26:50,226][04017] Updated weights for policy 0, policy_version 18906 (0.0010) [2024-03-21 02:26:50,521][03784] Fps is (10 sec: 42598.0, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 619511808. Throughput: 0: 46006.6. Samples: 620639800. 
Policy #0 lag: (min: 1.0, avg: 48.7, max: 86.0) [2024-03-21 02:26:50,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 02:26:55,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 619675648. Throughput: 0: 46935.5. Samples: 620949300. Policy #0 lag: (min: 1.0, avg: 48.7, max: 86.0) [2024-03-21 02:26:55,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 02:27:00,282][04017] Updated weights for policy 0, policy_version 18916 (0.0021) [2024-03-21 02:27:00,521][03784] Fps is (10 sec: 32768.5, 60 sec: 47513.7, 300 sec: 45875.2). Total num frames: 619839488. Throughput: 0: 47829.1. Samples: 621104100. Policy #0 lag: (min: 1.0, avg: 32.9, max: 71.0) [2024-03-21 02:27:00,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 02:27:01,114][03995] Signal inference workers to stop experience collection... (12450 times) [2024-03-21 02:27:01,175][03995] Signal inference workers to resume experience collection... (12450 times) [2024-03-21 02:27:01,212][04017] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-03-21 02:27:01,264][04017] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-03-21 02:27:05,405][04017] Updated weights for policy 0, policy_version 18926 (0.0011) [2024-03-21 02:27:05,521][03784] Fps is (10 sec: 49152.4, 60 sec: 51336.6, 300 sec: 46652.8). Total num frames: 620167168. Throughput: 0: 47660.0. Samples: 621388000. Policy #0 lag: (min: 1.0, avg: 32.9, max: 71.0) [2024-03-21 02:27:05,522][03784] Avg episode reward: [(0, '0.700')] [2024-03-21 02:27:10,521][03784] Fps is (10 sec: 62258.4, 60 sec: 54067.2, 300 sec: 47208.1). Total num frames: 620462080. Throughput: 0: 47726.6. Samples: 621650700. 
Policy #0 lag: (min: 0.0, avg: 34.6, max: 83.0) [2024-03-21 02:27:10,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 02:27:11,547][04017] Updated weights for policy 0, policy_version 18936 (0.0011) [2024-03-21 02:27:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 52975.0, 300 sec: 46874.9). Total num frames: 620691456. Throughput: 0: 47477.8. Samples: 621789300. Policy #0 lag: (min: 0.0, avg: 34.6, max: 83.0) [2024-03-21 02:27:15,522][03784] Avg episode reward: [(0, '1.108')] [2024-03-21 02:27:18,970][04017] Updated weights for policy 0, policy_version 18946 (0.0012) [2024-03-21 02:27:20,521][03784] Fps is (10 sec: 36045.1, 60 sec: 51336.5, 300 sec: 46986.0). Total num frames: 620822528. Throughput: 0: 46511.1. Samples: 622042600. Policy #0 lag: (min: 0.0, avg: 34.6, max: 83.0) [2024-03-21 02:27:20,522][03784] Avg episode reward: [(0, '1.026')] [2024-03-21 02:27:25,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 620953600. Throughput: 0: 47153.4. Samples: 622304800. Policy #0 lag: (min: 0.0, avg: 33.1, max: 84.0) [2024-03-21 02:27:25,522][03784] Avg episode reward: [(0, '0.518')] [2024-03-21 02:27:30,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 621084672. Throughput: 0: 46853.3. Samples: 622442600. Policy #0 lag: (min: 0.0, avg: 33.1, max: 84.0) [2024-03-21 02:27:30,522][03784] Avg episode reward: [(0, '0.476')] [2024-03-21 02:27:32,180][04017] Updated weights for policy 0, policy_version 18956 (0.0013) [2024-03-21 02:27:35,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 621248512. Throughput: 0: 45731.2. Samples: 622697700. Policy #0 lag: (min: 0.0, avg: 33.1, max: 84.0) [2024-03-21 02:27:35,522][03784] Avg episode reward: [(0, '0.728')] [2024-03-21 02:27:40,521][03784] Fps is (10 sec: 36044.6, 60 sec: 39321.6, 300 sec: 46319.5). Total num frames: 621445120. Throughput: 0: 45335.6. Samples: 622989400. 
Policy #0 lag: (min: 1.0, avg: 47.6, max: 110.0) [2024-03-21 02:27:40,522][03784] Avg episode reward: [(0, '1.078')] [2024-03-21 02:27:41,266][04017] Updated weights for policy 0, policy_version 18966 (0.0010) [2024-03-21 02:27:45,521][03784] Fps is (10 sec: 52429.3, 60 sec: 40960.0, 300 sec: 46874.9). Total num frames: 621772800. Throughput: 0: 45184.4. Samples: 623137400. Policy #0 lag: (min: 1.0, avg: 47.6, max: 110.0) [2024-03-21 02:27:45,522][03784] Avg episode reward: [(0, '1.078')] [2024-03-21 02:27:48,773][04017] Updated weights for policy 0, policy_version 18976 (0.0016) [2024-03-21 02:27:50,521][03784] Fps is (10 sec: 45875.2, 60 sec: 39867.8, 300 sec: 45875.2). Total num frames: 621903872. Throughput: 0: 45357.7. Samples: 623429100. Policy #0 lag: (min: 1.0, avg: 47.6, max: 110.0) [2024-03-21 02:27:50,522][03784] Avg episode reward: [(0, '0.658')] [2024-03-21 02:27:53,586][04017] Updated weights for policy 0, policy_version 18986 (0.0018) [2024-03-21 02:27:55,521][03784] Fps is (10 sec: 55704.5, 60 sec: 44236.7, 300 sec: 46541.6). Total num frames: 622329856. Throughput: 0: 45511.0. Samples: 623698700. Policy #0 lag: (min: 1.0, avg: 26.9, max: 63.0) [2024-03-21 02:27:55,522][03784] Avg episode reward: [(0, '1.541')] [2024-03-21 02:27:56,059][03995] Signal inference workers to stop experience collection... (12500 times) [2024-03-21 02:27:56,132][03995] Signal inference workers to resume experience collection... (12500 times) [2024-03-21 02:27:56,149][04017] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-03-21 02:27:56,179][04017] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-03-21 02:27:56,807][04017] Updated weights for policy 0, policy_version 18996 (0.0021) [2024-03-21 02:28:00,521][03784] Fps is (10 sec: 85195.9, 60 sec: 48605.7, 300 sec: 47208.1). Total num frames: 622755840. Throughput: 0: 45435.4. Samples: 623833900. 
Policy #0 lag: (min: 1.0, avg: 26.9, max: 63.0) [2024-03-21 02:28:00,522][03784] Avg episode reward: [(0, '1.541')] [2024-03-21 02:28:00,602][04017] Updated weights for policy 0, policy_version 19006 (0.0010) [2024-03-21 02:28:00,924][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019007_622821376.pth... [2024-03-21 02:28:01,046][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018661_611483648.pth [2024-03-21 02:28:05,521][03784] Fps is (10 sec: 75367.4, 60 sec: 48605.8, 300 sec: 47652.5). Total num frames: 623083520. Throughput: 0: 45573.3. Samples: 624093400. Policy #0 lag: (min: 1.0, avg: 26.9, max: 63.0) [2024-03-21 02:28:05,522][03784] Avg episode reward: [(0, '0.843')] [2024-03-21 02:28:06,180][04017] Updated weights for policy 0, policy_version 19016 (0.0019) [2024-03-21 02:28:10,521][03784] Fps is (10 sec: 65536.9, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 623411200. Throughput: 0: 45742.2. Samples: 624363200. Policy #0 lag: (min: 0.0, avg: 47.6, max: 80.0) [2024-03-21 02:28:10,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 02:28:11,502][04017] Updated weights for policy 0, policy_version 19026 (0.0020) [2024-03-21 02:28:15,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 623607808. Throughput: 0: 45757.7. Samples: 624501700. Policy #0 lag: (min: 0.0, avg: 47.6, max: 80.0) [2024-03-21 02:28:15,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-21 02:28:20,521][03784] Fps is (10 sec: 22937.6, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 623640576. Throughput: 0: 46622.2. Samples: 624795700. Policy #0 lag: (min: 0.0, avg: 47.6, max: 80.0) [2024-03-21 02:28:20,522][03784] Avg episode reward: [(0, '0.792')] [2024-03-21 02:28:25,521][03784] Fps is (10 sec: 13107.3, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 623738880. Throughput: 0: 46455.6. Samples: 625079900. 
Policy #0 lag: (min: 0.0, avg: 43.4, max: 84.0) [2024-03-21 02:28:25,522][03784] Avg episode reward: [(0, '0.771')] [2024-03-21 02:28:26,577][04017] Updated weights for policy 0, policy_version 19036 (0.0015) [2024-03-21 02:28:30,521][03784] Fps is (10 sec: 22937.6, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 623869952. Throughput: 0: 46346.6. Samples: 625223000. Policy #0 lag: (min: 0.0, avg: 43.4, max: 84.0) [2024-03-21 02:28:30,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 02:28:35,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 624001024. Throughput: 0: 46333.4. Samples: 625514100. Policy #0 lag: (min: 0.0, avg: 43.4, max: 84.0) [2024-03-21 02:28:35,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 02:28:40,521][03784] Fps is (10 sec: 19660.6, 60 sec: 43690.6, 300 sec: 45097.6). Total num frames: 624066560. Throughput: 0: 47000.0. Samples: 625813700. Policy #0 lag: (min: 0.0, avg: 24.5, max: 86.0) [2024-03-21 02:28:40,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 02:28:41,357][04017] Updated weights for policy 0, policy_version 19046 (0.0017) [2024-03-21 02:28:45,521][03784] Fps is (10 sec: 39320.7, 60 sec: 43690.5, 300 sec: 45541.9). Total num frames: 624394240. Throughput: 0: 46942.2. Samples: 625946300. Policy #0 lag: (min: 0.0, avg: 24.5, max: 86.0) [2024-03-21 02:28:45,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 02:28:45,725][04017] Updated weights for policy 0, policy_version 19056 (0.0015) [2024-03-21 02:28:49,809][04017] Updated weights for policy 0, policy_version 19066 (0.0015) [2024-03-21 02:28:50,078][03995] Signal inference workers to stop experience collection... (12550 times) [2024-03-21 02:28:50,134][03995] Signal inference workers to resume experience collection... 
(12550 times) [2024-03-21 02:28:50,152][04017] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-03-21 02:28:50,204][04017] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-03-21 02:28:50,521][03784] Fps is (10 sec: 75367.4, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 624820224. Throughput: 0: 47068.9. Samples: 626211500. Policy #0 lag: (min: 0.0, avg: 24.5, max: 86.0) [2024-03-21 02:28:50,522][03784] Avg episode reward: [(0, '1.074')] [2024-03-21 02:28:54,524][04017] Updated weights for policy 0, policy_version 19076 (0.0019) [2024-03-21 02:28:55,521][03784] Fps is (10 sec: 78643.8, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 625180672. Throughput: 0: 46877.7. Samples: 626472700. Policy #0 lag: (min: 3.0, avg: 43.4, max: 83.0) [2024-03-21 02:28:55,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 02:28:59,156][04017] Updated weights for policy 0, policy_version 19086 (0.0017) [2024-03-21 02:29:00,521][03784] Fps is (10 sec: 65535.1, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 625475584. Throughput: 0: 46893.3. Samples: 626611900. Policy #0 lag: (min: 3.0, avg: 43.4, max: 83.0) [2024-03-21 02:29:00,522][03784] Avg episode reward: [(0, '1.205')] [2024-03-21 02:29:03,212][04017] Updated weights for policy 0, policy_version 19096 (0.0018) [2024-03-21 02:29:05,521][03784] Fps is (10 sec: 75365.8, 60 sec: 47513.4, 300 sec: 47208.1). Total num frames: 625934336. Throughput: 0: 46317.6. Samples: 626880000. Policy #0 lag: (min: 3.0, avg: 43.4, max: 83.0) [2024-03-21 02:29:05,522][03784] Avg episode reward: [(0, '0.479')] [2024-03-21 02:29:10,339][04017] Updated weights for policy 0, policy_version 19106 (0.0015) [2024-03-21 02:29:10,521][03784] Fps is (10 sec: 58983.1, 60 sec: 44236.8, 300 sec: 47319.2). Total num frames: 626065408. Throughput: 0: 46968.9. Samples: 627193500. 
Policy #0 lag: (min: 0.0, avg: 47.3, max: 80.0) [2024-03-21 02:29:10,522][03784] Avg episode reward: [(0, '0.479')] [2024-03-21 02:29:15,521][03784] Fps is (10 sec: 39322.2, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 626327552. Throughput: 0: 46826.6. Samples: 627330200. Policy #0 lag: (min: 0.0, avg: 47.3, max: 80.0) [2024-03-21 02:29:15,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 02:29:16,796][04017] Updated weights for policy 0, policy_version 19116 (0.0011) [2024-03-21 02:29:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 626458624. Throughput: 0: 46755.5. Samples: 627618100. Policy #0 lag: (min: 0.0, avg: 47.3, max: 80.0) [2024-03-21 02:29:20,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 02:29:25,521][03784] Fps is (10 sec: 19660.8, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 626524160. Throughput: 0: 46617.8. Samples: 627911500. Policy #0 lag: (min: 0.0, avg: 47.3, max: 80.0) [2024-03-21 02:29:25,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 02:29:30,521][03784] Fps is (10 sec: 22937.7, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 626688000. Throughput: 0: 47020.2. Samples: 628062200. Policy #0 lag: (min: 0.0, avg: 46.3, max: 92.0) [2024-03-21 02:29:30,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 02:29:31,709][04017] Updated weights for policy 0, policy_version 19126 (0.0010) [2024-03-21 02:29:35,521][03784] Fps is (10 sec: 39321.9, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 626917376. Throughput: 0: 47495.5. Samples: 628348800. Policy #0 lag: (min: 0.0, avg: 46.3, max: 92.0) [2024-03-21 02:29:35,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 02:29:37,220][04017] Updated weights for policy 0, policy_version 19136 (0.0024) [2024-03-21 02:29:37,409][03995] Signal inference workers to stop experience collection... 
(12600 times) [2024-03-21 02:29:37,523][03995] Signal inference workers to resume experience collection... (12600 times) [2024-03-21 02:29:37,611][04017] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-03-21 02:29:37,611][04017] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-03-21 02:29:40,521][03784] Fps is (10 sec: 52428.3, 60 sec: 52428.8, 300 sec: 45875.2). Total num frames: 627212288. Throughput: 0: 47717.8. Samples: 628620000. Policy #0 lag: (min: 0.0, avg: 46.3, max: 92.0) [2024-03-21 02:29:40,522][03784] Avg episode reward: [(0, '0.872')] [2024-03-21 02:29:42,783][04017] Updated weights for policy 0, policy_version 19146 (0.0019) [2024-03-21 02:29:45,521][03784] Fps is (10 sec: 62259.1, 60 sec: 52429.0, 300 sec: 45986.3). Total num frames: 627539968. Throughput: 0: 47906.8. Samples: 628767700. Policy #0 lag: (min: 0.0, avg: 33.7, max: 79.0) [2024-03-21 02:29:45,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 02:29:50,401][04017] Updated weights for policy 0, policy_version 19156 (0.0015) [2024-03-21 02:29:50,526][03784] Fps is (10 sec: 49128.4, 60 sec: 48055.8, 300 sec: 45985.5). Total num frames: 627703808. Throughput: 0: 48728.3. Samples: 629073000. Policy #0 lag: (min: 0.0, avg: 33.7, max: 79.0) [2024-03-21 02:29:50,527][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 02:29:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.2, 300 sec: 46208.4). Total num frames: 627900416. Throughput: 0: 48113.3. Samples: 629358600. Policy #0 lag: (min: 4.0, avg: 33.8, max: 96.0) [2024-03-21 02:29:55,530][03784] Avg episode reward: [(0, '0.929')] [2024-03-21 02:29:58,387][04017] Updated weights for policy 0, policy_version 19166 (0.0012) [2024-03-21 02:30:00,521][03784] Fps is (10 sec: 52454.3, 60 sec: 45875.3, 300 sec: 47097.1). Total num frames: 628228096. Throughput: 0: 48611.1. Samples: 629517700. 
Policy #0 lag: (min: 4.0, avg: 33.8, max: 96.0) [2024-03-21 02:30:00,522][03784] Avg episode reward: [(0, '0.929')] [2024-03-21 02:30:00,559][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019173_628260864.pth... [2024-03-21 02:30:00,671][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000018829_616988672.pth [2024-03-21 02:30:02,302][04017] Updated weights for policy 0, policy_version 19176 (0.0015) [2024-03-21 02:30:05,522][03784] Fps is (10 sec: 68810.2, 60 sec: 44236.7, 300 sec: 47763.5). Total num frames: 628588544. Throughput: 0: 47926.3. Samples: 629774800. Policy #0 lag: (min: 4.0, avg: 33.8, max: 96.0) [2024-03-21 02:30:05,523][03784] Avg episode reward: [(0, '0.746')] [2024-03-21 02:30:06,880][04017] Updated weights for policy 0, policy_version 19186 (0.0012) [2024-03-21 02:30:10,521][03784] Fps is (10 sec: 68812.9, 60 sec: 47513.6, 300 sec: 47652.4). Total num frames: 628916224. Throughput: 0: 47808.9. Samples: 630062900. Policy #0 lag: (min: 1.0, avg: 45.7, max: 90.0) [2024-03-21 02:30:10,522][03784] Avg episode reward: [(0, '0.746')] [2024-03-21 02:30:13,530][04017] Updated weights for policy 0, policy_version 19196 (0.0015) [2024-03-21 02:30:15,521][03784] Fps is (10 sec: 55707.1, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 629145600. Throughput: 0: 47997.7. Samples: 630222100. Policy #0 lag: (min: 1.0, avg: 45.7, max: 90.0) [2024-03-21 02:30:15,522][03784] Avg episode reward: [(0, '1.016')] [2024-03-21 02:30:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 629309440. Throughput: 0: 47995.5. Samples: 630508600. Policy #0 lag: (min: 1.0, avg: 45.7, max: 90.0) [2024-03-21 02:30:20,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-21 02:30:20,896][04017] Updated weights for policy 0, policy_version 19206 (0.0017) [2024-03-21 02:30:25,521][03784] Fps is (10 sec: 49152.1, 60 sec: 51882.7, 300 sec: 47319.2). 
Total num frames: 629637120. Throughput: 0: 48368.9. Samples: 630796600. Policy #0 lag: (min: 1.0, avg: 38.6, max: 81.0) [2024-03-21 02:30:25,522][03784] Avg episode reward: [(0, '1.544')] [2024-03-21 02:30:27,066][04017] Updated weights for policy 0, policy_version 19216 (0.0015) [2024-03-21 02:30:29,441][03995] Signal inference workers to stop experience collection... (12650 times) [2024-03-21 02:30:29,487][04017] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-03-21 02:30:29,666][03995] Signal inference workers to resume experience collection... (12650 times) [2024-03-21 02:30:29,666][04017] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-03-21 02:30:30,521][03784] Fps is (10 sec: 49152.2, 60 sec: 51882.7, 300 sec: 46319.5). Total num frames: 629800960. Throughput: 0: 48331.1. Samples: 630942600. Policy #0 lag: (min: 1.0, avg: 38.6, max: 81.0) [2024-03-21 02:30:30,522][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 02:30:34,608][04017] Updated weights for policy 0, policy_version 19226 (0.0019) [2024-03-21 02:30:35,521][03784] Fps is (10 sec: 42598.3, 60 sec: 52428.7, 300 sec: 46319.5). Total num frames: 630063104. Throughput: 0: 48147.3. Samples: 631239400. Policy #0 lag: (min: 3.0, avg: 38.4, max: 76.0) [2024-03-21 02:30:35,522][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 02:30:40,521][03784] Fps is (10 sec: 45874.8, 60 sec: 50790.4, 300 sec: 46097.4). Total num frames: 630259712. Throughput: 0: 47424.4. Samples: 631492700. Policy #0 lag: (min: 3.0, avg: 38.4, max: 76.0) [2024-03-21 02:30:40,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 02:30:42,260][04017] Updated weights for policy 0, policy_version 19236 (0.0019) [2024-03-21 02:30:45,521][03784] Fps is (10 sec: 39321.9, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 630456320. Throughput: 0: 46620.0. Samples: 631615600. 
Policy #0 lag: (min: 3.0, avg: 38.4, max: 76.0) [2024-03-21 02:30:45,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 02:30:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48609.7, 300 sec: 46874.9). Total num frames: 630620160. Throughput: 0: 46206.9. Samples: 631854100. Policy #0 lag: (min: 3.0, avg: 38.4, max: 76.0) [2024-03-21 02:30:50,522][03784] Avg episode reward: [(0, '0.595')] [2024-03-21 02:30:50,690][04017] Updated weights for policy 0, policy_version 19246 (0.0011) [2024-03-21 02:30:55,521][03784] Fps is (10 sec: 26214.5, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 630718464. Throughput: 0: 45882.3. Samples: 632127600. Policy #0 lag: (min: 0.0, avg: 31.1, max: 77.0) [2024-03-21 02:30:55,522][03784] Avg episode reward: [(0, '0.975')] [2024-03-21 02:31:00,521][03784] Fps is (10 sec: 29491.4, 60 sec: 44782.9, 300 sec: 46874.9). Total num frames: 630915072. Throughput: 0: 45437.8. Samples: 632266800. Policy #0 lag: (min: 0.0, avg: 31.1, max: 77.0) [2024-03-21 02:31:00,522][03784] Avg episode reward: [(0, '0.573')] [2024-03-21 02:31:02,853][04017] Updated weights for policy 0, policy_version 19256 (0.0011) [2024-03-21 02:31:05,521][03784] Fps is (10 sec: 36045.0, 60 sec: 41506.4, 300 sec: 46986.0). Total num frames: 631078912. Throughput: 0: 45617.9. Samples: 632561400. Policy #0 lag: (min: 0.0, avg: 21.6, max: 50.0) [2024-03-21 02:31:05,522][03784] Avg episode reward: [(0, '1.462')] [2024-03-21 02:31:08,553][04017] Updated weights for policy 0, policy_version 19266 (0.0012) [2024-03-21 02:31:10,521][03784] Fps is (10 sec: 52428.7, 60 sec: 42052.2, 300 sec: 47208.1). Total num frames: 631439360. Throughput: 0: 45448.9. Samples: 632841800. Policy #0 lag: (min: 0.0, avg: 21.6, max: 50.0) [2024-03-21 02:31:10,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 02:31:14,849][04017] Updated weights for policy 0, policy_version 19276 (0.0015) [2024-03-21 02:31:15,521][03784] Fps is (10 sec: 55705.2, 60 sec: 41506.2, 300 sec: 47097.1). 
Total num frames: 631635968. Throughput: 0: 45477.8. Samples: 632989100. Policy #0 lag: (min: 0.0, avg: 21.6, max: 50.0) [2024-03-21 02:31:15,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 02:31:20,521][03784] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 46430.6). Total num frames: 631865344. Throughput: 0: 45228.9. Samples: 633274700. Policy #0 lag: (min: 0.0, avg: 37.8, max: 111.0) [2024-03-21 02:31:20,522][03784] Avg episode reward: [(0, '0.762')] [2024-03-21 02:31:21,436][04017] Updated weights for policy 0, policy_version 19286 (0.0015) [2024-03-21 02:31:25,521][03784] Fps is (10 sec: 55705.0, 60 sec: 42598.4, 300 sec: 46763.8). Total num frames: 632193024. Throughput: 0: 45555.5. Samples: 633542700. Policy #0 lag: (min: 0.0, avg: 37.8, max: 111.0) [2024-03-21 02:31:25,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 02:31:26,146][03995] Signal inference workers to stop experience collection... (12700 times) [2024-03-21 02:31:26,188][04017] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-03-21 02:31:26,437][03995] Signal inference workers to resume experience collection... (12700 times) [2024-03-21 02:31:26,437][04017] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-03-21 02:31:26,445][04017] Updated weights for policy 0, policy_version 19296 (0.0017) [2024-03-21 02:31:30,521][03784] Fps is (10 sec: 68812.3, 60 sec: 45875.1, 300 sec: 47319.2). Total num frames: 632553472. Throughput: 0: 45593.2. Samples: 633667300. Policy #0 lag: (min: 0.0, avg: 37.8, max: 111.0) [2024-03-21 02:31:30,522][03784] Avg episode reward: [(0, '1.314')] [2024-03-21 02:31:31,030][04017] Updated weights for policy 0, policy_version 19306 (0.0016) [2024-03-21 02:31:35,521][03784] Fps is (10 sec: 68812.8, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 632881152. Throughput: 0: 46557.8. Samples: 633949200. 
Policy #0 lag: (min: 0.0, avg: 42.6, max: 84.0) [2024-03-21 02:31:35,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 02:31:36,029][04017] Updated weights for policy 0, policy_version 19316 (0.0011) [2024-03-21 02:31:40,521][03784] Fps is (10 sec: 68813.7, 60 sec: 49698.2, 300 sec: 47208.1). Total num frames: 633241600. Throughput: 0: 46240.0. Samples: 634208400. Policy #0 lag: (min: 0.0, avg: 42.6, max: 84.0) [2024-03-21 02:31:40,522][03784] Avg episode reward: [(0, '0.706')] [2024-03-21 02:31:42,428][04017] Updated weights for policy 0, policy_version 19326 (0.0016) [2024-03-21 02:31:45,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 633339904. Throughput: 0: 45982.3. Samples: 634336000. Policy #0 lag: (min: 0.0, avg: 42.6, max: 84.0) [2024-03-21 02:31:45,522][03784] Avg episode reward: [(0, '1.152')] [2024-03-21 02:31:50,521][03784] Fps is (10 sec: 19660.8, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 633438208. Throughput: 0: 46111.0. Samples: 634636400. Policy #0 lag: (min: 0.0, avg: 34.4, max: 93.0) [2024-03-21 02:31:50,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 02:31:55,521][03784] Fps is (10 sec: 22937.5, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 633569280. Throughput: 0: 45684.5. Samples: 634897600. Policy #0 lag: (min: 0.0, avg: 34.4, max: 93.0) [2024-03-21 02:31:55,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-21 02:31:55,831][04017] Updated weights for policy 0, policy_version 19336 (0.0015) [2024-03-21 02:32:00,521][03784] Fps is (10 sec: 26214.2, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 633700352. Throughput: 0: 45368.8. Samples: 635030700. Policy #0 lag: (min: 0.0, avg: 34.4, max: 93.0) [2024-03-21 02:32:00,522][03784] Avg episode reward: [(0, '0.872')] [2024-03-21 02:32:00,833][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019340_633733120.pth... 
[2024-03-21 02:32:00,945][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019007_622821376.pth [2024-03-21 02:32:05,521][03784] Fps is (10 sec: 32768.2, 60 sec: 46967.4, 300 sec: 45542.0). Total num frames: 633896960. Throughput: 0: 44533.4. Samples: 635278700. Policy #0 lag: (min: 0.0, avg: 35.4, max: 81.0) [2024-03-21 02:32:05,522][03784] Avg episode reward: [(0, '1.459')] [2024-03-21 02:32:06,417][04017] Updated weights for policy 0, policy_version 19346 (0.0020) [2024-03-21 02:32:10,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 634060800. Throughput: 0: 44229.0. Samples: 635533000. Policy #0 lag: (min: 0.0, avg: 35.4, max: 81.0) [2024-03-21 02:32:10,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 02:32:13,348][04017] Updated weights for policy 0, policy_version 19356 (0.0018) [2024-03-21 02:32:15,521][03784] Fps is (10 sec: 39321.3, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 634290176. Throughput: 0: 44237.9. Samples: 635658000. Policy #0 lag: (min: 0.0, avg: 26.4, max: 53.0) [2024-03-21 02:32:15,522][03784] Avg episode reward: [(0, '0.982')] [2024-03-21 02:32:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 45764.1). Total num frames: 634454016. Throughput: 0: 44824.5. Samples: 635966300. Policy #0 lag: (min: 0.0, avg: 26.4, max: 53.0) [2024-03-21 02:32:20,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 02:32:21,600][04017] Updated weights for policy 0, policy_version 19366 (0.0015) [2024-03-21 02:32:25,521][03784] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 46430.6). Total num frames: 634781696. Throughput: 0: 45528.9. Samples: 636257200. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 53.0) [2024-03-21 02:32:25,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 02:32:27,519][04017] Updated weights for policy 0, policy_version 19376 (0.0023) [2024-03-21 02:32:28,507][03995] Signal inference workers to stop experience collection... (12750 times) [2024-03-21 02:32:28,508][03995] Signal inference workers to resume experience collection... (12750 times) [2024-03-21 02:32:28,573][04017] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-03-21 02:32:28,573][04017] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-03-21 02:32:30,521][03784] Fps is (10 sec: 62258.5, 60 sec: 42052.3, 300 sec: 46874.9). Total num frames: 635076608. Throughput: 0: 45744.3. Samples: 636394500. Policy #0 lag: (min: 1.0, avg: 32.4, max: 52.0) [2024-03-21 02:32:30,523][03784] Avg episode reward: [(0, '1.511')] [2024-03-21 02:32:32,527][04017] Updated weights for policy 0, policy_version 19386 (0.0012) [2024-03-21 02:32:35,521][03784] Fps is (10 sec: 68813.0, 60 sec: 43144.6, 300 sec: 47541.4). Total num frames: 635469824. Throughput: 0: 45457.8. Samples: 636682000. Policy #0 lag: (min: 1.0, avg: 32.4, max: 52.0) [2024-03-21 02:32:35,522][03784] Avg episode reward: [(0, '1.511')] [2024-03-21 02:32:38,215][04017] Updated weights for policy 0, policy_version 19396 (0.0011) [2024-03-21 02:32:40,521][03784] Fps is (10 sec: 72090.2, 60 sec: 42598.4, 300 sec: 47541.4). Total num frames: 635797504. Throughput: 0: 45646.6. Samples: 636951700. Policy #0 lag: (min: 1.0, avg: 32.4, max: 52.0) [2024-03-21 02:32:40,522][03784] Avg episode reward: [(0, '1.511')] [2024-03-21 02:32:41,797][04017] Updated weights for policy 0, policy_version 19406 (0.0016) [2024-03-21 02:32:45,521][03784] Fps is (10 sec: 65535.2, 60 sec: 46421.3, 300 sec: 48207.8). Total num frames: 636125184. Throughput: 0: 45671.1. Samples: 637085900. 
Policy #0 lag: (min: 0.0, avg: 53.8, max: 104.0) [2024-03-21 02:32:45,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 02:32:47,787][04017] Updated weights for policy 0, policy_version 19416 (0.0020) [2024-03-21 02:32:50,521][03784] Fps is (10 sec: 58981.6, 60 sec: 49151.9, 300 sec: 47652.4). Total num frames: 636387328. Throughput: 0: 46573.1. Samples: 637374500. Policy #0 lag: (min: 0.0, avg: 53.8, max: 104.0) [2024-03-21 02:32:50,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 02:32:53,444][04017] Updated weights for policy 0, policy_version 19426 (0.0010) [2024-03-21 02:32:55,521][03784] Fps is (10 sec: 42598.9, 60 sec: 49698.2, 300 sec: 46763.9). Total num frames: 636551168. Throughput: 0: 47395.6. Samples: 637665800. Policy #0 lag: (min: 0.0, avg: 53.8, max: 104.0) [2024-03-21 02:32:55,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 02:33:00,521][03784] Fps is (10 sec: 22938.0, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 636616704. Throughput: 0: 47742.2. Samples: 637806400. Policy #0 lag: (min: 0.0, avg: 44.0, max: 88.0) [2024-03-21 02:33:00,522][03784] Avg episode reward: [(0, '0.711')] [2024-03-21 02:33:05,521][03784] Fps is (10 sec: 26214.1, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 636813312. Throughput: 0: 47779.9. Samples: 638116400. Policy #0 lag: (min: 0.0, avg: 44.0, max: 88.0) [2024-03-21 02:33:05,523][03784] Avg episode reward: [(0, '0.829')] [2024-03-21 02:33:06,918][04017] Updated weights for policy 0, policy_version 19436 (0.0014) [2024-03-21 02:33:10,521][03784] Fps is (10 sec: 45875.2, 60 sec: 50244.3, 300 sec: 45653.1). Total num frames: 637075456. Throughput: 0: 47502.2. Samples: 638394800. Policy #0 lag: (min: 0.0, avg: 44.0, max: 88.0) [2024-03-21 02:33:10,522][03784] Avg episode reward: [(0, '1.077')] [2024-03-21 02:33:14,816][04017] Updated weights for policy 0, policy_version 19446 (0.0011) [2024-03-21 02:33:15,521][03784] Fps is (10 sec: 39321.7, 60 sec: 48605.8, 300 sec: 45986.3). 
Total num frames: 637206528. Throughput: 0: 47884.5. Samples: 638549300. Policy #0 lag: (min: 0.0, avg: 32.0, max: 79.0) [2024-03-21 02:33:15,522][03784] Avg episode reward: [(0, '0.863')] [2024-03-21 02:33:20,521][03784] Fps is (10 sec: 36044.7, 60 sec: 49698.1, 300 sec: 46430.6). Total num frames: 637435904. Throughput: 0: 47582.2. Samples: 638823200. Policy #0 lag: (min: 0.0, avg: 32.0, max: 79.0) [2024-03-21 02:33:20,522][03784] Avg episode reward: [(0, '1.308')] [2024-03-21 02:33:23,180][04017] Updated weights for policy 0, policy_version 19456 (0.0017) [2024-03-21 02:33:25,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 637566976. Throughput: 0: 48468.9. Samples: 639132800. Policy #0 lag: (min: 0.0, avg: 32.0, max: 79.0) [2024-03-21 02:33:25,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 02:33:28,418][03995] Signal inference workers to stop experience collection... (12800 times) [2024-03-21 02:33:28,419][03995] Signal inference workers to resume experience collection... (12800 times) [2024-03-21 02:33:28,488][04017] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-03-21 02:33:28,489][04017] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-03-21 02:33:29,394][04017] Updated weights for policy 0, policy_version 19466 (0.0015) [2024-03-21 02:33:30,521][03784] Fps is (10 sec: 45874.6, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 637894656. Throughput: 0: 48591.0. Samples: 639272500. Policy #0 lag: (min: 0.0, avg: 35.7, max: 87.0) [2024-03-21 02:33:30,522][03784] Avg episode reward: [(0, '1.212')] [2024-03-21 02:33:34,657][04017] Updated weights for policy 0, policy_version 19476 (0.0011) [2024-03-21 02:33:35,521][03784] Fps is (10 sec: 68812.3, 60 sec: 46421.2, 300 sec: 48096.8). Total num frames: 638255104. Throughput: 0: 48577.8. Samples: 639560500. 
Policy #0 lag: (min: 0.0, avg: 35.7, max: 87.0) [2024-03-21 02:33:35,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 02:33:37,794][04017] Updated weights for policy 0, policy_version 19486 (0.0013) [2024-03-21 02:33:40,521][03784] Fps is (10 sec: 75367.5, 60 sec: 47513.6, 300 sec: 48318.9). Total num frames: 638648320. Throughput: 0: 48120.0. Samples: 639831200. Policy #0 lag: (min: 0.0, avg: 35.7, max: 87.0) [2024-03-21 02:33:40,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 02:33:45,296][04017] Updated weights for policy 0, policy_version 19496 (0.0011) [2024-03-21 02:33:45,521][03784] Fps is (10 sec: 58983.3, 60 sec: 45329.2, 300 sec: 47541.4). Total num frames: 638844928. Throughput: 0: 48406.7. Samples: 639984700. Policy #0 lag: (min: 1.0, avg: 41.7, max: 70.0) [2024-03-21 02:33:45,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 02:33:50,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45329.2, 300 sec: 47208.2). Total num frames: 639107072. Throughput: 0: 48017.9. Samples: 640277200. Policy #0 lag: (min: 1.0, avg: 41.7, max: 70.0) [2024-03-21 02:33:50,522][03784] Avg episode reward: [(0, '0.575')] [2024-03-21 02:33:52,549][04017] Updated weights for policy 0, policy_version 19506 (0.0017) [2024-03-21 02:33:55,521][03784] Fps is (10 sec: 52428.4, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 639369216. Throughput: 0: 48393.3. Samples: 640572500. Policy #0 lag: (min: 1.0, avg: 41.7, max: 70.0) [2024-03-21 02:33:55,522][03784] Avg episode reward: [(0, '0.801')] [2024-03-21 02:33:57,419][04017] Updated weights for policy 0, policy_version 19516 (0.0011) [2024-03-21 02:34:00,521][03784] Fps is (10 sec: 52428.5, 60 sec: 50244.2, 300 sec: 46430.6). Total num frames: 639631360. Throughput: 0: 48117.8. Samples: 640714600. 
Policy #0 lag: (min: 0.0, avg: 53.0, max: 128.0) [2024-03-21 02:34:00,522][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 02:34:00,826][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019521_639664128.pth... [2024-03-21 02:34:00,951][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019173_628260864.pth [2024-03-21 02:34:03,988][04017] Updated weights for policy 0, policy_version 19526 (0.0020) [2024-03-21 02:34:05,521][03784] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 46874.9). Total num frames: 639893504. Throughput: 0: 48066.6. Samples: 640986200. Policy #0 lag: (min: 0.0, avg: 53.0, max: 128.0) [2024-03-21 02:34:05,522][03784] Avg episode reward: [(0, '0.437')] [2024-03-21 02:34:10,521][03784] Fps is (10 sec: 39321.9, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 640024576. Throughput: 0: 46969.0. Samples: 641246400. Policy #0 lag: (min: 0.0, avg: 53.0, max: 128.0) [2024-03-21 02:34:10,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 02:34:15,521][03784] Fps is (10 sec: 19661.1, 60 sec: 48059.8, 300 sec: 46208.4). Total num frames: 640090112. Throughput: 0: 47246.9. Samples: 641398600. Policy #0 lag: (min: 0.0, avg: 40.9, max: 86.0) [2024-03-21 02:34:15,522][03784] Avg episode reward: [(0, '0.562')] [2024-03-21 02:34:16,769][03995] Signal inference workers to stop experience collection... (12850 times) [2024-03-21 02:34:16,828][04017] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-03-21 02:34:17,052][03995] Signal inference workers to resume experience collection... (12850 times) [2024-03-21 02:34:17,052][04017] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-03-21 02:34:17,054][04017] Updated weights for policy 0, policy_version 19536 (0.0010) [2024-03-21 02:34:20,521][03784] Fps is (10 sec: 32767.9, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 640352256. Throughput: 0: 46773.4. Samples: 641665300. 
Policy #0 lag: (min: 0.0, avg: 40.9, max: 86.0) [2024-03-21 02:34:20,522][03784] Avg episode reward: [(0, '0.770')] [2024-03-21 02:34:22,976][04017] Updated weights for policy 0, policy_version 19546 (0.0015) [2024-03-21 02:34:25,521][03784] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 47208.1). Total num frames: 640614400. Throughput: 0: 47244.4. Samples: 641957200. Policy #0 lag: (min: 0.0, avg: 40.9, max: 86.0) [2024-03-21 02:34:25,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 02:34:30,136][04017] Updated weights for policy 0, policy_version 19556 (0.0011) [2024-03-21 02:34:30,521][03784] Fps is (10 sec: 49151.8, 60 sec: 49152.1, 300 sec: 47208.1). Total num frames: 640843776. Throughput: 0: 46877.7. Samples: 642094200. Policy #0 lag: (min: 0.0, avg: 39.8, max: 98.0) [2024-03-21 02:34:30,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 02:34:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 640942080. Throughput: 0: 47088.8. Samples: 642396200. Policy #0 lag: (min: 0.0, avg: 39.8, max: 98.0) [2024-03-21 02:34:35,522][03784] Avg episode reward: [(0, '0.613')] [2024-03-21 02:34:37,288][04017] Updated weights for policy 0, policy_version 19566 (0.0015) [2024-03-21 02:34:40,521][03784] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 641269760. Throughput: 0: 46373.4. Samples: 642659300. Policy #0 lag: (min: 0.0, avg: 39.8, max: 98.0) [2024-03-21 02:34:40,522][03784] Avg episode reward: [(0, '0.697')] [2024-03-21 02:34:45,521][03784] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 46431.4). Total num frames: 641400832. Throughput: 0: 46764.5. Samples: 642819000. 
Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 02:34:45,522][03784] Avg episode reward: [(0, '0.449')] [2024-03-21 02:34:46,225][04017] Updated weights for policy 0, policy_version 19576 (0.0010) [2024-03-21 02:34:49,565][04017] Updated weights for policy 0, policy_version 19586 (0.0011) [2024-03-21 02:34:50,521][03784] Fps is (10 sec: 58981.7, 60 sec: 45875.1, 300 sec: 47319.2). Total num frames: 641859584. Throughput: 0: 46488.9. Samples: 643078200. Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 02:34:50,522][03784] Avg episode reward: [(0, '0.906')] [2024-03-21 02:34:54,341][04017] Updated weights for policy 0, policy_version 19596 (0.0014) [2024-03-21 02:34:55,521][03784] Fps is (10 sec: 81919.7, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 642220032. Throughput: 0: 47193.3. Samples: 643370100. Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 02:34:55,522][03784] Avg episode reward: [(0, '0.906')] [2024-03-21 02:34:59,385][04017] Updated weights for policy 0, policy_version 19606 (0.0015) [2024-03-21 02:35:00,521][03784] Fps is (10 sec: 58983.0, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 642449408. Throughput: 0: 46995.5. Samples: 643513400. Policy #0 lag: (min: 0.0, avg: 54.1, max: 129.0) [2024-03-21 02:35:00,522][03784] Avg episode reward: [(0, '0.906')] [2024-03-21 02:35:03,489][03995] Signal inference workers to stop experience collection... (12900 times) [2024-03-21 02:35:03,490][03995] Signal inference workers to resume experience collection... (12900 times) [2024-03-21 02:35:03,567][04017] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-03-21 02:35:03,568][04017] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-03-21 02:35:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 46319.5). Total num frames: 642580480. Throughput: 0: 47195.6. Samples: 643789100. 
Policy #0 lag: (min: 0.0, avg: 54.1, max: 129.0) [2024-03-21 02:35:05,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 02:35:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45329.0, 300 sec: 46097.4). Total num frames: 642744320. Throughput: 0: 46822.3. Samples: 644064200. Policy #0 lag: (min: 0.0, avg: 54.1, max: 129.0) [2024-03-21 02:35:10,522][03784] Avg episode reward: [(0, '1.630')] [2024-03-21 02:35:10,604][04017] Updated weights for policy 0, policy_version 19616 (0.0015) [2024-03-21 02:35:15,521][03784] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 46652.8). Total num frames: 643072000. Throughput: 0: 46515.7. Samples: 644187400. Policy #0 lag: (min: 0.0, avg: 36.1, max: 71.0) [2024-03-21 02:35:15,522][03784] Avg episode reward: [(0, '0.489')] [2024-03-21 02:35:16,652][04017] Updated weights for policy 0, policy_version 19626 (0.0011) [2024-03-21 02:35:20,521][03784] Fps is (10 sec: 55705.4, 60 sec: 49152.0, 300 sec: 46319.5). Total num frames: 643301376. Throughput: 0: 45100.0. Samples: 644425700. Policy #0 lag: (min: 0.0, avg: 36.1, max: 71.0) [2024-03-21 02:35:20,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 02:35:24,401][04017] Updated weights for policy 0, policy_version 19636 (0.0012) [2024-03-21 02:35:25,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 643530752. Throughput: 0: 44404.4. Samples: 644657500. Policy #0 lag: (min: 2.0, avg: 32.1, max: 70.0) [2024-03-21 02:35:25,522][03784] Avg episode reward: [(0, '0.820')] [2024-03-21 02:35:30,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 643661824. Throughput: 0: 43522.2. Samples: 644777500. Policy #0 lag: (min: 2.0, avg: 32.1, max: 70.0) [2024-03-21 02:35:30,522][03784] Avg episode reward: [(0, '0.660')] [2024-03-21 02:35:34,672][04017] Updated weights for policy 0, policy_version 19646 (0.0012) [2024-03-21 02:35:35,521][03784] Fps is (10 sec: 26214.0, 60 sec: 47513.5, 300 sec: 45875.2). 
Total num frames: 643792896. Throughput: 0: 43631.1. Samples: 645041600. Policy #0 lag: (min: 2.0, avg: 32.1, max: 70.0) [2024-03-21 02:35:35,522][03784] Avg episode reward: [(0, '0.793')] [2024-03-21 02:35:40,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 643923968. Throughput: 0: 43966.6. Samples: 645348600. Policy #0 lag: (min: 0.0, avg: 34.0, max: 87.0) [2024-03-21 02:35:40,522][03784] Avg episode reward: [(0, '1.478')] [2024-03-21 02:35:43,658][04017] Updated weights for policy 0, policy_version 19656 (0.0014) [2024-03-21 02:35:45,521][03784] Fps is (10 sec: 29491.6, 60 sec: 44782.9, 300 sec: 45653.1). Total num frames: 644087808. Throughput: 0: 44142.2. Samples: 645499800. Policy #0 lag: (min: 0.0, avg: 34.0, max: 87.0) [2024-03-21 02:35:45,522][03784] Avg episode reward: [(0, '1.009')] [2024-03-21 02:35:50,521][03784] Fps is (10 sec: 29491.4, 60 sec: 39321.7, 300 sec: 45764.1). Total num frames: 644218880. Throughput: 0: 44864.5. Samples: 645808000. Policy #0 lag: (min: 0.0, avg: 34.0, max: 87.0) [2024-03-21 02:35:50,522][03784] Avg episode reward: [(0, '1.009')] [2024-03-21 02:35:53,505][04017] Updated weights for policy 0, policy_version 19666 (0.0024) [2024-03-21 02:35:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 39867.7, 300 sec: 46430.6). Total num frames: 644612096. Throughput: 0: 44491.1. Samples: 646066300. Policy #0 lag: (min: 4.0, avg: 22.9, max: 55.0) [2024-03-21 02:35:55,522][03784] Avg episode reward: [(0, '1.056')] [2024-03-21 02:35:56,650][04017] Updated weights for policy 0, policy_version 19676 (0.0016) [2024-03-21 02:36:00,521][03784] Fps is (10 sec: 68812.6, 60 sec: 40960.0, 300 sec: 46874.9). Total num frames: 644907008. Throughput: 0: 44613.3. Samples: 646195000. 
Policy #0 lag: (min: 4.0, avg: 22.9, max: 55.0) [2024-03-21 02:36:00,522][03784] Avg episode reward: [(0, '1.056')] [2024-03-21 02:36:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019681_644907008.pth... [2024-03-21 02:36:00,617][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019340_633733120.pth [2024-03-21 02:36:04,901][03995] Signal inference workers to stop experience collection... (12950 times) [2024-03-21 02:36:04,979][03995] Signal inference workers to resume experience collection... (12950 times) [2024-03-21 02:36:04,982][04017] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-03-21 02:36:05,056][04017] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-03-21 02:36:05,297][04017] Updated weights for policy 0, policy_version 19686 (0.0010) [2024-03-21 02:36:05,521][03784] Fps is (10 sec: 45875.5, 60 sec: 41506.2, 300 sec: 46208.4). Total num frames: 645070848. Throughput: 0: 46362.3. Samples: 646512000. Policy #0 lag: (min: 4.0, avg: 22.9, max: 55.0) [2024-03-21 02:36:05,522][03784] Avg episode reward: [(0, '1.056')] [2024-03-21 02:36:08,560][04017] Updated weights for policy 0, policy_version 19696 (0.0022) [2024-03-21 02:36:10,521][03784] Fps is (10 sec: 65535.3, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 645562368. Throughput: 0: 46191.0. Samples: 646736100. Policy #0 lag: (min: 4.0, avg: 31.9, max: 74.0) [2024-03-21 02:36:10,522][03784] Avg episode reward: [(0, '0.453')] [2024-03-21 02:36:12,625][04017] Updated weights for policy 0, policy_version 19706 (0.0010) [2024-03-21 02:36:15,521][03784] Fps is (10 sec: 85195.3, 60 sec: 47513.4, 300 sec: 47652.4). Total num frames: 645922816. Throughput: 0: 46144.3. Samples: 646854000. 
Policy #0 lag: (min: 4.0, avg: 31.9, max: 74.0) [2024-03-21 02:36:15,522][03784] Avg episode reward: [(0, '0.547')] [2024-03-21 02:36:17,392][04017] Updated weights for policy 0, policy_version 19716 (0.0012) [2024-03-21 02:36:20,521][03784] Fps is (10 sec: 68813.0, 60 sec: 49152.0, 300 sec: 47652.5). Total num frames: 646250496. Throughput: 0: 46669.0. Samples: 647141700. Policy #0 lag: (min: 4.0, avg: 31.9, max: 74.0) [2024-03-21 02:36:20,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 02:36:25,521][03784] Fps is (10 sec: 42599.1, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 646348800. Throughput: 0: 46482.3. Samples: 647440300. Policy #0 lag: (min: 5.0, avg: 60.6, max: 101.0) [2024-03-21 02:36:25,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 02:36:29,879][04017] Updated weights for policy 0, policy_version 19726 (0.0011) [2024-03-21 02:36:30,521][03784] Fps is (10 sec: 16384.1, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 646414336. Throughput: 0: 46575.6. Samples: 647595700. Policy #0 lag: (min: 5.0, avg: 60.6, max: 101.0) [2024-03-21 02:36:30,522][03784] Avg episode reward: [(0, '1.091')] [2024-03-21 02:36:35,521][03784] Fps is (10 sec: 26213.9, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 646610944. Throughput: 0: 46048.7. Samples: 647880200. Policy #0 lag: (min: 0.0, avg: 26.1, max: 81.0) [2024-03-21 02:36:35,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 02:36:36,812][04017] Updated weights for policy 0, policy_version 19736 (0.0021) [2024-03-21 02:36:40,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 646840320. Throughput: 0: 46171.2. Samples: 648144000. Policy #0 lag: (min: 0.0, avg: 26.1, max: 81.0) [2024-03-21 02:36:40,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 02:36:45,521][03784] Fps is (10 sec: 36045.5, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 646971392. Throughput: 0: 46193.4. Samples: 648273700. 
Policy #0 lag: (min: 0.0, avg: 26.1, max: 81.0) [2024-03-21 02:36:45,522][03784] Avg episode reward: [(0, '1.079')] [2024-03-21 02:36:45,968][04017] Updated weights for policy 0, policy_version 19746 (0.0015) [2024-03-21 02:36:50,525][03784] Fps is (10 sec: 36030.3, 60 sec: 49694.8, 300 sec: 46207.8). Total num frames: 647200768. Throughput: 0: 45149.3. Samples: 648543900. Policy #0 lag: (min: 0.0, avg: 39.1, max: 88.0) [2024-03-21 02:36:50,526][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 02:36:50,863][03995] Signal inference workers to stop experience collection... (13000 times) [2024-03-21 02:36:50,945][04017] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-03-21 02:36:50,989][03995] Signal inference workers to resume experience collection... (13000 times) [2024-03-21 02:36:50,993][04017] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-03-21 02:36:52,137][04017] Updated weights for policy 0, policy_version 19756 (0.0011) [2024-03-21 02:36:55,521][03784] Fps is (10 sec: 55705.0, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 647528448. Throughput: 0: 46253.4. Samples: 648817500. Policy #0 lag: (min: 0.0, avg: 39.1, max: 88.0) [2024-03-21 02:36:55,522][03784] Avg episode reward: [(0, '1.397')] [2024-03-21 02:36:57,884][04017] Updated weights for policy 0, policy_version 19766 (0.0020) [2024-03-21 02:37:00,521][03784] Fps is (10 sec: 55727.9, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 647757824. Throughput: 0: 47011.3. Samples: 648969500. Policy #0 lag: (min: 0.0, avg: 39.1, max: 88.0) [2024-03-21 02:37:00,522][03784] Avg episode reward: [(0, '1.093')] [2024-03-21 02:37:05,351][04017] Updated weights for policy 0, policy_version 19776 (0.0012) [2024-03-21 02:37:05,521][03784] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 648019968. Throughput: 0: 46937.8. Samples: 649253900. 
Policy #0 lag: (min: 5.0, avg: 38.9, max: 72.0) [2024-03-21 02:37:05,522][03784] Avg episode reward: [(0, '1.308')] [2024-03-21 02:37:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 42598.5, 300 sec: 46874.9). Total num frames: 648118272. Throughput: 0: 47362.2. Samples: 649571600. Policy #0 lag: (min: 5.0, avg: 38.9, max: 72.0) [2024-03-21 02:37:10,522][03784] Avg episode reward: [(0, '1.308')] [2024-03-21 02:37:15,521][03784] Fps is (10 sec: 22937.5, 60 sec: 38775.5, 300 sec: 46763.8). Total num frames: 648249344. Throughput: 0: 47464.4. Samples: 649731600. Policy #0 lag: (min: 5.0, avg: 38.9, max: 72.0) [2024-03-21 02:37:15,522][03784] Avg episode reward: [(0, '0.662')] [2024-03-21 02:37:16,611][04017] Updated weights for policy 0, policy_version 19786 (0.0015) [2024-03-21 02:37:20,521][03784] Fps is (10 sec: 49151.8, 60 sec: 39321.6, 300 sec: 46874.9). Total num frames: 648609792. Throughput: 0: 47457.9. Samples: 650015800. Policy #0 lag: (min: 1.0, avg: 38.9, max: 77.0) [2024-03-21 02:37:20,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 02:37:20,993][04017] Updated weights for policy 0, policy_version 19796 (0.0012) [2024-03-21 02:37:25,108][04017] Updated weights for policy 0, policy_version 19806 (0.0010) [2024-03-21 02:37:25,521][03784] Fps is (10 sec: 78643.9, 60 sec: 44782.9, 300 sec: 47319.2). Total num frames: 649035776. Throughput: 0: 47104.4. Samples: 650263700. Policy #0 lag: (min: 1.0, avg: 38.9, max: 77.0) [2024-03-21 02:37:25,522][03784] Avg episode reward: [(0, '1.199')] [2024-03-21 02:37:30,029][04017] Updated weights for policy 0, policy_version 19816 (0.0018) [2024-03-21 02:37:30,521][03784] Fps is (10 sec: 75366.5, 60 sec: 49152.0, 300 sec: 47097.0). Total num frames: 649363456. Throughput: 0: 47602.1. Samples: 650415800. Policy #0 lag: (min: 1.0, avg: 38.9, max: 77.0) [2024-03-21 02:37:30,522][03784] Avg episode reward: [(0, '0.781')] [2024-03-21 02:37:30,688][03995] Signal inference workers to stop experience collection... 
(13050 times) [2024-03-21 02:37:30,758][04017] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-03-21 02:37:30,960][03995] Signal inference workers to resume experience collection... (13050 times) [2024-03-21 02:37:30,961][04017] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-03-21 02:37:35,521][03784] Fps is (10 sec: 52428.5, 60 sec: 49152.1, 300 sec: 46652.7). Total num frames: 649560064. Throughput: 0: 47910.9. Samples: 650699700. Policy #0 lag: (min: 0.0, avg: 53.9, max: 116.0) [2024-03-21 02:37:35,522][03784] Avg episode reward: [(0, '0.781')] [2024-03-21 02:37:37,375][04017] Updated weights for policy 0, policy_version 19826 (0.0009) [2024-03-21 02:37:40,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49698.1, 300 sec: 46430.6). Total num frames: 649822208. Throughput: 0: 48324.5. Samples: 650992100. Policy #0 lag: (min: 0.0, avg: 53.9, max: 116.0) [2024-03-21 02:37:40,522][03784] Avg episode reward: [(0, '0.620')] [2024-03-21 02:37:44,370][04017] Updated weights for policy 0, policy_version 19836 (0.0019) [2024-03-21 02:37:45,521][03784] Fps is (10 sec: 52428.9, 60 sec: 51882.6, 300 sec: 46430.6). Total num frames: 650084352. Throughput: 0: 47931.1. Samples: 651126400. Policy #0 lag: (min: 0.0, avg: 53.9, max: 116.0) [2024-03-21 02:37:45,522][03784] Avg episode reward: [(0, '0.527')] [2024-03-21 02:37:50,521][03784] Fps is (10 sec: 39321.7, 60 sec: 50247.7, 300 sec: 46319.5). Total num frames: 650215424. Throughput: 0: 47840.1. Samples: 651406700. Policy #0 lag: (min: 0.0, avg: 43.6, max: 78.0) [2024-03-21 02:37:50,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 02:37:51,413][04017] Updated weights for policy 0, policy_version 19846 (0.0014) [2024-03-21 02:37:55,521][03784] Fps is (10 sec: 29491.2, 60 sec: 47513.7, 300 sec: 46652.7). Total num frames: 650379264. Throughput: 0: 47093.4. Samples: 651690800. 
Policy #0 lag: (min: 0.0, avg: 43.6, max: 78.0) [2024-03-21 02:37:55,522][03784] Avg episode reward: [(0, '1.065')] [2024-03-21 02:38:00,521][03784] Fps is (10 sec: 32767.6, 60 sec: 46421.3, 300 sec: 46541.7). Total num frames: 650543104. Throughput: 0: 46831.1. Samples: 651839000. Policy #0 lag: (min: 0.0, avg: 43.6, max: 78.0) [2024-03-21 02:38:00,522][03784] Avg episode reward: [(0, '1.144')] [2024-03-21 02:38:00,619][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019854_650575872.pth... [2024-03-21 02:38:00,734][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019521_639664128.pth [2024-03-21 02:38:01,976][04017] Updated weights for policy 0, policy_version 19856 (0.0019) [2024-03-21 02:38:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 650772480. Throughput: 0: 46264.5. Samples: 652097700. Policy #0 lag: (min: 1.0, avg: 38.6, max: 113.0) [2024-03-21 02:38:05,522][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 02:38:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 650903552. Throughput: 0: 47399.8. Samples: 652396700. Policy #0 lag: (min: 1.0, avg: 38.6, max: 113.0) [2024-03-21 02:38:10,522][03784] Avg episode reward: [(0, '0.921')] [2024-03-21 02:38:11,751][04017] Updated weights for policy 0, policy_version 19866 (0.0011) [2024-03-21 02:38:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 48059.8, 300 sec: 46430.6). Total num frames: 651132928. Throughput: 0: 47189.0. Samples: 652539300. Policy #0 lag: (min: 1.0, avg: 38.6, max: 113.0) [2024-03-21 02:38:15,522][03784] Avg episode reward: [(0, '0.921')] [2024-03-21 02:38:18,670][04017] Updated weights for policy 0, policy_version 19876 (0.0011) [2024-03-21 02:38:20,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 651395072. Throughput: 0: 47122.1. Samples: 652820200. 
Policy #0 lag: (min: 0.0, avg: 38.4, max: 92.0) [2024-03-21 02:38:20,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 02:38:23,604][04017] Updated weights for policy 0, policy_version 19886 (0.0017) [2024-03-21 02:38:25,521][03784] Fps is (10 sec: 65535.4, 60 sec: 45875.1, 300 sec: 47097.1). Total num frames: 651788288. Throughput: 0: 46922.1. Samples: 653103600. Policy #0 lag: (min: 0.0, avg: 38.4, max: 92.0) [2024-03-21 02:38:25,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 02:38:27,439][03995] Signal inference workers to stop experience collection... (13100 times) [2024-03-21 02:38:27,523][04017] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-03-21 02:38:27,713][03995] Signal inference workers to resume experience collection... (13100 times) [2024-03-21 02:38:27,713][04017] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-03-21 02:38:28,005][04017] Updated weights for policy 0, policy_version 19896 (0.0021) [2024-03-21 02:38:30,521][03784] Fps is (10 sec: 62259.9, 60 sec: 44236.8, 300 sec: 46652.8). Total num frames: 652017664. Throughput: 0: 47028.8. Samples: 653242700. Policy #0 lag: (min: 0.0, avg: 38.4, max: 92.0) [2024-03-21 02:38:30,522][03784] Avg episode reward: [(0, '0.796')] [2024-03-21 02:38:34,105][04017] Updated weights for policy 0, policy_version 19906 (0.0017) [2024-03-21 02:38:35,521][03784] Fps is (10 sec: 55706.2, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 652345344. Throughput: 0: 47166.7. Samples: 653529200. Policy #0 lag: (min: 2.0, avg: 47.9, max: 110.0) [2024-03-21 02:38:35,522][03784] Avg episode reward: [(0, '0.885')] [2024-03-21 02:38:39,032][04017] Updated weights for policy 0, policy_version 19916 (0.0015) [2024-03-21 02:38:40,521][03784] Fps is (10 sec: 62259.0, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 652640256. Throughput: 0: 46251.0. Samples: 653772100. 
Policy #0 lag: (min: 2.0, avg: 47.9, max: 110.0) [2024-03-21 02:38:40,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 02:38:45,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 652804096. Throughput: 0: 45624.6. Samples: 653892100. Policy #0 lag: (min: 0.0, avg: 59.5, max: 124.0) [2024-03-21 02:38:45,522][03784] Avg episode reward: [(0, '1.096')] [2024-03-21 02:38:47,951][04017] Updated weights for policy 0, policy_version 19926 (0.0012) [2024-03-21 02:38:50,521][03784] Fps is (10 sec: 39321.9, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 653033472. Throughput: 0: 45640.0. Samples: 654151500. Policy #0 lag: (min: 0.0, avg: 59.5, max: 124.0) [2024-03-21 02:38:50,522][03784] Avg episode reward: [(0, '0.593')] [2024-03-21 02:38:55,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 653230080. Throughput: 0: 45693.5. Samples: 654452900. Policy #0 lag: (min: 0.0, avg: 59.5, max: 124.0) [2024-03-21 02:38:55,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 02:38:55,702][04017] Updated weights for policy 0, policy_version 19936 (0.0011) [2024-03-21 02:39:00,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46967.5, 300 sec: 45653.1). Total num frames: 653361152. Throughput: 0: 45679.9. Samples: 654594900. Policy #0 lag: (min: 0.0, avg: 59.5, max: 124.0) [2024-03-21 02:39:00,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 02:39:05,521][03784] Fps is (10 sec: 16383.8, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 653393920. Throughput: 0: 46104.5. Samples: 654894900. Policy #0 lag: (min: 0.0, avg: 33.2, max: 75.0) [2024-03-21 02:39:05,522][03784] Avg episode reward: [(0, '1.513')] [2024-03-21 02:39:07,923][04017] Updated weights for policy 0, policy_version 19946 (0.0010) [2024-03-21 02:39:10,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48606.0, 300 sec: 46541.7). Total num frames: 653819904. Throughput: 0: 45909.0. Samples: 655169500. 
Policy #0 lag: (min: 0.0, avg: 33.2, max: 75.0) [2024-03-21 02:39:10,521][03784] Avg episode reward: [(0, '1.342')] [2024-03-21 02:39:13,085][04017] Updated weights for policy 0, policy_version 19956 (0.0020) [2024-03-21 02:39:15,521][03784] Fps is (10 sec: 62259.8, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 654016512. Throughput: 0: 46166.7. Samples: 655320200. Policy #0 lag: (min: 0.0, avg: 33.2, max: 75.0) [2024-03-21 02:39:15,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 02:39:20,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45329.2, 300 sec: 45764.1). Total num frames: 654114816. Throughput: 0: 46599.9. Samples: 655626200. Policy #0 lag: (min: 0.0, avg: 38.5, max: 92.0) [2024-03-21 02:39:20,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 02:39:23,048][04017] Updated weights for policy 0, policy_version 19966 (0.0013) [2024-03-21 02:39:25,521][03784] Fps is (10 sec: 32768.2, 60 sec: 42598.5, 300 sec: 45764.1). Total num frames: 654344192. Throughput: 0: 47917.9. Samples: 655928400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 92.0) [2024-03-21 02:39:25,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 02:39:27,220][03995] Signal inference workers to stop experience collection... (13150 times) [2024-03-21 02:39:27,301][04017] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-03-21 02:39:27,484][03995] Signal inference workers to resume experience collection... (13150 times) [2024-03-21 02:39:27,484][04017] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-03-21 02:39:29,389][04017] Updated weights for policy 0, policy_version 19976 (0.0012) [2024-03-21 02:39:30,521][03784] Fps is (10 sec: 55705.2, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 654671872. Throughput: 0: 48126.5. Samples: 656057800. 
Policy #0 lag: (min: 2.0, avg: 44.4, max: 108.0) [2024-03-21 02:39:30,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 02:39:32,513][04017] Updated weights for policy 0, policy_version 19986 (0.0013) [2024-03-21 02:39:35,521][03784] Fps is (10 sec: 78643.1, 60 sec: 46421.3, 300 sec: 46986.0). Total num frames: 655130624. Throughput: 0: 48035.6. Samples: 656313100. Policy #0 lag: (min: 2.0, avg: 44.4, max: 108.0) [2024-03-21 02:39:35,522][03784] Avg episode reward: [(0, '1.257')] [2024-03-21 02:39:36,354][04017] Updated weights for policy 0, policy_version 19996 (0.0013) [2024-03-21 02:39:40,168][04017] Updated weights for policy 0, policy_version 20006 (0.0012) [2024-03-21 02:39:40,521][03784] Fps is (10 sec: 88474.7, 60 sec: 48606.0, 300 sec: 47985.7). Total num frames: 655556608. Throughput: 0: 47288.9. Samples: 656580900. Policy #0 lag: (min: 2.0, avg: 44.4, max: 108.0) [2024-03-21 02:39:40,522][03784] Avg episode reward: [(0, '1.257')] [2024-03-21 02:39:45,521][03784] Fps is (10 sec: 52428.1, 60 sec: 47513.5, 300 sec: 46763.8). Total num frames: 655654912. Throughput: 0: 47446.6. Samples: 656730000. Policy #0 lag: (min: 2.0, avg: 44.4, max: 108.0) [2024-03-21 02:39:45,523][03784] Avg episode reward: [(0, '1.283')] [2024-03-21 02:39:49,985][04017] Updated weights for policy 0, policy_version 20016 (0.0009) [2024-03-21 02:39:50,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 655917056. Throughput: 0: 47340.1. Samples: 657025200. Policy #0 lag: (min: 0.0, avg: 45.6, max: 86.0) [2024-03-21 02:39:50,522][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 02:39:54,742][04017] Updated weights for policy 0, policy_version 20026 (0.0012) [2024-03-21 02:39:55,521][03784] Fps is (10 sec: 55706.5, 60 sec: 49698.1, 300 sec: 46652.8). Total num frames: 656211968. Throughput: 0: 46991.1. Samples: 657284100. 
Policy #0 lag: (min: 0.0, avg: 45.6, max: 86.0) [2024-03-21 02:39:55,522][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 02:40:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 49152.0, 300 sec: 46541.7). Total num frames: 656310272. Throughput: 0: 47017.7. Samples: 657436000. Policy #0 lag: (min: 1.0, avg: 48.8, max: 93.0) [2024-03-21 02:40:00,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 02:40:00,541][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020029_656310272.pth... [2024-03-21 02:40:00,657][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019681_644907008.pth [2024-03-21 02:40:05,521][03784] Fps is (10 sec: 29491.2, 60 sec: 51882.8, 300 sec: 46652.8). Total num frames: 656506880. Throughput: 0: 45813.4. Samples: 657687800. Policy #0 lag: (min: 1.0, avg: 48.8, max: 93.0) [2024-03-21 02:40:05,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 02:40:08,253][04017] Updated weights for policy 0, policy_version 20036 (0.0012) [2024-03-21 02:40:10,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 656605184. Throughput: 0: 44768.8. Samples: 657943000. Policy #0 lag: (min: 1.0, avg: 48.8, max: 93.0) [2024-03-21 02:40:10,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 02:40:14,513][03995] Signal inference workers to stop experience collection... (13200 times) [2024-03-21 02:40:14,579][04017] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-03-21 02:40:14,750][03995] Signal inference workers to resume experience collection... (13200 times) [2024-03-21 02:40:14,750][04017] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-03-21 02:40:15,387][04017] Updated weights for policy 0, policy_version 20046 (0.0016) [2024-03-21 02:40:15,521][03784] Fps is (10 sec: 36044.7, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 656867328. Throughput: 0: 45109.0. Samples: 658087700. 
Policy #0 lag: (min: 2.0, avg: 25.8, max: 79.0) [2024-03-21 02:40:15,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 02:40:20,521][03784] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 45986.3). Total num frames: 657096704. Throughput: 0: 45626.6. Samples: 658366300. Policy #0 lag: (min: 2.0, avg: 25.8, max: 79.0) [2024-03-21 02:40:20,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 02:40:22,305][04017] Updated weights for policy 0, policy_version 20056 (0.0009) [2024-03-21 02:40:25,521][03784] Fps is (10 sec: 45874.8, 60 sec: 49698.0, 300 sec: 46319.5). Total num frames: 657326080. Throughput: 0: 46622.1. Samples: 658678900. Policy #0 lag: (min: 2.0, avg: 25.8, max: 79.0) [2024-03-21 02:40:25,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 02:40:29,004][04017] Updated weights for policy 0, policy_version 20066 (0.0011) [2024-03-21 02:40:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48059.8, 300 sec: 46652.8). Total num frames: 657555456. Throughput: 0: 46622.3. Samples: 658828000. Policy #0 lag: (min: 0.0, avg: 34.7, max: 108.0) [2024-03-21 02:40:30,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 02:40:35,521][03784] Fps is (10 sec: 45875.4, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 657784832. Throughput: 0: 46533.3. Samples: 659119200. Policy #0 lag: (min: 0.0, avg: 34.7, max: 108.0) [2024-03-21 02:40:35,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 02:40:36,444][04017] Updated weights for policy 0, policy_version 20076 (0.0011) [2024-03-21 02:40:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 40960.0, 300 sec: 47208.1). Total num frames: 658014208. Throughput: 0: 47355.5. Samples: 659415100. Policy #0 lag: (min: 0.0, avg: 34.7, max: 108.0) [2024-03-21 02:40:40,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 02:40:45,264][04017] Updated weights for policy 0, policy_version 20086 (0.0012) [2024-03-21 02:40:45,521][03784] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 47319.2). 
Total num frames: 658178048. Throughput: 0: 47353.3. Samples: 659566900. Policy #0 lag: (min: 3.0, avg: 44.3, max: 113.0) [2024-03-21 02:40:45,522][03784] Avg episode reward: [(0, '1.182')] [2024-03-21 02:40:50,169][04017] Updated weights for policy 0, policy_version 20096 (0.0015) [2024-03-21 02:40:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 47097.1). Total num frames: 658505728. Throughput: 0: 48273.3. Samples: 659860100. Policy #0 lag: (min: 3.0, avg: 44.3, max: 113.0) [2024-03-21 02:40:50,522][03784] Avg episode reward: [(0, '0.605')] [2024-03-21 02:40:53,872][04017] Updated weights for policy 0, policy_version 20106 (0.0020) [2024-03-21 02:40:55,521][03784] Fps is (10 sec: 75366.3, 60 sec: 45329.0, 300 sec: 47541.4). Total num frames: 658931712. Throughput: 0: 48668.8. Samples: 660133100. Policy #0 lag: (min: 3.0, avg: 44.3, max: 113.0) [2024-03-21 02:40:55,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 02:40:59,249][04017] Updated weights for policy 0, policy_version 20116 (0.0015) [2024-03-21 02:41:00,521][03784] Fps is (10 sec: 72089.5, 60 sec: 48605.9, 300 sec: 47985.7). Total num frames: 659226624. Throughput: 0: 48262.2. Samples: 660259500. Policy #0 lag: (min: 1.0, avg: 51.5, max: 94.0) [2024-03-21 02:41:00,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 02:41:00,812][03995] Signal inference workers to stop experience collection... (13250 times) [2024-03-21 02:41:00,880][04017] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-03-21 02:41:01,106][03995] Signal inference workers to resume experience collection... (13250 times) [2024-03-21 02:41:01,106][04017] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-03-21 02:41:05,521][03784] Fps is (10 sec: 49152.5, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 659423232. Throughput: 0: 48102.2. Samples: 660530900. 
Policy #0 lag: (min: 1.0, avg: 51.5, max: 94.0) [2024-03-21 02:41:05,522][03784] Avg episode reward: [(0, '1.424')] [2024-03-21 02:41:06,004][04017] Updated weights for policy 0, policy_version 20126 (0.0017) [2024-03-21 02:41:10,521][03784] Fps is (10 sec: 32767.9, 60 sec: 49151.9, 300 sec: 46208.4). Total num frames: 659554304. Throughput: 0: 47462.2. Samples: 660814700. Policy #0 lag: (min: 1.0, avg: 51.5, max: 94.0) [2024-03-21 02:41:10,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 02:41:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 659783680. Throughput: 0: 46806.7. Samples: 660934300. Policy #0 lag: (min: 1.0, avg: 40.9, max: 90.0) [2024-03-21 02:41:15,522][03784] Avg episode reward: [(0, '1.402')] [2024-03-21 02:41:16,101][04017] Updated weights for policy 0, policy_version 20136 (0.0012) [2024-03-21 02:41:20,523][03784] Fps is (10 sec: 45868.0, 60 sec: 48604.5, 300 sec: 46319.3). Total num frames: 660013056. Throughput: 0: 46113.9. Samples: 661194400. Policy #0 lag: (min: 1.0, avg: 40.9, max: 90.0) [2024-03-21 02:41:20,523][03784] Avg episode reward: [(0, '1.134')] [2024-03-21 02:41:24,927][04017] Updated weights for policy 0, policy_version 20146 (0.0012) [2024-03-21 02:41:25,521][03784] Fps is (10 sec: 39321.5, 60 sec: 47513.7, 300 sec: 46652.7). Total num frames: 660176896. Throughput: 0: 45544.4. Samples: 661464600. Policy #0 lag: (min: 1.0, avg: 34.7, max: 109.0) [2024-03-21 02:41:25,522][03784] Avg episode reward: [(0, '0.673')] [2024-03-21 02:41:30,521][03784] Fps is (10 sec: 36050.7, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 660373504. Throughput: 0: 44982.3. Samples: 661591100. Policy #0 lag: (min: 1.0, avg: 34.7, max: 109.0) [2024-03-21 02:41:30,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 02:41:32,148][04017] Updated weights for policy 0, policy_version 20156 (0.0011) [2024-03-21 02:41:35,521][03784] Fps is (10 sec: 29490.9, 60 sec: 44782.9, 300 sec: 46208.4). 
Total num frames: 660471808. Throughput: 0: 44775.5. Samples: 661875000. Policy #0 lag: (min: 1.0, avg: 34.7, max: 109.0) [2024-03-21 02:41:35,522][03784] Avg episode reward: [(0, '0.510')] [2024-03-21 02:41:40,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 660766720. Throughput: 0: 44797.9. Samples: 662149000. Policy #0 lag: (min: 0.0, avg: 28.8, max: 72.0) [2024-03-21 02:41:40,522][03784] Avg episode reward: [(0, '1.471')] [2024-03-21 02:41:40,868][04017] Updated weights for policy 0, policy_version 20166 (0.0023) [2024-03-21 02:41:45,521][03784] Fps is (10 sec: 55706.5, 60 sec: 47513.7, 300 sec: 46875.5). Total num frames: 661028864. Throughput: 0: 45257.9. Samples: 662296100. Policy #0 lag: (min: 0.0, avg: 28.8, max: 72.0) [2024-03-21 02:41:45,522][03784] Avg episode reward: [(0, '1.092')] [2024-03-21 02:41:49,552][04017] Updated weights for policy 0, policy_version 20176 (0.0025) [2024-03-21 02:41:50,521][03784] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 46319.5). Total num frames: 661192704. Throughput: 0: 45622.2. Samples: 662583900. Policy #0 lag: (min: 0.0, avg: 28.8, max: 72.0) [2024-03-21 02:41:50,522][03784] Avg episode reward: [(0, '1.445')] [2024-03-21 02:41:54,878][04017] Updated weights for policy 0, policy_version 20186 (0.0016) [2024-03-21 02:41:55,521][03784] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 46541.7). Total num frames: 661487616. Throughput: 0: 45244.5. Samples: 662850700. Policy #0 lag: (min: 2.0, avg: 26.2, max: 57.0) [2024-03-21 02:41:55,522][03784] Avg episode reward: [(0, '1.445')] [2024-03-21 02:42:00,325][04017] Updated weights for policy 0, policy_version 20196 (0.0011) [2024-03-21 02:42:00,521][03784] Fps is (10 sec: 58981.9, 60 sec: 42598.3, 300 sec: 46652.7). Total num frames: 661782528. Throughput: 0: 46044.3. Samples: 663006300. 
Policy #0 lag: (min: 2.0, avg: 26.2, max: 57.0) [2024-03-21 02:42:00,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 02:42:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020196_661782528.pth... [2024-03-21 02:42:00,648][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000019854_650575872.pth [2024-03-21 02:42:01,927][03995] Signal inference workers to stop experience collection... (13300 times) [2024-03-21 02:42:01,927][03995] Signal inference workers to resume experience collection... (13300 times) [2024-03-21 02:42:02,016][04017] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-03-21 02:42:02,016][04017] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-03-21 02:42:04,759][04017] Updated weights for policy 0, policy_version 20206 (0.0012) [2024-03-21 02:42:05,521][03784] Fps is (10 sec: 68812.9, 60 sec: 45875.2, 300 sec: 47652.4). Total num frames: 662175744. Throughput: 0: 46072.8. Samples: 663267600. Policy #0 lag: (min: 2.0, avg: 26.2, max: 57.0) [2024-03-21 02:42:05,522][03784] Avg episode reward: [(0, '1.288')] [2024-03-21 02:42:10,521][03784] Fps is (10 sec: 62259.8, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 662405120. Throughput: 0: 45824.4. Samples: 663526700. Policy #0 lag: (min: 2.0, avg: 46.5, max: 83.0) [2024-03-21 02:42:10,522][03784] Avg episode reward: [(0, '1.576')] [2024-03-21 02:42:11,253][04017] Updated weights for policy 0, policy_version 20216 (0.0015) [2024-03-21 02:42:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 662536192. Throughput: 0: 46302.2. Samples: 663674700. 
Policy #0 lag: (min: 2.0, avg: 46.5, max: 83.0) [2024-03-21 02:42:15,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 02:42:20,197][04017] Updated weights for policy 0, policy_version 20226 (0.0019) [2024-03-21 02:42:20,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46422.5, 300 sec: 46652.7). Total num frames: 662798336. Throughput: 0: 46740.0. Samples: 663978300. Policy #0 lag: (min: 2.0, avg: 46.5, max: 83.0) [2024-03-21 02:42:20,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 02:42:25,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 662962176. Throughput: 0: 46893.3. Samples: 664259200. Policy #0 lag: (min: 0.0, avg: 50.8, max: 109.0) [2024-03-21 02:42:25,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 02:42:30,171][04017] Updated weights for policy 0, policy_version 20236 (0.0011) [2024-03-21 02:42:30,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45875.0, 300 sec: 45986.2). Total num frames: 663126016. Throughput: 0: 46866.4. Samples: 664405100. Policy #0 lag: (min: 0.0, avg: 50.8, max: 109.0) [2024-03-21 02:42:30,522][03784] Avg episode reward: [(0, '0.612')] [2024-03-21 02:42:35,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48059.9, 300 sec: 45875.2). Total num frames: 663355392. Throughput: 0: 46713.4. Samples: 664686000. Policy #0 lag: (min: 0.0, avg: 50.8, max: 109.0) [2024-03-21 02:42:35,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-21 02:42:37,370][04017] Updated weights for policy 0, policy_version 20246 (0.0011) [2024-03-21 02:42:40,521][03784] Fps is (10 sec: 45876.1, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 663584768. Throughput: 0: 47300.0. Samples: 664979200. 
Policy #0 lag: (min: 0.0, avg: 43.8, max: 100.0) [2024-03-21 02:42:40,522][03784] Avg episode reward: [(0, '1.364')] [2024-03-21 02:42:44,286][04017] Updated weights for policy 0, policy_version 20256 (0.0010) [2024-03-21 02:42:45,521][03784] Fps is (10 sec: 42597.9, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 663781376. Throughput: 0: 47355.6. Samples: 665137300. Policy #0 lag: (min: 0.0, avg: 43.8, max: 100.0) [2024-03-21 02:42:45,522][03784] Avg episode reward: [(0, '1.364')] [2024-03-21 02:42:49,597][04017] Updated weights for policy 0, policy_version 20266 (0.0011) [2024-03-21 02:42:50,521][03784] Fps is (10 sec: 55705.7, 60 sec: 49152.0, 300 sec: 46652.7). Total num frames: 664141824. Throughput: 0: 48024.5. Samples: 665428700. Policy #0 lag: (min: 0.0, avg: 43.8, max: 100.0) [2024-03-21 02:42:50,522][03784] Avg episode reward: [(0, '0.612')] [2024-03-21 02:42:51,631][03995] Signal inference workers to stop experience collection... (13350 times) [2024-03-21 02:42:51,631][03995] Signal inference workers to resume experience collection... (13350 times) [2024-03-21 02:42:51,806][04017] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-03-21 02:42:51,806][04017] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-03-21 02:42:53,833][04017] Updated weights for policy 0, policy_version 20276 (0.0015) [2024-03-21 02:42:55,521][03784] Fps is (10 sec: 65536.3, 60 sec: 49152.0, 300 sec: 47097.1). Total num frames: 664436736. Throughput: 0: 48417.8. Samples: 665705500. Policy #0 lag: (min: 1.0, avg: 36.2, max: 90.0) [2024-03-21 02:42:55,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 02:43:00,521][03784] Fps is (10 sec: 55705.0, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 664698880. Throughput: 0: 48508.8. Samples: 665857600. 
Policy #0 lag: (min: 1.0, avg: 36.2, max: 90.0) [2024-03-21 02:43:00,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 02:43:01,297][04017] Updated weights for policy 0, policy_version 20286 (0.0010) [2024-03-21 02:43:05,521][03784] Fps is (10 sec: 42598.0, 60 sec: 44782.9, 300 sec: 47319.2). Total num frames: 664862720. Throughput: 0: 48415.5. Samples: 666157000. Policy #0 lag: (min: 1.0, avg: 36.2, max: 90.0) [2024-03-21 02:43:05,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-21 02:43:09,872][04017] Updated weights for policy 0, policy_version 20296 (0.0011) [2024-03-21 02:43:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 47208.1). Total num frames: 665059328. Throughput: 0: 48831.0. Samples: 666456600. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 02:43:10,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-21 02:43:15,521][03784] Fps is (10 sec: 49152.5, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 665354240. Throughput: 0: 48513.5. Samples: 666588200. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 02:43:15,522][03784] Avg episode reward: [(0, '0.804')] [2024-03-21 02:43:15,538][04017] Updated weights for policy 0, policy_version 20306 (0.0011) [2024-03-21 02:43:20,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 665616384. Throughput: 0: 48291.0. Samples: 666859100. Policy #0 lag: (min: 0.0, avg: 40.5, max: 119.0) [2024-03-21 02:43:20,522][03784] Avg episode reward: [(0, '0.804')] [2024-03-21 02:43:21,385][04017] Updated weights for policy 0, policy_version 20316 (0.0015) [2024-03-21 02:43:25,521][03784] Fps is (10 sec: 52429.1, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 665878528. Throughput: 0: 47893.4. Samples: 667134400. 
Policy #0 lag: (min: 0.0, avg: 40.5, max: 119.0) [2024-03-21 02:43:25,522][03784] Avg episode reward: [(0, '1.350')] [2024-03-21 02:43:27,982][04017] Updated weights for policy 0, policy_version 20326 (0.0013) [2024-03-21 02:43:30,521][03784] Fps is (10 sec: 58982.8, 60 sec: 51336.7, 300 sec: 46986.0). Total num frames: 666206208. Throughput: 0: 47733.4. Samples: 667285300. Policy #0 lag: (min: 0.0, avg: 40.5, max: 119.0) [2024-03-21 02:43:30,522][03784] Avg episode reward: [(0, '0.598')] [2024-03-21 02:43:35,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48059.7, 300 sec: 46097.4). Total num frames: 666238976. Throughput: 0: 47915.5. Samples: 667584900. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 02:43:35,522][03784] Avg episode reward: [(0, '0.529')] [2024-03-21 02:43:36,907][04017] Updated weights for policy 0, policy_version 20336 (0.0015) [2024-03-21 02:43:40,419][04017] Updated weights for policy 0, policy_version 20346 (0.0022) [2024-03-21 02:43:40,521][03784] Fps is (10 sec: 49151.6, 60 sec: 51882.6, 300 sec: 47097.0). Total num frames: 666697728. Throughput: 0: 47742.2. Samples: 667853900. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 02:43:40,522][03784] Avg episode reward: [(0, '0.675')] [2024-03-21 02:43:44,971][03995] Signal inference workers to stop experience collection... (13400 times) [2024-03-21 02:43:44,972][03995] Signal inference workers to resume experience collection... (13400 times) [2024-03-21 02:43:45,025][04017] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-03-21 02:43:45,026][04017] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-03-21 02:43:45,521][03784] Fps is (10 sec: 65536.4, 60 sec: 51882.8, 300 sec: 46986.0). Total num frames: 666894336. Throughput: 0: 47693.5. Samples: 668003800. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 02:43:45,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 02:43:50,521][03784] Fps is (10 sec: 29491.8, 60 sec: 47513.7, 300 sec: 46652.8). Total num frames: 666992640. Throughput: 0: 48122.5. Samples: 668322500. Policy #0 lag: (min: 0.0, avg: 39.6, max: 94.0) [2024-03-21 02:43:50,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 02:43:50,805][04017] Updated weights for policy 0, policy_version 20356 (0.0010) [2024-03-21 02:43:55,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 667189248. Throughput: 0: 48262.3. Samples: 668628400. Policy #0 lag: (min: 0.0, avg: 39.6, max: 94.0) [2024-03-21 02:43:55,522][03784] Avg episode reward: [(0, '1.442')] [2024-03-21 02:43:58,487][04017] Updated weights for policy 0, policy_version 20366 (0.0016) [2024-03-21 02:44:00,521][03784] Fps is (10 sec: 52427.7, 60 sec: 46967.5, 300 sec: 47874.6). Total num frames: 667516928. Throughput: 0: 48695.5. Samples: 668779500. Policy #0 lag: (min: 0.0, avg: 39.6, max: 94.0) [2024-03-21 02:44:00,522][03784] Avg episode reward: [(0, '1.455')] [2024-03-21 02:44:00,830][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020372_667549696.pth... [2024-03-21 02:44:00,957][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020029_656310272.pth [2024-03-21 02:44:02,200][04017] Updated weights for policy 0, policy_version 20376 (0.0011) [2024-03-21 02:44:05,526][03784] Fps is (10 sec: 58952.8, 60 sec: 48601.9, 300 sec: 47318.4). Total num frames: 667779072. Throughput: 0: 48254.7. Samples: 669030800. Policy #0 lag: (min: 0.0, avg: 39.6, max: 94.0) [2024-03-21 02:44:05,527][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 02:44:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 667942912. Throughput: 0: 48735.5. Samples: 669327500. 
Policy #0 lag: (min: 0.0, avg: 32.6, max: 69.0) [2024-03-21 02:44:10,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 02:44:11,226][04017] Updated weights for policy 0, policy_version 20386 (0.0010) [2024-03-21 02:44:15,521][03784] Fps is (10 sec: 52455.3, 60 sec: 49152.0, 300 sec: 48096.8). Total num frames: 668303360. Throughput: 0: 48182.2. Samples: 669453500. Policy #0 lag: (min: 0.0, avg: 32.6, max: 69.0) [2024-03-21 02:44:15,522][03784] Avg episode reward: [(0, '0.893')] [2024-03-21 02:44:15,937][04017] Updated weights for policy 0, policy_version 20396 (0.0014) [2024-03-21 02:44:20,521][03784] Fps is (10 sec: 55705.2, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 668499968. Throughput: 0: 47573.2. Samples: 669725700. Policy #0 lag: (min: 0.0, avg: 32.6, max: 69.0) [2024-03-21 02:44:20,522][03784] Avg episode reward: [(0, '0.755')] [2024-03-21 02:44:25,145][04017] Updated weights for policy 0, policy_version 20406 (0.0015) [2024-03-21 02:44:25,521][03784] Fps is (10 sec: 39320.8, 60 sec: 46967.3, 300 sec: 47541.4). Total num frames: 668696576. Throughput: 0: 47657.6. Samples: 669998500. Policy #0 lag: (min: 0.0, avg: 52.5, max: 116.0) [2024-03-21 02:44:25,522][03784] Avg episode reward: [(0, '0.955')] [2024-03-21 02:44:30,172][04017] Updated weights for policy 0, policy_version 20416 (0.0014) [2024-03-21 02:44:30,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 669024256. Throughput: 0: 47322.1. Samples: 670133300. Policy #0 lag: (min: 0.0, avg: 52.5, max: 116.0) [2024-03-21 02:44:30,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 02:44:33,770][03995] Signal inference workers to stop experience collection... (13450 times) [2024-03-21 02:44:33,771][03995] Signal inference workers to resume experience collection... 
(13450 times) [2024-03-21 02:44:33,822][04017] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-03-21 02:44:33,823][04017] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-03-21 02:44:34,994][04017] Updated weights for policy 0, policy_version 20426 (0.0011) [2024-03-21 02:44:35,521][03784] Fps is (10 sec: 62260.6, 60 sec: 51336.6, 300 sec: 46652.7). Total num frames: 669319168. Throughput: 0: 45693.2. Samples: 670378700. Policy #0 lag: (min: 0.0, avg: 52.5, max: 116.0) [2024-03-21 02:44:35,522][03784] Avg episode reward: [(0, '0.571')] [2024-03-21 02:44:40,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 669483008. Throughput: 0: 44848.9. Samples: 670646600. Policy #0 lag: (min: 0.0, avg: 41.2, max: 77.0) [2024-03-21 02:44:40,522][03784] Avg episode reward: [(0, '0.806')] [2024-03-21 02:44:45,298][04017] Updated weights for policy 0, policy_version 20436 (0.0023) [2024-03-21 02:44:45,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 669646848. Throughput: 0: 44384.5. Samples: 670776800. Policy #0 lag: (min: 0.0, avg: 41.2, max: 77.0) [2024-03-21 02:44:45,522][03784] Avg episode reward: [(0, '1.683')] [2024-03-21 02:44:45,952][03995] Saving new best policy, reward=1.683! [2024-03-21 02:44:50,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46421.2, 300 sec: 45986.3). Total num frames: 669777920. Throughput: 0: 44711.7. Samples: 671042600. Policy #0 lag: (min: 0.0, avg: 41.2, max: 77.0) [2024-03-21 02:44:50,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 02:44:53,911][04017] Updated weights for policy 0, policy_version 20446 (0.0029) [2024-03-21 02:44:55,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 670105600. Throughput: 0: 44353.5. Samples: 671323400. 
Policy #0 lag: (min: 0.0, avg: 34.1, max: 78.0) [2024-03-21 02:44:55,522][03784] Avg episode reward: [(0, '0.890')] [2024-03-21 02:45:00,480][04017] Updated weights for policy 0, policy_version 20456 (0.0011) [2024-03-21 02:45:00,521][03784] Fps is (10 sec: 52428.2, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 670302208. Throughput: 0: 44675.4. Samples: 671463900. Policy #0 lag: (min: 0.0, avg: 34.1, max: 78.0) [2024-03-21 02:45:00,522][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 02:45:05,521][03784] Fps is (10 sec: 32767.6, 60 sec: 44240.5, 300 sec: 46874.9). Total num frames: 670433280. Throughput: 0: 45042.3. Samples: 671752600. Policy #0 lag: (min: 0.0, avg: 34.1, max: 78.0) [2024-03-21 02:45:05,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 02:45:10,353][04017] Updated weights for policy 0, policy_version 20466 (0.0019) [2024-03-21 02:45:10,521][03784] Fps is (10 sec: 32767.7, 60 sec: 44782.8, 300 sec: 46652.7). Total num frames: 670629888. Throughput: 0: 45211.1. Samples: 672033000. Policy #0 lag: (min: 1.0, avg: 35.5, max: 71.0) [2024-03-21 02:45:10,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 02:45:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 40413.8, 300 sec: 46208.4). Total num frames: 670728192. Throughput: 0: 45537.8. Samples: 672182500. Policy #0 lag: (min: 1.0, avg: 35.5, max: 71.0) [2024-03-21 02:45:15,522][03784] Avg episode reward: [(0, '1.077')] [2024-03-21 02:45:20,056][04017] Updated weights for policy 0, policy_version 20476 (0.0015) [2024-03-21 02:45:20,521][03784] Fps is (10 sec: 36045.3, 60 sec: 41506.2, 300 sec: 46319.5). Total num frames: 670990336. Throughput: 0: 46391.0. Samples: 672466300. Policy #0 lag: (min: 3.0, avg: 30.7, max: 63.0) [2024-03-21 02:45:20,522][03784] Avg episode reward: [(0, '1.564')] [2024-03-21 02:45:23,887][04017] Updated weights for policy 0, policy_version 20486 (0.0020) [2024-03-21 02:45:25,521][03784] Fps is (10 sec: 68812.7, 60 sec: 45329.2, 300 sec: 46986.0). 
Total num frames: 671416320. Throughput: 0: 45742.2. Samples: 672705000. Policy #0 lag: (min: 3.0, avg: 30.7, max: 63.0) [2024-03-21 02:45:25,526][03784] Avg episode reward: [(0, '0.866')] [2024-03-21 02:45:29,433][03995] Signal inference workers to stop experience collection... (13500 times) [2024-03-21 02:45:29,518][04017] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-03-21 02:45:29,689][03995] Signal inference workers to resume experience collection... (13500 times) [2024-03-21 02:45:29,690][04017] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-03-21 02:45:29,988][04017] Updated weights for policy 0, policy_version 20496 (0.0014) [2024-03-21 02:45:30,521][03784] Fps is (10 sec: 65536.2, 60 sec: 43690.7, 300 sec: 46986.0). Total num frames: 671645696. Throughput: 0: 46373.3. Samples: 672863600. Policy #0 lag: (min: 3.0, avg: 30.7, max: 63.0) [2024-03-21 02:45:30,522][03784] Avg episode reward: [(0, '1.041')] [2024-03-21 02:45:34,833][04017] Updated weights for policy 0, policy_version 20506 (0.0014) [2024-03-21 02:45:35,521][03784] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 47208.1). Total num frames: 671940608. Throughput: 0: 46013.3. Samples: 673113200. Policy #0 lag: (min: 1.0, avg: 55.4, max: 113.0) [2024-03-21 02:45:35,531][03784] Avg episode reward: [(0, '1.041')] [2024-03-21 02:45:40,521][03784] Fps is (10 sec: 52428.2, 60 sec: 44782.8, 300 sec: 47430.3). Total num frames: 672169984. Throughput: 0: 45793.1. Samples: 673384100. Policy #0 lag: (min: 1.0, avg: 55.4, max: 113.0) [2024-03-21 02:45:40,522][03784] Avg episode reward: [(0, '0.472')] [2024-03-21 02:45:42,734][04017] Updated weights for policy 0, policy_version 20516 (0.0011) [2024-03-21 02:45:45,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 672432128. Throughput: 0: 45600.1. Samples: 673515900. 
Policy #0 lag: (min: 1.0, avg: 55.4, max: 113.0) [2024-03-21 02:45:45,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 02:45:49,974][04017] Updated weights for policy 0, policy_version 20526 (0.0011) [2024-03-21 02:45:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 47513.6, 300 sec: 46430.6). Total num frames: 672628736. Throughput: 0: 45620.0. Samples: 673805500. Policy #0 lag: (min: 0.0, avg: 42.0, max: 92.0) [2024-03-21 02:45:50,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 02:45:54,599][04017] Updated weights for policy 0, policy_version 20536 (0.0011) [2024-03-21 02:45:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 672956416. Throughput: 0: 45266.9. Samples: 674070000. Policy #0 lag: (min: 0.0, avg: 42.0, max: 92.0) [2024-03-21 02:45:55,522][03784] Avg episode reward: [(0, '0.414')] [2024-03-21 02:46:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45329.1, 300 sec: 46097.3). Total num frames: 673021952. Throughput: 0: 45291.1. Samples: 674220600. Policy #0 lag: (min: 0.0, avg: 42.0, max: 92.0) [2024-03-21 02:46:00,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 02:46:00,536][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020539_673021952.pth... [2024-03-21 02:46:00,697][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020196_661782528.pth [2024-03-21 02:46:05,521][03784] Fps is (10 sec: 22937.5, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 673185792. Throughput: 0: 45191.1. Samples: 674499900. Policy #0 lag: (min: 1.0, avg: 27.0, max: 67.0) [2024-03-21 02:46:05,522][03784] Avg episode reward: [(0, '1.377')] [2024-03-21 02:46:06,251][04017] Updated weights for policy 0, policy_version 20546 (0.0011) [2024-03-21 02:46:10,521][03784] Fps is (10 sec: 45875.7, 60 sec: 47513.8, 300 sec: 46430.6). Total num frames: 673480704. Throughput: 0: 45971.2. Samples: 674773700. 
Policy #0 lag: (min: 1.0, avg: 27.0, max: 67.0) [2024-03-21 02:46:10,522][03784] Avg episode reward: [(0, '0.651')] [2024-03-21 02:46:13,032][04017] Updated weights for policy 0, policy_version 20556 (0.0009) [2024-03-21 02:46:15,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 46430.9). Total num frames: 673710080. Throughput: 0: 45800.0. Samples: 674924600. Policy #0 lag: (min: 1.0, avg: 27.0, max: 67.0) [2024-03-21 02:46:15,522][03784] Avg episode reward: [(0, '1.215')] [2024-03-21 02:46:18,030][04017] Updated weights for policy 0, policy_version 20566 (0.0012) [2024-03-21 02:46:20,521][03784] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 46874.9). Total num frames: 674004992. Throughput: 0: 46364.5. Samples: 675199600. Policy #0 lag: (min: 1.0, avg: 50.8, max: 108.0) [2024-03-21 02:46:20,522][03784] Avg episode reward: [(0, '0.827')] [2024-03-21 02:46:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 46652.8). Total num frames: 674136064. Throughput: 0: 46706.8. Samples: 675485900. Policy #0 lag: (min: 1.0, avg: 50.8, max: 108.0) [2024-03-21 02:46:25,522][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 02:46:27,166][04017] Updated weights for policy 0, policy_version 20576 (0.0016) [2024-03-21 02:46:29,888][03995] Signal inference workers to stop experience collection... (13550 times) [2024-03-21 02:46:29,958][04017] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-03-21 02:46:30,207][03995] Signal inference workers to resume experience collection... (13550 times) [2024-03-21 02:46:30,207][04017] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-03-21 02:46:30,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 674398208. Throughput: 0: 46857.7. Samples: 675624500. 
Policy #0 lag: (min: 1.0, avg: 50.8, max: 108.0) [2024-03-21 02:46:30,522][03784] Avg episode reward: [(0, '1.595')] [2024-03-21 02:46:34,719][04017] Updated weights for policy 0, policy_version 20586 (0.0013) [2024-03-21 02:46:35,521][03784] Fps is (10 sec: 45874.6, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 674594816. Throughput: 0: 47008.8. Samples: 675920900. Policy #0 lag: (min: 0.0, avg: 41.1, max: 85.0) [2024-03-21 02:46:35,522][03784] Avg episode reward: [(0, '1.061')] [2024-03-21 02:46:40,092][04017] Updated weights for policy 0, policy_version 20596 (0.0022) [2024-03-21 02:46:40,521][03784] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 47097.0). Total num frames: 674922496. Throughput: 0: 47255.5. Samples: 676196500. Policy #0 lag: (min: 0.0, avg: 41.1, max: 85.0) [2024-03-21 02:46:40,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 02:46:45,521][03784] Fps is (10 sec: 55706.5, 60 sec: 45329.1, 300 sec: 47319.2). Total num frames: 675151872. Throughput: 0: 47077.9. Samples: 676339100. Policy #0 lag: (min: 0.0, avg: 41.1, max: 85.0) [2024-03-21 02:46:45,522][03784] Avg episode reward: [(0, '1.399')] [2024-03-21 02:46:47,076][04017] Updated weights for policy 0, policy_version 20606 (0.0011) [2024-03-21 02:46:50,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 675348480. Throughput: 0: 47711.1. Samples: 676646900. Policy #0 lag: (min: 1.0, avg: 36.6, max: 117.0) [2024-03-21 02:46:50,522][03784] Avg episode reward: [(0, '1.399')] [2024-03-21 02:46:55,137][04017] Updated weights for policy 0, policy_version 20616 (0.0011) [2024-03-21 02:46:55,521][03784] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 46763.8). Total num frames: 675577856. Throughput: 0: 48171.0. Samples: 676941400. 
Policy #0 lag: (min: 1.0, avg: 36.6, max: 117.0) [2024-03-21 02:46:55,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 02:46:58,925][04017] Updated weights for policy 0, policy_version 20626 (0.0010) [2024-03-21 02:47:00,521][03784] Fps is (10 sec: 52428.8, 60 sec: 47513.7, 300 sec: 46430.6). Total num frames: 675872768. Throughput: 0: 47884.4. Samples: 677079400. Policy #0 lag: (min: 1.0, avg: 36.6, max: 117.0) [2024-03-21 02:47:00,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 02:47:04,385][04017] Updated weights for policy 0, policy_version 20636 (0.0011) [2024-03-21 02:47:05,521][03784] Fps is (10 sec: 65536.7, 60 sec: 50790.4, 300 sec: 46874.9). Total num frames: 676233216. Throughput: 0: 48460.0. Samples: 677380300. Policy #0 lag: (min: 1.0, avg: 38.9, max: 70.0) [2024-03-21 02:47:05,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 02:47:10,521][03784] Fps is (10 sec: 55705.1, 60 sec: 49151.9, 300 sec: 47097.0). Total num frames: 676429824. Throughput: 0: 48459.9. Samples: 677666600. Policy #0 lag: (min: 1.0, avg: 38.9, max: 70.0) [2024-03-21 02:47:10,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 02:47:11,174][04017] Updated weights for policy 0, policy_version 20646 (0.0014) [2024-03-21 02:47:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 47319.2). Total num frames: 676757504. Throughput: 0: 48308.9. Samples: 677798400. Policy #0 lag: (min: 1.0, avg: 38.9, max: 70.0) [2024-03-21 02:47:15,522][03784] Avg episode reward: [(0, '0.363')] [2024-03-21 02:47:19,619][04017] Updated weights for policy 0, policy_version 20656 (0.0011) [2024-03-21 02:47:20,339][03995] Signal inference workers to stop experience collection... (13600 times) [2024-03-21 02:47:20,401][03995] Signal inference workers to resume experience collection... 
(13600 times) [2024-03-21 02:47:20,487][04017] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-03-21 02:47:20,520][04017] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-03-21 02:47:20,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 676921344. Throughput: 0: 48531.1. Samples: 678104800. Policy #0 lag: (min: 0.0, avg: 52.1, max: 114.0) [2024-03-21 02:47:20,522][03784] Avg episode reward: [(0, '0.582')] [2024-03-21 02:47:25,521][03784] Fps is (10 sec: 32767.9, 60 sec: 49151.9, 300 sec: 47319.2). Total num frames: 677085184. Throughput: 0: 48391.2. Samples: 678374100. Policy #0 lag: (min: 0.0, avg: 52.1, max: 114.0) [2024-03-21 02:47:25,522][03784] Avg episode reward: [(0, '1.205')] [2024-03-21 02:47:28,138][04017] Updated weights for policy 0, policy_version 20666 (0.0016) [2024-03-21 02:47:30,521][03784] Fps is (10 sec: 29491.5, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 677216256. Throughput: 0: 48122.2. Samples: 678504600. Policy #0 lag: (min: 0.0, avg: 52.1, max: 114.0) [2024-03-21 02:47:30,522][03784] Avg episode reward: [(0, '0.871')] [2024-03-21 02:47:35,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48059.8, 300 sec: 47097.1). Total num frames: 677478400. Throughput: 0: 47975.5. Samples: 678805800. Policy #0 lag: (min: 0.0, avg: 30.5, max: 83.0) [2024-03-21 02:47:35,522][03784] Avg episode reward: [(0, '0.784')] [2024-03-21 02:47:36,112][04017] Updated weights for policy 0, policy_version 20676 (0.0016) [2024-03-21 02:47:40,521][03784] Fps is (10 sec: 49151.8, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 677707776. Throughput: 0: 47720.1. Samples: 679088800. 
Policy #0 lag: (min: 0.0, avg: 30.5, max: 83.0) [2024-03-21 02:47:40,522][03784] Avg episode reward: [(0, '0.571')] [2024-03-21 02:47:42,985][04017] Updated weights for policy 0, policy_version 20686 (0.0010) [2024-03-21 02:47:45,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 677937152. Throughput: 0: 47780.1. Samples: 679229500. Policy #0 lag: (min: 0.0, avg: 30.5, max: 83.0) [2024-03-21 02:47:45,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 02:47:48,685][04017] Updated weights for policy 0, policy_version 20696 (0.0010) [2024-03-21 02:47:50,521][03784] Fps is (10 sec: 58981.8, 60 sec: 49151.9, 300 sec: 46986.0). Total num frames: 678297600. Throughput: 0: 47342.1. Samples: 679510700. Policy #0 lag: (min: 1.0, avg: 31.7, max: 64.0) [2024-03-21 02:47:50,522][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 02:47:52,804][04017] Updated weights for policy 0, policy_version 20706 (0.0016) [2024-03-21 02:47:55,521][03784] Fps is (10 sec: 68812.4, 60 sec: 50790.5, 300 sec: 47208.2). Total num frames: 678625280. Throughput: 0: 46833.4. Samples: 679774100. Policy #0 lag: (min: 1.0, avg: 31.7, max: 64.0) [2024-03-21 02:47:55,522][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 02:48:00,521][03784] Fps is (10 sec: 49152.3, 60 sec: 48605.8, 300 sec: 47208.2). Total num frames: 678789120. Throughput: 0: 47111.1. Samples: 679918400. Policy #0 lag: (min: 1.0, avg: 31.7, max: 64.0) [2024-03-21 02:48:00,522][03784] Avg episode reward: [(0, '0.722')] [2024-03-21 02:48:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020715_678789120.pth... [2024-03-21 02:48:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020372_667549696.pth [2024-03-21 02:48:01,937][04017] Updated weights for policy 0, policy_version 20716 (0.0015) [2024-03-21 02:48:05,521][03784] Fps is (10 sec: 42599.0, 60 sec: 46967.6, 300 sec: 47430.3). 
Total num frames: 679051264. Throughput: 0: 46678.0. Samples: 680205300. Policy #0 lag: (min: 0.0, avg: 46.1, max: 106.0) [2024-03-21 02:48:05,521][03784] Avg episode reward: [(0, '0.778')] [2024-03-21 02:48:06,508][04017] Updated weights for policy 0, policy_version 20726 (0.0019) [2024-03-21 02:48:10,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 679247872. Throughput: 0: 46764.5. Samples: 680478500. Policy #0 lag: (min: 0.0, avg: 46.1, max: 106.0) [2024-03-21 02:48:10,522][03784] Avg episode reward: [(0, '0.522')] [2024-03-21 02:48:15,531][03784] Fps is (10 sec: 29462.4, 60 sec: 43137.6, 300 sec: 46540.2). Total num frames: 679346176. Throughput: 0: 46983.2. Samples: 680619300. Policy #0 lag: (min: 0.0, avg: 46.1, max: 106.0) [2024-03-21 02:48:15,532][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 02:48:16,376][03995] Signal inference workers to stop experience collection... (13650 times) [2024-03-21 02:48:16,450][03995] Signal inference workers to resume experience collection... (13650 times) [2024-03-21 02:48:16,468][04017] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-03-21 02:48:16,519][04017] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-03-21 02:48:17,195][04017] Updated weights for policy 0, policy_version 20736 (0.0010) [2024-03-21 02:48:20,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 679673856. Throughput: 0: 46504.4. Samples: 680898500. Policy #0 lag: (min: 1.0, avg: 24.4, max: 58.0) [2024-03-21 02:48:20,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 02:48:23,444][04017] Updated weights for policy 0, policy_version 20746 (0.0024) [2024-03-21 02:48:25,521][03784] Fps is (10 sec: 45919.3, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 679804928. Throughput: 0: 46257.8. Samples: 681170400. 
Policy #0 lag: (min: 1.0, avg: 24.4, max: 58.0) [2024-03-21 02:48:25,522][03784] Avg episode reward: [(0, '0.682')] [2024-03-21 02:48:30,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 679936000. Throughput: 0: 45993.3. Samples: 681299200. Policy #0 lag: (min: 1.0, avg: 24.4, max: 58.0) [2024-03-21 02:48:30,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 02:48:34,086][04017] Updated weights for policy 0, policy_version 20756 (0.0011) [2024-03-21 02:48:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 680198144. Throughput: 0: 46500.1. Samples: 681603200. Policy #0 lag: (min: 0.0, avg: 34.1, max: 80.0) [2024-03-21 02:48:35,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 02:48:38,155][04017] Updated weights for policy 0, policy_version 20766 (0.0011) [2024-03-21 02:48:40,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 680591360. Throughput: 0: 46404.4. Samples: 681862300. Policy #0 lag: (min: 0.0, avg: 34.1, max: 80.0) [2024-03-21 02:48:40,523][03784] Avg episode reward: [(0, '1.079')] [2024-03-21 02:48:42,725][04017] Updated weights for policy 0, policy_version 20776 (0.0011) [2024-03-21 02:48:45,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48605.7, 300 sec: 46985.9). Total num frames: 680853504. Throughput: 0: 46426.6. Samples: 682007600. Policy #0 lag: (min: 0.0, avg: 34.1, max: 80.0) [2024-03-21 02:48:45,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 02:48:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 681050112. Throughput: 0: 47124.3. Samples: 682325900. Policy #0 lag: (min: 0.0, avg: 41.2, max: 78.0) [2024-03-21 02:48:50,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 02:48:51,799][04017] Updated weights for policy 0, policy_version 20786 (0.0015) [2024-03-21 02:48:55,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 46763.8). 
Total num frames: 681312256. Throughput: 0: 47117.8. Samples: 682598800. Policy #0 lag: (min: 0.0, avg: 41.2, max: 78.0) [2024-03-21 02:48:55,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 02:48:56,792][04017] Updated weights for policy 0, policy_version 20796 (0.0011) [2024-03-21 02:49:00,521][03784] Fps is (10 sec: 58981.6, 60 sec: 47513.5, 300 sec: 46986.8). Total num frames: 681639936. Throughput: 0: 46338.7. Samples: 682704100. Policy #0 lag: (min: 0.0, avg: 41.2, max: 78.0) [2024-03-21 02:49:00,522][03784] Avg episode reward: [(0, '0.987')] [2024-03-21 02:49:01,674][04017] Updated weights for policy 0, policy_version 20806 (0.0020) [2024-03-21 02:49:02,874][03995] Signal inference workers to stop experience collection... (13700 times) [2024-03-21 02:49:02,948][03995] Signal inference workers to resume experience collection... (13700 times) [2024-03-21 02:49:02,962][04017] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-03-21 02:49:03,007][04017] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-03-21 02:49:05,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46967.3, 300 sec: 47208.1). Total num frames: 681869312. Throughput: 0: 46500.0. Samples: 682991000. Policy #0 lag: (min: 4.0, avg: 50.3, max: 116.0) [2024-03-21 02:49:05,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 02:49:10,521][03784] Fps is (10 sec: 42599.0, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 682065920. Throughput: 0: 46540.0. Samples: 683264700. Policy #0 lag: (min: 4.0, avg: 50.3, max: 116.0) [2024-03-21 02:49:10,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 02:49:15,521][03784] Fps is (10 sec: 19660.7, 60 sec: 45336.3, 300 sec: 45986.3). Total num frames: 682065920. Throughput: 0: 46984.3. Samples: 683413500. 
Policy #0 lag: (min: 4.0, avg: 50.3, max: 116.0) [2024-03-21 02:49:15,522][03784] Avg episode reward: [(0, '1.456')] [2024-03-21 02:49:15,674][04017] Updated weights for policy 0, policy_version 20816 (0.0009) [2024-03-21 02:49:19,327][04017] Updated weights for policy 0, policy_version 20826 (0.0015) [2024-03-21 02:49:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 682426368. Throughput: 0: 45922.2. Samples: 683669700. Policy #0 lag: (min: 1.0, avg: 40.6, max: 85.0) [2024-03-21 02:49:20,522][03784] Avg episode reward: [(0, '1.405')] [2024-03-21 02:49:25,521][03784] Fps is (10 sec: 52429.4, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 682590208. Throughput: 0: 46311.2. Samples: 683946300. Policy #0 lag: (min: 1.0, avg: 40.6, max: 85.0) [2024-03-21 02:49:25,522][03784] Avg episode reward: [(0, '1.260')] [2024-03-21 02:49:30,207][04017] Updated weights for policy 0, policy_version 20836 (0.0012) [2024-03-21 02:49:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 682754048. Throughput: 0: 46337.9. Samples: 684092800. Policy #0 lag: (min: 1.0, avg: 40.6, max: 85.0) [2024-03-21 02:49:30,522][03784] Avg episode reward: [(0, '0.560')] [2024-03-21 02:49:33,576][04017] Updated weights for policy 0, policy_version 20846 (0.0020) [2024-03-21 02:49:35,521][03784] Fps is (10 sec: 62259.2, 60 sec: 50244.3, 300 sec: 46541.7). Total num frames: 683212800. Throughput: 0: 45035.6. Samples: 684352500. Policy #0 lag: (min: 2.0, avg: 33.1, max: 74.0) [2024-03-21 02:49:35,522][03784] Avg episode reward: [(0, '0.560')] [2024-03-21 02:49:39,165][04017] Updated weights for policy 0, policy_version 20856 (0.0012) [2024-03-21 02:49:40,521][03784] Fps is (10 sec: 75365.4, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 683507712. Throughput: 0: 45035.5. Samples: 684625400. 
Policy #0 lag: (min: 2.0, avg: 33.1, max: 74.0) [2024-03-21 02:49:40,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 02:49:42,939][04017] Updated weights for policy 0, policy_version 20866 (0.0011) [2024-03-21 02:49:45,521][03784] Fps is (10 sec: 62258.7, 60 sec: 49698.2, 300 sec: 47652.4). Total num frames: 683835392. Throughput: 0: 45622.3. Samples: 684757100. Policy #0 lag: (min: 0.0, avg: 46.6, max: 83.0) [2024-03-21 02:49:45,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 02:49:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 683966464. Throughput: 0: 45962.2. Samples: 685059300. Policy #0 lag: (min: 0.0, avg: 46.6, max: 83.0) [2024-03-21 02:49:50,522][03784] Avg episode reward: [(0, '1.407')] [2024-03-21 02:49:55,521][03784] Fps is (10 sec: 13107.3, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 683966464. Throughput: 0: 46275.6. Samples: 685347100. Policy #0 lag: (min: 0.0, avg: 46.6, max: 83.0) [2024-03-21 02:49:55,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 02:49:56,143][03995] Signal inference workers to stop experience collection... (13750 times) [2024-03-21 02:49:56,144][03995] Signal inference workers to resume experience collection... (13750 times) [2024-03-21 02:49:56,207][04017] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-03-21 02:49:56,207][04017] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-03-21 02:49:57,173][04017] Updated weights for policy 0, policy_version 20876 (0.0010) [2024-03-21 02:50:00,521][03784] Fps is (10 sec: 22937.2, 60 sec: 42598.4, 300 sec: 46652.7). Total num frames: 684195840. Throughput: 0: 46071.0. Samples: 685486700. Policy #0 lag: (min: 0.0, avg: 41.0, max: 108.0) [2024-03-21 02:50:00,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 02:50:00,789][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020881_684228608.pth... 
[2024-03-21 02:50:00,882][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020539_673021952.pth [2024-03-21 02:50:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 46541.7). Total num frames: 684359680. Throughput: 0: 47133.4. Samples: 685790700. Policy #0 lag: (min: 0.0, avg: 41.0, max: 108.0) [2024-03-21 02:50:05,522][03784] Avg episode reward: [(0, '0.635')] [2024-03-21 02:50:06,142][04017] Updated weights for policy 0, policy_version 20886 (0.0011) [2024-03-21 02:50:10,521][03784] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 47097.1). Total num frames: 684621824. Throughput: 0: 47837.7. Samples: 686099000. Policy #0 lag: (min: 0.0, avg: 41.0, max: 108.0) [2024-03-21 02:50:10,528][03784] Avg episode reward: [(0, '1.497')] [2024-03-21 02:50:13,272][04017] Updated weights for policy 0, policy_version 20896 (0.0011) [2024-03-21 02:50:15,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.4, 300 sec: 46986.0). Total num frames: 684851200. Throughput: 0: 47951.0. Samples: 686250600. Policy #0 lag: (min: 1.0, avg: 44.8, max: 103.0) [2024-03-21 02:50:15,522][03784] Avg episode reward: [(0, '1.497')] [2024-03-21 02:50:18,258][04017] Updated weights for policy 0, policy_version 20906 (0.0010) [2024-03-21 02:50:20,521][03784] Fps is (10 sec: 55706.2, 60 sec: 45875.3, 300 sec: 46652.8). Total num frames: 685178880. Throughput: 0: 47977.8. Samples: 686511500. Policy #0 lag: (min: 1.0, avg: 44.8, max: 103.0) [2024-03-21 02:50:20,522][03784] Avg episode reward: [(0, '1.045')] [2024-03-21 02:50:22,326][04017] Updated weights for policy 0, policy_version 20916 (0.0013) [2024-03-21 02:50:25,521][03784] Fps is (10 sec: 75367.1, 60 sec: 50244.3, 300 sec: 47319.2). Total num frames: 685604864. Throughput: 0: 47457.9. Samples: 686761000. 
Policy #0 lag: (min: 1.0, avg: 44.8, max: 103.0) [2024-03-21 02:50:25,522][03784] Avg episode reward: [(0, '0.481')] [2024-03-21 02:50:27,123][04017] Updated weights for policy 0, policy_version 20926 (0.0012) [2024-03-21 02:50:30,521][03784] Fps is (10 sec: 72089.2, 60 sec: 52428.8, 300 sec: 47319.2). Total num frames: 685899776. Throughput: 0: 47337.9. Samples: 686887300. Policy #0 lag: (min: 1.0, avg: 41.2, max: 71.0) [2024-03-21 02:50:30,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 02:50:32,318][04017] Updated weights for policy 0, policy_version 20936 (0.0012) [2024-03-21 02:50:35,521][03784] Fps is (10 sec: 65535.2, 60 sec: 50790.3, 300 sec: 47763.5). Total num frames: 686260224. Throughput: 0: 46784.4. Samples: 687164600. Policy #0 lag: (min: 1.0, avg: 41.2, max: 71.0) [2024-03-21 02:50:35,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 02:50:40,088][04017] Updated weights for policy 0, policy_version 20946 (0.0012) [2024-03-21 02:50:40,521][03784] Fps is (10 sec: 45874.2, 60 sec: 47513.5, 300 sec: 47208.1). Total num frames: 686358528. Throughput: 0: 46493.1. Samples: 687439300. Policy #0 lag: (min: 1.0, avg: 41.2, max: 71.0) [2024-03-21 02:50:40,522][03784] Avg episode reward: [(0, '0.486')] [2024-03-21 02:50:44,947][03995] Signal inference workers to stop experience collection... (13800 times) [2024-03-21 02:50:45,014][04017] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-03-21 02:50:45,237][03995] Signal inference workers to resume experience collection... (13800 times) [2024-03-21 02:50:45,238][04017] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-03-21 02:50:45,521][03784] Fps is (10 sec: 13107.2, 60 sec: 42598.4, 300 sec: 46652.7). Total num frames: 686391296. Throughput: 0: 46693.5. Samples: 687587900. 
Policy #0 lag: (min: 0.0, avg: 47.6, max: 82.0) [2024-03-21 02:50:45,522][03784] Avg episode reward: [(0, '0.683')] [2024-03-21 02:50:50,521][03784] Fps is (10 sec: 22938.0, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 686587904. Throughput: 0: 45782.2. Samples: 687850900. Policy #0 lag: (min: 0.0, avg: 47.6, max: 82.0) [2024-03-21 02:50:50,522][03784] Avg episode reward: [(0, '1.062')] [2024-03-21 02:50:52,600][04017] Updated weights for policy 0, policy_version 20956 (0.0018) [2024-03-21 02:50:55,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 686784512. Throughput: 0: 45402.2. Samples: 688142100. Policy #0 lag: (min: 0.0, avg: 47.6, max: 82.0) [2024-03-21 02:50:55,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 02:51:00,521][03784] Fps is (10 sec: 39320.9, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 686981120. Throughput: 0: 45448.7. Samples: 688295800. Policy #0 lag: (min: 0.0, avg: 44.6, max: 114.0) [2024-03-21 02:51:00,522][03784] Avg episode reward: [(0, '0.770')] [2024-03-21 02:51:00,694][04017] Updated weights for policy 0, policy_version 20966 (0.0011) [2024-03-21 02:51:05,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 687243264. Throughput: 0: 46153.2. Samples: 688588400. Policy #0 lag: (min: 0.0, avg: 44.6, max: 114.0) [2024-03-21 02:51:05,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 02:51:06,289][04017] Updated weights for policy 0, policy_version 20976 (0.0015) [2024-03-21 02:51:10,521][03784] Fps is (10 sec: 62259.6, 60 sec: 49698.0, 300 sec: 47097.0). Total num frames: 687603712. Throughput: 0: 46477.6. Samples: 688852500. Policy #0 lag: (min: 0.0, avg: 44.6, max: 114.0) [2024-03-21 02:51:10,522][03784] Avg episode reward: [(0, '0.808')] [2024-03-21 02:51:12,049][04017] Updated weights for policy 0, policy_version 20986 (0.0010) [2024-03-21 02:51:15,521][03784] Fps is (10 sec: 62259.2, 60 sec: 50244.3, 300 sec: 46986.0). 
Total num frames: 687865856. Throughput: 0: 46908.8. Samples: 688998200. Policy #0 lag: (min: 1.0, avg: 39.4, max: 106.0) [2024-03-21 02:51:15,522][03784] Avg episode reward: [(0, '1.352')] [2024-03-21 02:51:17,272][04017] Updated weights for policy 0, policy_version 20996 (0.0011) [2024-03-21 02:51:20,521][03784] Fps is (10 sec: 58982.9, 60 sec: 50244.2, 300 sec: 47652.4). Total num frames: 688193536. Throughput: 0: 46935.6. Samples: 689276700. Policy #0 lag: (min: 1.0, avg: 39.4, max: 106.0) [2024-03-21 02:51:20,522][03784] Avg episode reward: [(0, '0.425')] [2024-03-21 02:51:23,539][04017] Updated weights for policy 0, policy_version 21006 (0.0016) [2024-03-21 02:51:25,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46421.2, 300 sec: 47430.3). Total num frames: 688390144. Throughput: 0: 47086.7. Samples: 689558200. Policy #0 lag: (min: 1.0, avg: 39.4, max: 106.0) [2024-03-21 02:51:25,522][03784] Avg episode reward: [(0, '1.116')] [2024-03-21 02:51:30,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43144.5, 300 sec: 47097.1). Total num frames: 688488448. Throughput: 0: 47024.4. Samples: 689704000. Policy #0 lag: (min: 0.0, avg: 46.3, max: 104.0) [2024-03-21 02:51:30,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 02:51:33,991][04017] Updated weights for policy 0, policy_version 21016 (0.0014) [2024-03-21 02:51:35,521][03784] Fps is (10 sec: 36045.4, 60 sec: 41506.2, 300 sec: 46874.9). Total num frames: 688750592. Throughput: 0: 47860.1. Samples: 690004600. Policy #0 lag: (min: 0.0, avg: 46.3, max: 104.0) [2024-03-21 02:51:35,521][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 02:51:37,800][03995] Signal inference workers to stop experience collection... (13850 times) [2024-03-21 02:51:37,873][03995] Signal inference workers to resume experience collection... 
(13850 times) [2024-03-21 02:51:37,877][04017] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-03-21 02:51:37,924][04017] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-03-21 02:51:38,226][04017] Updated weights for policy 0, policy_version 21026 (0.0025) [2024-03-21 02:51:40,521][03784] Fps is (10 sec: 55705.4, 60 sec: 44783.0, 300 sec: 47097.0). Total num frames: 689045504. Throughput: 0: 47333.3. Samples: 690272100. Policy #0 lag: (min: 0.0, avg: 46.3, max: 104.0) [2024-03-21 02:51:40,522][03784] Avg episode reward: [(0, '0.871')] [2024-03-21 02:51:45,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 689176576. Throughput: 0: 47169.1. Samples: 690418400. Policy #0 lag: (min: 0.0, avg: 46.3, max: 104.0) [2024-03-21 02:51:45,522][03784] Avg episode reward: [(0, '1.262')] [2024-03-21 02:51:50,335][04017] Updated weights for policy 0, policy_version 21036 (0.0015) [2024-03-21 02:51:50,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 689307648. Throughput: 0: 47362.2. Samples: 690719700. Policy #0 lag: (min: 0.0, avg: 48.0, max: 125.0) [2024-03-21 02:51:50,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 02:51:54,872][04017] Updated weights for policy 0, policy_version 21046 (0.0016) [2024-03-21 02:51:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 689700864. Throughput: 0: 47433.5. Samples: 690987000. Policy #0 lag: (min: 0.0, avg: 48.0, max: 125.0) [2024-03-21 02:51:55,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 02:52:00,521][03784] Fps is (10 sec: 58982.2, 60 sec: 48606.0, 300 sec: 46319.5). Total num frames: 689897472. Throughput: 0: 47473.3. Samples: 691134500. 
Policy #0 lag: (min: 0.0, avg: 48.0, max: 125.0) [2024-03-21 02:52:00,522][03784] Avg episode reward: [(0, '1.014')] [2024-03-21 02:52:01,087][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021056_689963008.pth... [2024-03-21 02:52:01,118][04017] Updated weights for policy 0, policy_version 21056 (0.0009) [2024-03-21 02:52:01,208][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020715_678789120.pth [2024-03-21 02:52:05,521][03784] Fps is (10 sec: 42597.8, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 690126848. Throughput: 0: 48097.7. Samples: 691441100. Policy #0 lag: (min: 0.0, avg: 29.9, max: 72.0) [2024-03-21 02:52:05,522][03784] Avg episode reward: [(0, '1.014')] [2024-03-21 02:52:07,194][04017] Updated weights for policy 0, policy_version 21066 (0.0009) [2024-03-21 02:52:10,521][03784] Fps is (10 sec: 58981.8, 60 sec: 48059.7, 300 sec: 46541.6). Total num frames: 690487296. Throughput: 0: 47897.8. Samples: 691713600. Policy #0 lag: (min: 0.0, avg: 29.9, max: 72.0) [2024-03-21 02:52:10,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 02:52:11,744][04017] Updated weights for policy 0, policy_version 21076 (0.0011) [2024-03-21 02:52:15,521][03784] Fps is (10 sec: 58982.3, 60 sec: 47513.5, 300 sec: 46763.8). Total num frames: 690716672. Throughput: 0: 47588.8. Samples: 691845500. Policy #0 lag: (min: 0.0, avg: 29.9, max: 72.0) [2024-03-21 02:52:15,522][03784] Avg episode reward: [(0, '1.294')] [2024-03-21 02:52:20,277][04017] Updated weights for policy 0, policy_version 21086 (0.0011) [2024-03-21 02:52:20,521][03784] Fps is (10 sec: 45876.3, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 690946048. Throughput: 0: 47320.0. Samples: 692134000. Policy #0 lag: (min: 0.0, avg: 42.7, max: 92.0) [2024-03-21 02:52:20,521][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 02:52:25,521][03784] Fps is (10 sec: 52429.2, 60 sec: 47513.7, 300 sec: 47541.4). 
Total num frames: 691240960. Throughput: 0: 47300.0. Samples: 692400600. Policy #0 lag: (min: 0.0, avg: 42.7, max: 92.0) [2024-03-21 02:52:25,522][03784] Avg episode reward: [(0, '0.533')] [2024-03-21 02:52:26,441][04017] Updated weights for policy 0, policy_version 21096 (0.0015) [2024-03-21 02:52:27,918][03995] Signal inference workers to stop experience collection... (13900 times) [2024-03-21 02:52:28,001][04017] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-03-21 02:52:28,144][03995] Signal inference workers to resume experience collection... (13900 times) [2024-03-21 02:52:28,144][04017] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-03-21 02:52:30,521][03784] Fps is (10 sec: 52427.5, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 691470336. Throughput: 0: 47104.2. Samples: 692538100. Policy #0 lag: (min: 0.0, avg: 43.8, max: 94.0) [2024-03-21 02:52:30,522][03784] Avg episode reward: [(0, '1.456')] [2024-03-21 02:52:35,039][04017] Updated weights for policy 0, policy_version 21106 (0.0025) [2024-03-21 02:52:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 47513.5, 300 sec: 47097.0). Total num frames: 691601408. Throughput: 0: 47106.6. Samples: 692839500. Policy #0 lag: (min: 0.0, avg: 43.8, max: 94.0) [2024-03-21 02:52:35,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 02:52:40,525][03784] Fps is (10 sec: 22929.0, 60 sec: 44234.0, 300 sec: 46652.1). Total num frames: 691699712. Throughput: 0: 47140.3. Samples: 693108500. Policy #0 lag: (min: 0.0, avg: 43.8, max: 94.0) [2024-03-21 02:52:40,526][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 02:52:44,219][04017] Updated weights for policy 0, policy_version 21116 (0.0010) [2024-03-21 02:52:45,521][03784] Fps is (10 sec: 42598.6, 60 sec: 47513.5, 300 sec: 46541.7). Total num frames: 692027392. Throughput: 0: 46802.2. Samples: 693240600. 
Policy #0 lag: (min: 1.0, avg: 28.9, max: 89.0) [2024-03-21 02:52:45,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 02:52:50,521][03784] Fps is (10 sec: 52449.1, 60 sec: 48605.8, 300 sec: 46097.3). Total num frames: 692224000. Throughput: 0: 46360.1. Samples: 693527300. Policy #0 lag: (min: 1.0, avg: 28.9, max: 89.0) [2024-03-21 02:52:50,522][03784] Avg episode reward: [(0, '0.983')] [2024-03-21 02:52:50,716][04017] Updated weights for policy 0, policy_version 21126 (0.0011) [2024-03-21 02:52:55,035][04017] Updated weights for policy 0, policy_version 21136 (0.0011) [2024-03-21 02:52:55,521][03784] Fps is (10 sec: 58982.9, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 692617216. Throughput: 0: 46918.0. Samples: 693824900. Policy #0 lag: (min: 1.0, avg: 28.9, max: 89.0) [2024-03-21 02:52:55,522][03784] Avg episode reward: [(0, '0.983')] [2024-03-21 02:53:00,521][03784] Fps is (10 sec: 58983.1, 60 sec: 48606.0, 300 sec: 46652.7). Total num frames: 692813824. Throughput: 0: 47260.2. Samples: 693972200. Policy #0 lag: (min: 0.0, avg: 43.8, max: 92.0) [2024-03-21 02:53:00,522][03784] Avg episode reward: [(0, '0.554')] [2024-03-21 02:53:04,309][04017] Updated weights for policy 0, policy_version 21146 (0.0010) [2024-03-21 02:53:05,521][03784] Fps is (10 sec: 36044.2, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 692977664. Throughput: 0: 47464.2. Samples: 694269900. Policy #0 lag: (min: 0.0, avg: 43.8, max: 92.0) [2024-03-21 02:53:05,522][03784] Avg episode reward: [(0, '0.798')] [2024-03-21 02:53:08,720][04017] Updated weights for policy 0, policy_version 21156 (0.0011) [2024-03-21 02:53:10,521][03784] Fps is (10 sec: 42597.8, 60 sec: 45875.2, 300 sec: 47098.6). Total num frames: 693239808. Throughput: 0: 47722.2. Samples: 694548100. 
Policy #0 lag: (min: 0.0, avg: 43.8, max: 92.0) [2024-03-21 02:53:10,522][03784] Avg episode reward: [(0, '0.946')] [2024-03-21 02:53:15,453][04017] Updated weights for policy 0, policy_version 21166 (0.0021) [2024-03-21 02:53:15,521][03784] Fps is (10 sec: 58982.8, 60 sec: 47513.7, 300 sec: 47097.1). Total num frames: 693567488. Throughput: 0: 47593.5. Samples: 694679800. Policy #0 lag: (min: 0.0, avg: 33.3, max: 74.0) [2024-03-21 02:53:15,522][03784] Avg episode reward: [(0, '0.871')] [2024-03-21 02:53:20,521][03784] Fps is (10 sec: 55705.5, 60 sec: 47513.4, 300 sec: 47430.3). Total num frames: 693796864. Throughput: 0: 47684.4. Samples: 694985300. Policy #0 lag: (min: 0.0, avg: 33.3, max: 74.0) [2024-03-21 02:53:20,522][03784] Avg episode reward: [(0, '0.871')] [2024-03-21 02:53:23,555][04017] Updated weights for policy 0, policy_version 21176 (0.0010) [2024-03-21 02:53:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 47541.4). Total num frames: 693960704. Throughput: 0: 48073.1. Samples: 695271600. Policy #0 lag: (min: 0.0, avg: 33.3, max: 74.0) [2024-03-21 02:53:25,522][03784] Avg episode reward: [(0, '0.846')] [2024-03-21 02:53:26,358][03995] Signal inference workers to stop experience collection... (13950 times) [2024-03-21 02:53:26,406][04017] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-03-21 02:53:26,431][03995] Signal inference workers to resume experience collection... (13950 times) [2024-03-21 02:53:26,446][04017] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-03-21 02:53:30,521][03784] Fps is (10 sec: 39322.0, 60 sec: 45329.2, 300 sec: 47430.3). Total num frames: 694190080. Throughput: 0: 48195.6. Samples: 695409400. 
Policy #0 lag: (min: 3.0, avg: 39.5, max: 114.0) [2024-03-21 02:53:30,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 02:53:33,159][04017] Updated weights for policy 0, policy_version 21186 (0.0015) [2024-03-21 02:53:35,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.3, 300 sec: 46652.8). Total num frames: 694353920. Throughput: 0: 47957.8. Samples: 695685400. Policy #0 lag: (min: 3.0, avg: 39.5, max: 114.0) [2024-03-21 02:53:35,522][03784] Avg episode reward: [(0, '0.637')] [2024-03-21 02:53:39,579][04017] Updated weights for policy 0, policy_version 21196 (0.0010) [2024-03-21 02:53:40,521][03784] Fps is (10 sec: 42598.1, 60 sec: 48609.0, 300 sec: 46652.7). Total num frames: 694616064. Throughput: 0: 47519.9. Samples: 695963300. Policy #0 lag: (min: 3.0, avg: 39.5, max: 114.0) [2024-03-21 02:53:40,522][03784] Avg episode reward: [(0, '0.781')] [2024-03-21 02:53:43,720][04017] Updated weights for policy 0, policy_version 21206 (0.0020) [2024-03-21 02:53:45,521][03784] Fps is (10 sec: 58981.3, 60 sec: 48605.8, 300 sec: 47097.0). Total num frames: 694943744. Throughput: 0: 47124.2. Samples: 696092800. Policy #0 lag: (min: 1.0, avg: 44.6, max: 90.0) [2024-03-21 02:53:45,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 02:53:50,048][04017] Updated weights for policy 0, policy_version 21216 (0.0011) [2024-03-21 02:53:50,521][03784] Fps is (10 sec: 62259.5, 60 sec: 50244.3, 300 sec: 47208.1). Total num frames: 695238656. Throughput: 0: 46975.6. Samples: 696383800. Policy #0 lag: (min: 1.0, avg: 44.6, max: 90.0) [2024-03-21 02:53:50,522][03784] Avg episode reward: [(0, '0.505')] [2024-03-21 02:53:55,521][03784] Fps is (10 sec: 52429.6, 60 sec: 47513.5, 300 sec: 46874.9). Total num frames: 695468032. Throughput: 0: 46626.7. Samples: 696646300. 
Policy #0 lag: (min: 1.0, avg: 44.6, max: 90.0) [2024-03-21 02:53:55,522][03784] Avg episode reward: [(0, '0.588')] [2024-03-21 02:53:55,883][04017] Updated weights for policy 0, policy_version 21226 (0.0011) [2024-03-21 02:54:00,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48059.6, 300 sec: 46874.9). Total num frames: 695697408. Throughput: 0: 46548.8. Samples: 696774500. Policy #0 lag: (min: 1.0, avg: 41.3, max: 80.0) [2024-03-21 02:54:00,522][03784] Avg episode reward: [(0, '0.660')] [2024-03-21 02:54:00,641][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021232_695730176.pth... [2024-03-21 02:54:00,800][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000020881_684228608.pth [2024-03-21 02:54:02,879][04017] Updated weights for policy 0, policy_version 21236 (0.0012) [2024-03-21 02:54:05,521][03784] Fps is (10 sec: 55705.5, 60 sec: 50790.5, 300 sec: 47319.2). Total num frames: 696025088. Throughput: 0: 45606.7. Samples: 697037600. Policy #0 lag: (min: 1.0, avg: 41.3, max: 80.0) [2024-03-21 02:54:05,531][03784] Avg episode reward: [(0, '0.663')] [2024-03-21 02:54:10,015][04017] Updated weights for policy 0, policy_version 21246 (0.0011) [2024-03-21 02:54:10,521][03784] Fps is (10 sec: 49152.8, 60 sec: 49152.1, 300 sec: 47874.6). Total num frames: 696188928. Throughput: 0: 45580.0. Samples: 697322700. Policy #0 lag: (min: 1.0, avg: 41.3, max: 80.0) [2024-03-21 02:54:10,522][03784] Avg episode reward: [(0, '1.024')] [2024-03-21 02:54:15,521][03784] Fps is (10 sec: 32768.4, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 696352768. Throughput: 0: 45853.4. Samples: 697472800. Policy #0 lag: (min: 0.0, avg: 49.3, max: 111.0) [2024-03-21 02:54:15,530][03784] Avg episode reward: [(0, '0.530')] [2024-03-21 02:54:19,331][03995] Signal inference workers to stop experience collection... 
(14000 times) [2024-03-21 02:54:19,434][04017] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-03-21 02:54:19,579][03995] Signal inference workers to resume experience collection... (14000 times) [2024-03-21 02:54:19,579][04017] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-03-21 02:54:20,186][04017] Updated weights for policy 0, policy_version 21256 (0.0019) [2024-03-21 02:54:20,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 696516608. Throughput: 0: 46713.3. Samples: 697787500. Policy #0 lag: (min: 0.0, avg: 49.3, max: 111.0) [2024-03-21 02:54:20,530][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 02:54:25,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 696614912. Throughput: 0: 46946.8. Samples: 698075900. Policy #0 lag: (min: 0.0, avg: 49.3, max: 111.0) [2024-03-21 02:54:25,522][03784] Avg episode reward: [(0, '1.503')] [2024-03-21 02:54:29,361][04017] Updated weights for policy 0, policy_version 21266 (0.0014) [2024-03-21 02:54:30,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45875.1, 300 sec: 46541.6). Total num frames: 696942592. Throughput: 0: 47175.7. Samples: 698215700. Policy #0 lag: (min: 2.0, avg: 32.8, max: 87.0) [2024-03-21 02:54:30,522][03784] Avg episode reward: [(0, '0.674')] [2024-03-21 02:54:33,197][04017] Updated weights for policy 0, policy_version 21276 (0.0011) [2024-03-21 02:54:35,521][03784] Fps is (10 sec: 68811.6, 60 sec: 49151.9, 300 sec: 46763.8). Total num frames: 697303040. Throughput: 0: 46037.6. Samples: 698455500. Policy #0 lag: (min: 2.0, avg: 32.8, max: 87.0) [2024-03-21 02:54:35,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 02:54:40,521][03784] Fps is (10 sec: 52429.2, 60 sec: 47513.7, 300 sec: 46208.4). Total num frames: 697466880. Throughput: 0: 46928.9. Samples: 698758100. 
Policy #0 lag: (min: 2.0, avg: 32.8, max: 87.0) [2024-03-21 02:54:40,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 02:54:40,743][04017] Updated weights for policy 0, policy_version 21286 (0.0016) [2024-03-21 02:54:45,521][03784] Fps is (10 sec: 42599.2, 60 sec: 46421.5, 300 sec: 46652.8). Total num frames: 697729024. Throughput: 0: 47077.9. Samples: 698893000. Policy #0 lag: (min: 0.0, avg: 34.9, max: 72.0) [2024-03-21 02:54:45,522][03784] Avg episode reward: [(0, '0.709')] [2024-03-21 02:54:47,922][04017] Updated weights for policy 0, policy_version 21296 (0.0010) [2024-03-21 02:54:50,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 47430.3). Total num frames: 697958400. Throughput: 0: 47637.7. Samples: 699181300. Policy #0 lag: (min: 0.0, avg: 34.9, max: 72.0) [2024-03-21 02:54:50,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 02:54:52,762][04017] Updated weights for policy 0, policy_version 21306 (0.0014) [2024-03-21 02:54:55,521][03784] Fps is (10 sec: 52427.9, 60 sec: 46421.2, 300 sec: 47652.5). Total num frames: 698253312. Throughput: 0: 47533.1. Samples: 699461700. Policy #0 lag: (min: 0.0, avg: 34.9, max: 72.0) [2024-03-21 02:54:55,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 02:55:00,521][03784] Fps is (10 sec: 45875.8, 60 sec: 45329.2, 300 sec: 47652.4). Total num frames: 698417152. Throughput: 0: 47342.2. Samples: 699603200. Policy #0 lag: (min: 0.0, avg: 34.9, max: 72.0) [2024-03-21 02:55:00,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 02:55:03,135][04017] Updated weights for policy 0, policy_version 21316 (0.0010) [2024-03-21 02:55:05,521][03784] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 47541.4). Total num frames: 698646528. Throughput: 0: 46533.3. Samples: 699881500. 
Policy #0 lag: (min: 0.0, avg: 42.5, max: 106.0) [2024-03-21 02:55:05,522][03784] Avg episode reward: [(0, '1.106')] [2024-03-21 02:55:08,048][04017] Updated weights for policy 0, policy_version 21326 (0.0011) [2024-03-21 02:55:10,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 47541.4). Total num frames: 698875904. Throughput: 0: 46166.7. Samples: 700153400. Policy #0 lag: (min: 0.0, avg: 42.5, max: 106.0) [2024-03-21 02:55:10,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 02:55:11,495][03995] Signal inference workers to stop experience collection... (14050 times) [2024-03-21 02:55:11,574][04017] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-03-21 02:55:11,814][03995] Signal inference workers to resume experience collection... (14050 times) [2024-03-21 02:55:11,814][04017] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-03-21 02:55:15,493][04017] Updated weights for policy 0, policy_version 21336 (0.0012) [2024-03-21 02:55:15,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46421.3, 300 sec: 47319.2). Total num frames: 699138048. Throughput: 0: 45806.8. Samples: 700277000. Policy #0 lag: (min: 0.0, avg: 42.5, max: 106.0) [2024-03-21 02:55:15,522][03784] Avg episode reward: [(0, '0.379')] [2024-03-21 02:55:20,412][04017] Updated weights for policy 0, policy_version 21346 (0.0011) [2024-03-21 02:55:20,521][03784] Fps is (10 sec: 58982.3, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 699465728. Throughput: 0: 46471.3. Samples: 700546700. Policy #0 lag: (min: 2.0, avg: 34.2, max: 72.0) [2024-03-21 02:55:20,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 02:55:25,521][03784] Fps is (10 sec: 39321.1, 60 sec: 48605.7, 300 sec: 46208.4). Total num frames: 699531264. Throughput: 0: 45659.9. Samples: 700812800. 
Policy #0 lag: (min: 2.0, avg: 34.2, max: 72.0) [2024-03-21 02:55:25,522][03784] Avg episode reward: [(0, '1.195')] [2024-03-21 02:55:30,521][03784] Fps is (10 sec: 26214.2, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 699727872. Throughput: 0: 45742.1. Samples: 700951400. Policy #0 lag: (min: 2.0, avg: 34.2, max: 72.0) [2024-03-21 02:55:30,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 02:55:31,998][04017] Updated weights for policy 0, policy_version 21356 (0.0020) [2024-03-21 02:55:35,521][03784] Fps is (10 sec: 52429.9, 60 sec: 45875.4, 300 sec: 46430.6). Total num frames: 700055552. Throughput: 0: 45338.0. Samples: 701221500. Policy #0 lag: (min: 2.0, avg: 40.3, max: 106.0) [2024-03-21 02:55:35,522][03784] Avg episode reward: [(0, '0.560')] [2024-03-21 02:55:37,257][04017] Updated weights for policy 0, policy_version 21366 (0.0014) [2024-03-21 02:55:40,521][03784] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 700186624. Throughput: 0: 45511.3. Samples: 701509700. Policy #0 lag: (min: 2.0, avg: 40.3, max: 106.0) [2024-03-21 02:55:40,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 02:55:43,938][04017] Updated weights for policy 0, policy_version 21376 (0.0015) [2024-03-21 02:55:45,521][03784] Fps is (10 sec: 52428.2, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 700579840. Throughput: 0: 45146.6. Samples: 701634800. Policy #0 lag: (min: 2.0, avg: 40.3, max: 106.0) [2024-03-21 02:55:45,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 02:55:50,521][03784] Fps is (10 sec: 55705.4, 60 sec: 46421.4, 300 sec: 47319.2). Total num frames: 700743680. Throughput: 0: 45346.7. Samples: 701922100. Policy #0 lag: (min: 0.0, avg: 39.2, max: 70.0) [2024-03-21 02:55:50,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 02:55:51,206][04017] Updated weights for policy 0, policy_version 21386 (0.0015) [2024-03-21 02:55:55,521][03784] Fps is (10 sec: 22937.6, 60 sec: 42598.5, 300 sec: 46874.9). 
Total num frames: 700809216. Throughput: 0: 46493.3. Samples: 702245600. Policy #0 lag: (min: 0.0, avg: 39.2, max: 70.0) [2024-03-21 02:55:55,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 02:56:00,521][03784] Fps is (10 sec: 22938.0, 60 sec: 42598.5, 300 sec: 46541.7). Total num frames: 700973056. Throughput: 0: 47044.6. Samples: 702394000. Policy #0 lag: (min: 0.0, avg: 39.2, max: 70.0) [2024-03-21 02:56:00,521][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 02:56:00,799][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021394_701038592.pth... [2024-03-21 02:56:00,935][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021056_689963008.pth [2024-03-21 02:56:01,588][04017] Updated weights for policy 0, policy_version 21396 (0.0010) [2024-03-21 02:56:05,521][03784] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 701333504. Throughput: 0: 47171.0. Samples: 702669400. Policy #0 lag: (min: 0.0, avg: 33.0, max: 112.0) [2024-03-21 02:56:05,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 02:56:05,876][03995] Signal inference workers to stop experience collection... (14100 times) [2024-03-21 02:56:05,916][04017] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-03-21 02:56:06,117][03995] Signal inference workers to resume experience collection... (14100 times) [2024-03-21 02:56:06,118][04017] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-03-21 02:56:06,414][04017] Updated weights for policy 0, policy_version 21406 (0.0020) [2024-03-21 02:56:10,521][03784] Fps is (10 sec: 65535.0, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 701628416. Throughput: 0: 46698.0. Samples: 702914200. 
Policy #0 lag: (min: 0.0, avg: 33.0, max: 112.0) [2024-03-21 02:56:10,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 02:56:12,862][04017] Updated weights for policy 0, policy_version 21416 (0.0014) [2024-03-21 02:56:15,521][03784] Fps is (10 sec: 55706.1, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 701890560. Throughput: 0: 46513.4. Samples: 703044500. Policy #0 lag: (min: 0.0, avg: 33.0, max: 112.0) [2024-03-21 02:56:15,522][03784] Avg episode reward: [(0, '0.535')] [2024-03-21 02:56:18,513][04017] Updated weights for policy 0, policy_version 21426 (0.0010) [2024-03-21 02:56:20,521][03784] Fps is (10 sec: 55705.3, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 702185472. Throughput: 0: 46408.8. Samples: 703309900. Policy #0 lag: (min: 4.0, avg: 42.6, max: 77.0) [2024-03-21 02:56:20,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 02:56:25,521][03784] Fps is (10 sec: 49152.2, 60 sec: 47513.7, 300 sec: 47097.1). Total num frames: 702382080. Throughput: 0: 46893.3. Samples: 703619900. Policy #0 lag: (min: 4.0, avg: 42.6, max: 77.0) [2024-03-21 02:56:25,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 02:56:27,212][04017] Updated weights for policy 0, policy_version 21436 (0.0014) [2024-03-21 02:56:30,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 702545920. Throughput: 0: 47460.0. Samples: 703770500. Policy #0 lag: (min: 4.0, avg: 42.6, max: 77.0) [2024-03-21 02:56:30,522][03784] Avg episode reward: [(0, '0.727')] [2024-03-21 02:56:34,878][04017] Updated weights for policy 0, policy_version 21446 (0.0012) [2024-03-21 02:56:35,521][03784] Fps is (10 sec: 39320.8, 60 sec: 45328.9, 300 sec: 46541.7). Total num frames: 702775296. Throughput: 0: 47348.7. Samples: 704052800. 
Policy #0 lag: (min: 1.0, avg: 36.0, max: 78.0) [2024-03-21 02:56:35,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 02:56:39,403][04017] Updated weights for policy 0, policy_version 21456 (0.0016) [2024-03-21 02:56:40,521][03784] Fps is (10 sec: 52428.6, 60 sec: 48059.7, 300 sec: 47097.0). Total num frames: 703070208. Throughput: 0: 46788.9. Samples: 704351100. Policy #0 lag: (min: 1.0, avg: 36.0, max: 78.0) [2024-03-21 02:56:40,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 02:56:45,521][03784] Fps is (10 sec: 42599.2, 60 sec: 43690.7, 300 sec: 47097.1). Total num frames: 703201280. Throughput: 0: 46939.8. Samples: 704506300. Policy #0 lag: (min: 1.0, avg: 36.0, max: 78.0) [2024-03-21 02:56:45,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 02:56:50,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.6, 300 sec: 46319.5). Total num frames: 703365120. Throughput: 0: 47077.8. Samples: 704787900. Policy #0 lag: (min: 2.0, avg: 43.1, max: 88.0) [2024-03-21 02:56:50,522][03784] Avg episode reward: [(0, '1.261')] [2024-03-21 02:56:51,467][04017] Updated weights for policy 0, policy_version 21466 (0.0019) [2024-03-21 02:56:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 703627264. Throughput: 0: 47966.6. Samples: 705072700. Policy #0 lag: (min: 2.0, avg: 43.1, max: 88.0) [2024-03-21 02:56:55,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 02:56:56,838][04017] Updated weights for policy 0, policy_version 21476 (0.0012) [2024-03-21 02:57:00,521][03784] Fps is (10 sec: 52428.6, 60 sec: 48605.7, 300 sec: 46652.8). Total num frames: 703889408. Throughput: 0: 47851.1. Samples: 705197800. Policy #0 lag: (min: 2.0, avg: 43.1, max: 88.0) [2024-03-21 02:57:00,522][03784] Avg episode reward: [(0, '0.616')] [2024-03-21 02:57:01,316][03995] Signal inference workers to stop experience collection... 
(14150 times) [2024-03-21 02:57:01,417][04017] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-03-21 02:57:01,521][03995] Signal inference workers to resume experience collection... (14150 times) [2024-03-21 02:57:01,521][04017] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-03-21 02:57:01,857][04017] Updated weights for policy 0, policy_version 21486 (0.0017) [2024-03-21 02:57:05,521][03784] Fps is (10 sec: 65536.0, 60 sec: 49152.0, 300 sec: 46763.8). Total num frames: 704282624. Throughput: 0: 47911.1. Samples: 705465900. Policy #0 lag: (min: 1.0, avg: 42.7, max: 101.0) [2024-03-21 02:57:05,522][03784] Avg episode reward: [(0, '1.015')] [2024-03-21 02:57:07,593][04017] Updated weights for policy 0, policy_version 21496 (0.0018) [2024-03-21 02:57:10,521][03784] Fps is (10 sec: 62258.9, 60 sec: 48059.6, 300 sec: 46763.8). Total num frames: 704512000. Throughput: 0: 47771.0. Samples: 705769600. Policy #0 lag: (min: 1.0, avg: 42.7, max: 101.0) [2024-03-21 02:57:10,522][03784] Avg episode reward: [(0, '1.015')] [2024-03-21 02:57:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 704643072. Throughput: 0: 47611.1. Samples: 705913000. Policy #0 lag: (min: 1.0, avg: 42.7, max: 101.0) [2024-03-21 02:57:15,522][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 02:57:16,644][04017] Updated weights for policy 0, policy_version 21506 (0.0019) [2024-03-21 02:57:20,521][03784] Fps is (10 sec: 36045.3, 60 sec: 44783.0, 300 sec: 46208.4). Total num frames: 704872448. Throughput: 0: 47175.7. Samples: 706175700. Policy #0 lag: (min: 0.0, avg: 33.1, max: 71.0) [2024-03-21 02:57:20,522][03784] Avg episode reward: [(0, '1.349')] [2024-03-21 02:57:22,988][04017] Updated weights for policy 0, policy_version 21516 (0.0010) [2024-03-21 02:57:25,521][03784] Fps is (10 sec: 58981.8, 60 sec: 47513.5, 300 sec: 46652.8). Total num frames: 705232896. Throughput: 0: 46753.3. Samples: 706455000. 
Policy #0 lag: (min: 0.0, avg: 33.1, max: 71.0) [2024-03-21 02:57:25,522][03784] Avg episode reward: [(0, '1.079')] [2024-03-21 02:57:27,529][04017] Updated weights for policy 0, policy_version 21526 (0.0013) [2024-03-21 02:57:30,521][03784] Fps is (10 sec: 58981.9, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 705462272. Throughput: 0: 46393.2. Samples: 706594000. Policy #0 lag: (min: 0.0, avg: 33.1, max: 71.0) [2024-03-21 02:57:30,522][03784] Avg episode reward: [(0, '0.581')] [2024-03-21 02:57:35,521][03784] Fps is (10 sec: 36045.4, 60 sec: 46967.7, 300 sec: 47097.7). Total num frames: 705593344. Throughput: 0: 46953.5. Samples: 706900800. Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 02:57:35,521][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 02:57:39,124][04017] Updated weights for policy 0, policy_version 21536 (0.0010) [2024-03-21 02:57:40,521][03784] Fps is (10 sec: 29491.6, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 705757184. Throughput: 0: 47302.3. Samples: 707201300. Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 02:57:40,521][03784] Avg episode reward: [(0, '1.339')] [2024-03-21 02:57:43,139][04017] Updated weights for policy 0, policy_version 21546 (0.0011) [2024-03-21 02:57:45,521][03784] Fps is (10 sec: 55705.1, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 706150400. Throughput: 0: 47326.7. Samples: 707327500. Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 02:57:45,522][03784] Avg episode reward: [(0, '0.774')] [2024-03-21 02:57:49,228][04017] Updated weights for policy 0, policy_version 21556 (0.0011) [2024-03-21 02:57:49,584][03995] Signal inference workers to stop experience collection... (14200 times) [2024-03-21 02:57:49,712][04017] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-03-21 02:57:49,817][03995] Signal inference workers to resume experience collection... 
(14200 times) [2024-03-21 02:57:49,818][04017] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-03-21 02:57:50,521][03784] Fps is (10 sec: 68812.3, 60 sec: 51336.6, 300 sec: 46874.9). Total num frames: 706445312. Throughput: 0: 47840.1. Samples: 707618700. Policy #0 lag: (min: 2.0, avg: 39.4, max: 74.0) [2024-03-21 02:57:50,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 02:57:55,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 706543616. Throughput: 0: 47522.3. Samples: 707908100. Policy #0 lag: (min: 2.0, avg: 39.4, max: 74.0) [2024-03-21 02:57:55,522][03784] Avg episode reward: [(0, '0.526')] [2024-03-21 02:57:59,843][04017] Updated weights for policy 0, policy_version 21566 (0.0012) [2024-03-21 02:58:00,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 706707456. Throughput: 0: 47522.2. Samples: 708051500. Policy #0 lag: (min: 2.0, avg: 39.4, max: 74.0) [2024-03-21 02:58:00,522][03784] Avg episode reward: [(0, '0.873')] [2024-03-21 02:58:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021567_706707456.pth... [2024-03-21 02:58:00,629][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021232_695730176.pth [2024-03-21 02:58:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 706969600. Throughput: 0: 48371.0. Samples: 708352400. Policy #0 lag: (min: 1.0, avg: 49.5, max: 108.0) [2024-03-21 02:58:05,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 02:58:07,893][04017] Updated weights for policy 0, policy_version 21576 (0.0020) [2024-03-21 02:58:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44783.0, 300 sec: 46208.4). Total num frames: 707198976. Throughput: 0: 48242.3. Samples: 708625900. 
Policy #0 lag: (min: 1.0, avg: 49.5, max: 108.0) [2024-03-21 02:58:10,522][03784] Avg episode reward: [(0, '1.359')] [2024-03-21 02:58:13,311][04017] Updated weights for policy 0, policy_version 21586 (0.0020) [2024-03-21 02:58:15,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46421.2, 300 sec: 46208.4). Total num frames: 707428352. Throughput: 0: 47995.5. Samples: 708753800. Policy #0 lag: (min: 1.0, avg: 49.5, max: 108.0) [2024-03-21 02:58:15,522][03784] Avg episode reward: [(0, '1.061')] [2024-03-21 02:58:20,521][03784] Fps is (10 sec: 42597.9, 60 sec: 45875.1, 300 sec: 46319.5). Total num frames: 707624960. Throughput: 0: 47097.6. Samples: 709020200. Policy #0 lag: (min: 0.0, avg: 32.4, max: 67.0) [2024-03-21 02:58:20,522][03784] Avg episode reward: [(0, '0.652')] [2024-03-21 02:58:21,599][04017] Updated weights for policy 0, policy_version 21596 (0.0011) [2024-03-21 02:58:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 46430.6). Total num frames: 707887104. Throughput: 0: 46428.7. Samples: 709290600. Policy #0 lag: (min: 0.0, avg: 32.4, max: 67.0) [2024-03-21 02:58:25,522][03784] Avg episode reward: [(0, '0.472')] [2024-03-21 02:58:26,515][04017] Updated weights for policy 0, policy_version 21606 (0.0016) [2024-03-21 02:58:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 46430.6). Total num frames: 708050944. Throughput: 0: 46671.0. Samples: 709427700. Policy #0 lag: (min: 0.0, avg: 32.4, max: 67.0) [2024-03-21 02:58:30,522][03784] Avg episode reward: [(0, '0.719')] [2024-03-21 02:58:34,911][04017] Updated weights for policy 0, policy_version 21616 (0.0015) [2024-03-21 02:58:35,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 708345856. Throughput: 0: 45991.0. Samples: 709688300. 
Policy #0 lag: (min: 0.0, avg: 66.2, max: 114.0) [2024-03-21 02:58:35,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 02:58:40,196][04017] Updated weights for policy 0, policy_version 21626 (0.0018) [2024-03-21 02:58:40,521][03784] Fps is (10 sec: 62259.9, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 708673536. Throughput: 0: 45315.6. Samples: 709947300. Policy #0 lag: (min: 0.0, avg: 66.2, max: 114.0) [2024-03-21 02:58:40,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 02:58:44,772][03995] Signal inference workers to stop experience collection... (14250 times) [2024-03-21 02:58:44,844][03995] Signal inference workers to resume experience collection... (14250 times) [2024-03-21 02:58:44,849][04017] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-03-21 02:58:44,898][04017] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-03-21 02:58:45,521][03784] Fps is (10 sec: 58982.7, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 708935680. Throughput: 0: 44973.3. Samples: 710075300. Policy #0 lag: (min: 0.0, avg: 66.2, max: 114.0) [2024-03-21 02:58:45,522][03784] Avg episode reward: [(0, '0.745')] [2024-03-21 02:58:46,413][04017] Updated weights for policy 0, policy_version 21636 (0.0019) [2024-03-21 02:58:50,521][03784] Fps is (10 sec: 42598.0, 60 sec: 44236.7, 300 sec: 46208.4). Total num frames: 709099520. Throughput: 0: 44820.0. Samples: 710369300. Policy #0 lag: (min: 0.0, avg: 28.2, max: 84.0) [2024-03-21 02:58:50,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 02:58:54,549][04017] Updated weights for policy 0, policy_version 21646 (0.0010) [2024-03-21 02:58:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 709361664. Throughput: 0: 44964.4. Samples: 710649300. 
Policy #0 lag: (min: 0.0, avg: 28.2, max: 84.0) [2024-03-21 02:58:55,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 02:59:00,521][03784] Fps is (10 sec: 49152.4, 60 sec: 48059.7, 300 sec: 45986.3). Total num frames: 709591040. Throughput: 0: 45431.2. Samples: 710798200. Policy #0 lag: (min: 0.0, avg: 28.2, max: 84.0) [2024-03-21 02:59:00,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 02:59:00,897][04017] Updated weights for policy 0, policy_version 21656 (0.0012) [2024-03-21 02:59:05,521][03784] Fps is (10 sec: 55706.0, 60 sec: 49152.1, 300 sec: 46541.7). Total num frames: 709918720. Throughput: 0: 45640.1. Samples: 711074000. Policy #0 lag: (min: 2.0, avg: 32.2, max: 73.0) [2024-03-21 02:59:05,522][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 02:59:06,249][04017] Updated weights for policy 0, policy_version 21666 (0.0015) [2024-03-21 02:59:10,521][03784] Fps is (10 sec: 45874.5, 60 sec: 47513.5, 300 sec: 46430.6). Total num frames: 710049792. Throughput: 0: 46077.8. Samples: 711364100. Policy #0 lag: (min: 2.0, avg: 32.2, max: 73.0) [2024-03-21 02:59:10,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 02:59:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 710246400. Throughput: 0: 45842.3. Samples: 711490600. Policy #0 lag: (min: 2.0, avg: 32.2, max: 73.0) [2024-03-21 02:59:15,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 02:59:16,086][04017] Updated weights for policy 0, policy_version 21676 (0.0010) [2024-03-21 02:59:20,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 710410240. Throughput: 0: 46431.1. Samples: 711777700. Policy #0 lag: (min: 0.0, avg: 51.3, max: 111.0) [2024-03-21 02:59:20,522][03784] Avg episode reward: [(0, '1.135')] [2024-03-21 02:59:25,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44783.0, 300 sec: 46208.4). Total num frames: 710574080. Throughput: 0: 47675.5. Samples: 712092700. 
Policy #0 lag: (min: 0.0, avg: 51.3, max: 111.0) [2024-03-21 02:59:25,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 02:59:26,495][04017] Updated weights for policy 0, policy_version 21686 (0.0019) [2024-03-21 02:59:30,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 710803456. Throughput: 0: 47906.6. Samples: 712231100. Policy #0 lag: (min: 0.0, avg: 51.3, max: 111.0) [2024-03-21 02:59:30,522][03784] Avg episode reward: [(0, '1.128')] [2024-03-21 02:59:33,066][04017] Updated weights for policy 0, policy_version 21696 (0.0011) [2024-03-21 02:59:35,521][03784] Fps is (10 sec: 49152.1, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 711065600. Throughput: 0: 47364.5. Samples: 712500700. Policy #0 lag: (min: 0.0, avg: 51.3, max: 111.0) [2024-03-21 02:59:35,522][03784] Avg episode reward: [(0, '1.573')] [2024-03-21 02:59:39,106][04017] Updated weights for policy 0, policy_version 21706 (0.0016) [2024-03-21 02:59:40,521][03784] Fps is (10 sec: 52429.1, 60 sec: 44236.7, 300 sec: 46097.3). Total num frames: 711327744. Throughput: 0: 46986.7. Samples: 712763700. Policy #0 lag: (min: 0.0, avg: 32.4, max: 67.0) [2024-03-21 02:59:40,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 02:59:44,948][04017] Updated weights for policy 0, policy_version 21716 (0.0011) [2024-03-21 02:59:45,521][03784] Fps is (10 sec: 55705.3, 60 sec: 44782.9, 300 sec: 46319.5). Total num frames: 711622656. Throughput: 0: 46251.0. Samples: 712879500. Policy #0 lag: (min: 0.0, avg: 32.4, max: 67.0) [2024-03-21 02:59:45,522][03784] Avg episode reward: [(0, '1.479')] [2024-03-21 02:59:48,628][03995] Signal inference workers to stop experience collection... (14300 times) [2024-03-21 02:59:48,691][03995] Signal inference workers to resume experience collection... 
(14300 times) [2024-03-21 02:59:48,696][04017] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-03-21 02:59:48,738][04017] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-03-21 02:59:49,633][04017] Updated weights for policy 0, policy_version 21726 (0.0016) [2024-03-21 02:59:50,521][03784] Fps is (10 sec: 58982.6, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 711917568. Throughput: 0: 45828.8. Samples: 713136300. Policy #0 lag: (min: 0.0, avg: 32.4, max: 67.0) [2024-03-21 02:59:50,522][03784] Avg episode reward: [(0, '1.516')] [2024-03-21 02:59:55,262][04017] Updated weights for policy 0, policy_version 21736 (0.0013) [2024-03-21 02:59:55,521][03784] Fps is (10 sec: 62259.3, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 712245248. Throughput: 0: 45775.7. Samples: 713424000. Policy #0 lag: (min: 3.0, avg: 58.3, max: 115.0) [2024-03-21 02:59:55,522][03784] Avg episode reward: [(0, '1.282')] [2024-03-21 03:00:00,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45875.1, 300 sec: 46430.6). Total num frames: 712343552. Throughput: 0: 46157.7. Samples: 713567700. Policy #0 lag: (min: 3.0, avg: 58.3, max: 115.0) [2024-03-21 03:00:00,522][03784] Avg episode reward: [(0, '1.032')] [2024-03-21 03:00:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021739_712343552.pth... [2024-03-21 03:00:00,688][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021394_701038592.pth [2024-03-21 03:00:05,521][03784] Fps is (10 sec: 22937.5, 60 sec: 42598.3, 300 sec: 46097.3). Total num frames: 712474624. Throughput: 0: 46433.3. Samples: 713867200. Policy #0 lag: (min: 3.0, avg: 58.3, max: 115.0) [2024-03-21 03:00:05,522][03784] Avg episode reward: [(0, '1.132')] [2024-03-21 03:00:07,083][04017] Updated weights for policy 0, policy_version 21746 (0.0015) [2024-03-21 03:00:10,521][03784] Fps is (10 sec: 39322.0, 60 sec: 44783.0, 300 sec: 46097.4). 
Total num frames: 712736768. Throughput: 0: 44988.9. Samples: 714117200. Policy #0 lag: (min: 0.0, avg: 39.6, max: 79.0) [2024-03-21 03:00:10,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 03:00:13,636][04017] Updated weights for policy 0, policy_version 21756 (0.0012) [2024-03-21 03:00:15,521][03784] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 712998912. Throughput: 0: 45024.5. Samples: 714257200. Policy #0 lag: (min: 0.0, avg: 39.6, max: 79.0) [2024-03-21 03:00:15,522][03784] Avg episode reward: [(0, '0.547')] [2024-03-21 03:00:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 713195520. Throughput: 0: 45235.5. Samples: 714536300. Policy #0 lag: (min: 0.0, avg: 39.6, max: 79.0) [2024-03-21 03:00:20,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 03:00:20,607][04017] Updated weights for policy 0, policy_version 21766 (0.0011) [2024-03-21 03:00:25,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48059.8, 300 sec: 46541.7). Total num frames: 713457664. Throughput: 0: 45893.4. Samples: 714828900. Policy #0 lag: (min: 1.0, avg: 39.6, max: 80.0) [2024-03-21 03:00:25,522][03784] Avg episode reward: [(0, '0.608')] [2024-03-21 03:00:26,228][04017] Updated weights for policy 0, policy_version 21776 (0.0011) [2024-03-21 03:00:30,521][03784] Fps is (10 sec: 55705.6, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 713752576. Throughput: 0: 46462.2. Samples: 714970300. Policy #0 lag: (min: 1.0, avg: 39.6, max: 80.0) [2024-03-21 03:00:30,522][03784] Avg episode reward: [(0, '0.593')] [2024-03-21 03:00:33,112][04017] Updated weights for policy 0, policy_version 21786 (0.0015) [2024-03-21 03:00:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 713981952. Throughput: 0: 47393.4. Samples: 715269000. 
Policy #0 lag: (min: 1.0, avg: 39.6, max: 80.0) [2024-03-21 03:00:35,522][03784] Avg episode reward: [(0, '0.593')] [2024-03-21 03:00:40,521][03784] Fps is (10 sec: 36043.7, 60 sec: 46421.1, 300 sec: 45875.1). Total num frames: 714113024. Throughput: 0: 47333.0. Samples: 715554000. Policy #0 lag: (min: 0.0, avg: 46.1, max: 88.0) [2024-03-21 03:00:40,522][03784] Avg episode reward: [(0, '0.762')] [2024-03-21 03:00:41,540][04017] Updated weights for policy 0, policy_version 21796 (0.0009) [2024-03-21 03:00:45,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 714407936. Throughput: 0: 47153.4. Samples: 715689600. Policy #0 lag: (min: 0.0, avg: 46.1, max: 88.0) [2024-03-21 03:00:45,522][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 03:00:48,923][03995] Signal inference workers to stop experience collection... (14350 times) [2024-03-21 03:00:49,027][04017] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-03-21 03:00:49,115][03995] Signal inference workers to resume experience collection... (14350 times) [2024-03-21 03:00:49,116][04017] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-03-21 03:00:49,120][04017] Updated weights for policy 0, policy_version 21806 (0.0012) [2024-03-21 03:00:50,521][03784] Fps is (10 sec: 52430.1, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 714637312. Throughput: 0: 47337.7. Samples: 715997400. Policy #0 lag: (min: 0.0, avg: 46.1, max: 88.0) [2024-03-21 03:00:50,523][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 03:00:54,099][04017] Updated weights for policy 0, policy_version 21816 (0.0024) [2024-03-21 03:00:55,521][03784] Fps is (10 sec: 58982.6, 60 sec: 45875.2, 300 sec: 47541.3). Total num frames: 714997760. Throughput: 0: 48044.4. Samples: 716279200. 
Policy #0 lag: (min: 0.0, avg: 42.1, max: 92.0) [2024-03-21 03:00:55,522][03784] Avg episode reward: [(0, '1.132')] [2024-03-21 03:00:57,231][04017] Updated weights for policy 0, policy_version 21826 (0.0012) [2024-03-21 03:01:00,521][03784] Fps is (10 sec: 75367.1, 60 sec: 50790.5, 300 sec: 47652.5). Total num frames: 715390976. Throughput: 0: 47802.3. Samples: 716408300. Policy #0 lag: (min: 0.0, avg: 42.1, max: 92.0) [2024-03-21 03:01:00,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 03:01:04,171][04017] Updated weights for policy 0, policy_version 21836 (0.0021) [2024-03-21 03:01:05,521][03784] Fps is (10 sec: 55705.9, 60 sec: 51336.6, 300 sec: 47208.1). Total num frames: 715554816. Throughput: 0: 47846.8. Samples: 716689400. Policy #0 lag: (min: 0.0, avg: 42.1, max: 92.0) [2024-03-21 03:01:05,522][03784] Avg episode reward: [(0, '0.955')] [2024-03-21 03:01:10,521][03784] Fps is (10 sec: 22937.6, 60 sec: 48059.7, 300 sec: 46541.7). Total num frames: 715620352. Throughput: 0: 48311.1. Samples: 717002900. Policy #0 lag: (min: 0.0, avg: 42.1, max: 92.0) [2024-03-21 03:01:10,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 03:01:15,521][03784] Fps is (10 sec: 26214.0, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 715816960. Throughput: 0: 48564.4. Samples: 717155700. Policy #0 lag: (min: 0.0, avg: 38.1, max: 75.0) [2024-03-21 03:01:15,522][03784] Avg episode reward: [(0, '0.645')] [2024-03-21 03:01:16,056][04017] Updated weights for policy 0, policy_version 21846 (0.0010) [2024-03-21 03:01:20,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46421.3, 300 sec: 46097.3). Total num frames: 715980800. Throughput: 0: 48535.5. Samples: 717453100. Policy #0 lag: (min: 0.0, avg: 38.1, max: 75.0) [2024-03-21 03:01:20,522][03784] Avg episode reward: [(0, '0.703')] [2024-03-21 03:01:25,204][04017] Updated weights for policy 0, policy_version 21856 (0.0010) [2024-03-21 03:01:25,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.0, 300 sec: 46208.4). 
Total num frames: 716177408. Throughput: 0: 48811.5. Samples: 717750500. Policy #0 lag: (min: 0.0, avg: 38.1, max: 75.0) [2024-03-21 03:01:25,522][03784] Avg episode reward: [(0, '0.703')] [2024-03-21 03:01:30,515][04017] Updated weights for policy 0, policy_version 21866 (0.0015) [2024-03-21 03:01:30,521][03784] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 716505088. Throughput: 0: 48797.7. Samples: 717885500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 72.0) [2024-03-21 03:01:30,522][03784] Avg episode reward: [(0, '0.912')] [2024-03-21 03:01:35,521][03784] Fps is (10 sec: 58982.5, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 716767232. Throughput: 0: 47162.3. Samples: 718119700. Policy #0 lag: (min: 0.0, avg: 34.7, max: 72.0) [2024-03-21 03:01:35,522][03784] Avg episode reward: [(0, '0.985')] [2024-03-21 03:01:36,266][04017] Updated weights for policy 0, policy_version 21876 (0.0012) [2024-03-21 03:01:38,561][03995] Signal inference workers to stop experience collection... (14400 times) [2024-03-21 03:01:38,562][03995] Signal inference workers to resume experience collection... (14400 times) [2024-03-21 03:01:38,636][04017] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-03-21 03:01:38,636][04017] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-03-21 03:01:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48606.1, 300 sec: 46874.9). Total num frames: 717029376. Throughput: 0: 47431.0. Samples: 718413600. Policy #0 lag: (min: 0.0, avg: 34.7, max: 72.0) [2024-03-21 03:01:40,522][03784] Avg episode reward: [(0, '1.505')] [2024-03-21 03:01:41,725][04017] Updated weights for policy 0, policy_version 21886 (0.0016) [2024-03-21 03:01:45,521][03784] Fps is (10 sec: 58981.6, 60 sec: 49151.9, 300 sec: 47430.3). Total num frames: 717357056. Throughput: 0: 47653.2. Samples: 718552700. 
Policy #0 lag: (min: 1.0, avg: 35.5, max: 66.0) [2024-03-21 03:01:45,522][03784] Avg episode reward: [(0, '1.505')] [2024-03-21 03:01:48,118][04017] Updated weights for policy 0, policy_version 21896 (0.0016) [2024-03-21 03:01:50,521][03784] Fps is (10 sec: 62259.5, 60 sec: 50244.3, 300 sec: 47541.4). Total num frames: 717651968. Throughput: 0: 47911.0. Samples: 718845400. Policy #0 lag: (min: 1.0, avg: 35.5, max: 66.0) [2024-03-21 03:01:50,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 03:01:54,608][04017] Updated weights for policy 0, policy_version 21906 (0.0012) [2024-03-21 03:01:55,521][03784] Fps is (10 sec: 49152.9, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 717848576. Throughput: 0: 47062.2. Samples: 719120700. Policy #0 lag: (min: 1.0, avg: 35.5, max: 66.0) [2024-03-21 03:01:55,522][03784] Avg episode reward: [(0, '1.399')] [2024-03-21 03:02:00,521][03784] Fps is (10 sec: 42598.6, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 718077952. Throughput: 0: 46769.0. Samples: 719260300. Policy #0 lag: (min: 0.0, avg: 38.8, max: 73.0) [2024-03-21 03:02:00,522][03784] Avg episode reward: [(0, '0.725')] [2024-03-21 03:02:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021914_718077952.pth... [2024-03-21 03:02:00,664][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021567_706707456.pth [2024-03-21 03:02:01,459][04017] Updated weights for policy 0, policy_version 21916 (0.0016) [2024-03-21 03:02:05,521][03784] Fps is (10 sec: 58981.8, 60 sec: 48059.6, 300 sec: 47208.1). Total num frames: 718438400. Throughput: 0: 46800.0. Samples: 719559100. Policy #0 lag: (min: 0.0, avg: 38.8, max: 73.0) [2024-03-21 03:02:05,522][03784] Avg episode reward: [(0, '1.443')] [2024-03-21 03:02:08,569][04017] Updated weights for policy 0, policy_version 21926 (0.0012) [2024-03-21 03:02:10,521][03784] Fps is (10 sec: 39321.2, 60 sec: 47513.5, 300 sec: 46874.9). 
Total num frames: 718471168. Throughput: 0: 46906.6. Samples: 719861300. Policy #0 lag: (min: 0.0, avg: 38.8, max: 73.0) [2024-03-21 03:02:10,522][03784] Avg episode reward: [(0, '0.875')] [2024-03-21 03:02:15,521][03784] Fps is (10 sec: 22937.6, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 718667776. Throughput: 0: 47280.0. Samples: 720013100. Policy #0 lag: (min: 0.0, avg: 26.9, max: 70.0) [2024-03-21 03:02:15,522][03784] Avg episode reward: [(0, '0.917')] [2024-03-21 03:02:17,530][04017] Updated weights for policy 0, policy_version 21936 (0.0013) [2024-03-21 03:02:20,521][03784] Fps is (10 sec: 52429.4, 60 sec: 50244.3, 300 sec: 46652.8). Total num frames: 718995456. Throughput: 0: 48022.3. Samples: 720280700. Policy #0 lag: (min: 0.0, avg: 26.9, max: 70.0) [2024-03-21 03:02:20,522][03784] Avg episode reward: [(0, '0.487')] [2024-03-21 03:02:22,507][03995] Signal inference workers to stop experience collection... (14450 times) [2024-03-21 03:02:22,508][03995] Signal inference workers to resume experience collection... (14450 times) [2024-03-21 03:02:22,578][04017] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-03-21 03:02:22,578][04017] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-03-21 03:02:22,852][04017] Updated weights for policy 0, policy_version 21946 (0.0011) [2024-03-21 03:02:25,521][03784] Fps is (10 sec: 52429.4, 60 sec: 50244.3, 300 sec: 46541.7). Total num frames: 719192064. Throughput: 0: 47860.2. Samples: 720567300. Policy #0 lag: (min: 0.0, avg: 26.9, max: 70.0) [2024-03-21 03:02:25,522][03784] Avg episode reward: [(0, '0.805')] [2024-03-21 03:02:30,521][03784] Fps is (10 sec: 42597.7, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 719421440. Throughput: 0: 48000.0. Samples: 720712700. 
Policy #0 lag: (min: 0.0, avg: 29.1, max: 61.0) [2024-03-21 03:02:30,522][03784] Avg episode reward: [(0, '0.784')] [2024-03-21 03:02:30,716][04017] Updated weights for policy 0, policy_version 21956 (0.0015) [2024-03-21 03:02:35,521][03784] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 719716352. Throughput: 0: 47808.9. Samples: 720996800. Policy #0 lag: (min: 0.0, avg: 29.1, max: 61.0) [2024-03-21 03:02:35,522][03784] Avg episode reward: [(0, '1.307')] [2024-03-21 03:02:38,222][04017] Updated weights for policy 0, policy_version 21966 (0.0017) [2024-03-21 03:02:40,521][03784] Fps is (10 sec: 55706.6, 60 sec: 49152.1, 300 sec: 46874.9). Total num frames: 719978496. Throughput: 0: 47840.0. Samples: 721273500. Policy #0 lag: (min: 0.0, avg: 29.1, max: 61.0) [2024-03-21 03:02:40,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 03:02:42,787][04017] Updated weights for policy 0, policy_version 21976 (0.0020) [2024-03-21 03:02:45,521][03784] Fps is (10 sec: 55706.1, 60 sec: 48606.0, 300 sec: 46874.9). Total num frames: 720273408. Throughput: 0: 47844.5. Samples: 721413300. Policy #0 lag: (min: 1.0, avg: 48.0, max: 102.0) [2024-03-21 03:02:45,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 03:02:49,282][04017] Updated weights for policy 0, policy_version 21986 (0.0010) [2024-03-21 03:02:50,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 720470016. Throughput: 0: 47688.9. Samples: 721705100. Policy #0 lag: (min: 1.0, avg: 48.0, max: 102.0) [2024-03-21 03:02:50,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 03:02:55,521][03784] Fps is (10 sec: 19660.8, 60 sec: 43690.7, 300 sec: 46652.8). Total num frames: 720470016. Throughput: 0: 47442.4. Samples: 721996200. Policy #0 lag: (min: 1.0, avg: 48.0, max: 102.0) [2024-03-21 03:02:55,522][03784] Avg episode reward: [(0, '0.975')] [2024-03-21 03:03:00,521][03784] Fps is (10 sec: 19660.8, 60 sec: 43144.5, 300 sec: 46430.6). 
Total num frames: 720666624. Throughput: 0: 46875.6. Samples: 722122500. Policy #0 lag: (min: 0.0, avg: 26.5, max: 64.0) [2024-03-21 03:03:00,522][03784] Avg episode reward: [(0, '0.585')] [2024-03-21 03:03:03,485][04017] Updated weights for policy 0, policy_version 21996 (0.0017) [2024-03-21 03:03:05,521][03784] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 46430.6). Total num frames: 720896000. Throughput: 0: 46628.9. Samples: 722379000. Policy #0 lag: (min: 0.0, avg: 26.5, max: 64.0) [2024-03-21 03:03:05,522][03784] Avg episode reward: [(0, '1.322')] [2024-03-21 03:03:07,535][04017] Updated weights for policy 0, policy_version 22006 (0.0017) [2024-03-21 03:03:10,521][03784] Fps is (10 sec: 58982.6, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 721256448. Throughput: 0: 45542.2. Samples: 722616700. Policy #0 lag: (min: 0.0, avg: 26.5, max: 64.0) [2024-03-21 03:03:10,522][03784] Avg episode reward: [(0, '0.764')] [2024-03-21 03:03:12,698][04017] Updated weights for policy 0, policy_version 22016 (0.0015) [2024-03-21 03:03:14,658][03995] Signal inference workers to stop experience collection... (14500 times) [2024-03-21 03:03:14,731][03995] Signal inference workers to resume experience collection... (14500 times) [2024-03-21 03:03:14,737][04017] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-03-21 03:03:14,787][04017] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-03-21 03:03:15,521][03784] Fps is (10 sec: 65535.6, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 721551360. Throughput: 0: 45504.6. Samples: 722760400. Policy #0 lag: (min: 0.0, avg: 26.5, max: 64.0) [2024-03-21 03:03:15,522][03784] Avg episode reward: [(0, '1.442')] [2024-03-21 03:03:18,505][04017] Updated weights for policy 0, policy_version 22026 (0.0011) [2024-03-21 03:03:20,521][03784] Fps is (10 sec: 58982.2, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 721846272. Throughput: 0: 45708.9. Samples: 723053700. 
Policy #0 lag: (min: 0.0, avg: 43.5, max: 106.0) [2024-03-21 03:03:20,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 03:03:25,303][04017] Updated weights for policy 0, policy_version 22036 (0.0016) [2024-03-21 03:03:25,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 722075648. Throughput: 0: 45466.7. Samples: 723319500. Policy #0 lag: (min: 0.0, avg: 43.5, max: 106.0) [2024-03-21 03:03:25,522][03784] Avg episode reward: [(0, '0.583')] [2024-03-21 03:03:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 722305024. Throughput: 0: 45211.0. Samples: 723447800. Policy #0 lag: (min: 0.0, avg: 43.5, max: 106.0) [2024-03-21 03:03:30,522][03784] Avg episode reward: [(0, '0.561')] [2024-03-21 03:03:31,331][04017] Updated weights for policy 0, policy_version 22046 (0.0012) [2024-03-21 03:03:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.3, 300 sec: 46763.8). Total num frames: 722468864. Throughput: 0: 45064.5. Samples: 723733000. Policy #0 lag: (min: 0.0, avg: 37.2, max: 103.0) [2024-03-21 03:03:35,522][03784] Avg episode reward: [(0, '1.329')] [2024-03-21 03:03:40,521][03784] Fps is (10 sec: 36045.2, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 722665472. Throughput: 0: 45282.2. Samples: 724033900. Policy #0 lag: (min: 0.0, avg: 37.2, max: 103.0) [2024-03-21 03:03:40,522][03784] Avg episode reward: [(0, '0.615')] [2024-03-21 03:03:41,474][04017] Updated weights for policy 0, policy_version 22056 (0.0011) [2024-03-21 03:03:45,521][03784] Fps is (10 sec: 39321.1, 60 sec: 43144.4, 300 sec: 46652.7). Total num frames: 722862080. Throughput: 0: 45688.8. Samples: 724178500. 
Policy #0 lag: (min: 0.0, avg: 37.2, max: 103.0) [2024-03-21 03:03:45,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 03:03:47,782][04017] Updated weights for policy 0, policy_version 22066 (0.0015) [2024-03-21 03:03:50,521][03784] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 46652.7). Total num frames: 723124224. Throughput: 0: 46446.6. Samples: 724469100. Policy #0 lag: (min: 0.0, avg: 37.2, max: 103.0) [2024-03-21 03:03:50,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 03:03:55,521][03784] Fps is (10 sec: 49152.5, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 723353600. Throughput: 0: 46988.9. Samples: 724731200. Policy #0 lag: (min: 0.0, avg: 35.0, max: 81.0) [2024-03-21 03:03:55,522][03784] Avg episode reward: [(0, '0.641')] [2024-03-21 03:03:55,792][04017] Updated weights for policy 0, policy_version 22076 (0.0014) [2024-03-21 03:04:00,521][03784] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 723615744. Throughput: 0: 46842.1. Samples: 724868300. Policy #0 lag: (min: 0.0, avg: 35.0, max: 81.0) [2024-03-21 03:04:00,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 03:04:00,630][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022084_723648512.pth... [2024-03-21 03:04:00,772][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021739_712343552.pth [2024-03-21 03:04:01,524][04017] Updated weights for policy 0, policy_version 22086 (0.0026) [2024-03-21 03:04:03,976][03995] Signal inference workers to stop experience collection... (14550 times) [2024-03-21 03:04:04,040][04017] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-03-21 03:04:04,047][03995] Signal inference workers to resume experience collection... 
(14550 times) [2024-03-21 03:04:04,097][04017] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-03-21 03:04:05,521][03784] Fps is (10 sec: 58982.0, 60 sec: 50790.3, 300 sec: 47097.1). Total num frames: 723943424. Throughput: 0: 47033.3. Samples: 725170200. Policy #0 lag: (min: 0.0, avg: 35.0, max: 81.0) [2024-03-21 03:04:05,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 03:04:09,491][04017] Updated weights for policy 0, policy_version 22097 (0.0015) [2024-03-21 03:04:10,521][03784] Fps is (10 sec: 49152.5, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 724107264. Throughput: 0: 47491.1. Samples: 725456600. Policy #0 lag: (min: 0.0, avg: 37.6, max: 73.0) [2024-03-21 03:04:10,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 03:04:15,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 724369408. Throughput: 0: 47629.0. Samples: 725591100. Policy #0 lag: (min: 0.0, avg: 37.6, max: 73.0) [2024-03-21 03:04:15,522][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 03:04:19,305][04017] Updated weights for policy 0, policy_version 22107 (0.0016) [2024-03-21 03:04:20,521][03784] Fps is (10 sec: 29491.1, 60 sec: 42598.4, 300 sec: 46874.9). Total num frames: 724402176. Throughput: 0: 47648.8. Samples: 725877200. Policy #0 lag: (min: 0.0, avg: 37.6, max: 73.0) [2024-03-21 03:04:20,522][03784] Avg episode reward: [(0, '1.071')] [2024-03-21 03:04:25,521][03784] Fps is (10 sec: 19660.7, 60 sec: 41506.1, 300 sec: 46652.8). Total num frames: 724566016. Throughput: 0: 47168.8. Samples: 726156500. Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 03:04:25,522][03784] Avg episode reward: [(0, '1.336')] [2024-03-21 03:04:27,544][04017] Updated weights for policy 0, policy_version 22117 (0.0010) [2024-03-21 03:04:30,521][03784] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 46874.9). Total num frames: 724893696. Throughput: 0: 46560.0. Samples: 726273700. 
Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 03:04:30,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 03:04:34,692][04017] Updated weights for policy 0, policy_version 22127 (0.0012) [2024-03-21 03:04:35,521][03784] Fps is (10 sec: 55705.7, 60 sec: 44236.7, 300 sec: 46763.8). Total num frames: 725123072. Throughput: 0: 46895.5. Samples: 726579400. Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 03:04:35,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 03:04:39,424][04017] Updated weights for policy 0, policy_version 22137 (0.0018) [2024-03-21 03:04:40,521][03784] Fps is (10 sec: 58981.0, 60 sec: 46967.2, 300 sec: 46985.9). Total num frames: 725483520. Throughput: 0: 46370.8. Samples: 726817900. Policy #0 lag: (min: 1.0, avg: 32.9, max: 74.0) [2024-03-21 03:04:40,523][03784] Avg episode reward: [(0, '0.940')] [2024-03-21 03:04:45,049][04017] Updated weights for policy 0, policy_version 22147 (0.0017) [2024-03-21 03:04:45,521][03784] Fps is (10 sec: 62258.7, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 725745664. Throughput: 0: 46686.6. Samples: 726969200. Policy #0 lag: (min: 1.0, avg: 32.9, max: 74.0) [2024-03-21 03:04:45,522][03784] Avg episode reward: [(0, '1.232')] [2024-03-21 03:04:50,521][03784] Fps is (10 sec: 49152.3, 60 sec: 47513.4, 300 sec: 46541.6). Total num frames: 725975040. Throughput: 0: 45835.4. Samples: 727232800. Policy #0 lag: (min: 1.0, avg: 32.9, max: 74.0) [2024-03-21 03:04:50,523][03784] Avg episode reward: [(0, '0.754')] [2024-03-21 03:04:51,192][04017] Updated weights for policy 0, policy_version 22157 (0.0016) [2024-03-21 03:04:51,452][03995] Signal inference workers to stop experience collection... (14600 times) [2024-03-21 03:04:51,515][03995] Signal inference workers to resume experience collection... 
(14600 times) [2024-03-21 03:04:51,558][04017] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-03-21 03:04:51,601][04017] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-03-21 03:04:54,366][04017] Updated weights for policy 0, policy_version 22167 (0.0016) [2024-03-21 03:04:55,521][03784] Fps is (10 sec: 65536.8, 60 sec: 50790.4, 300 sec: 47652.5). Total num frames: 726401024. Throughput: 0: 45084.4. Samples: 727485400. Policy #0 lag: (min: 1.0, avg: 32.9, max: 74.0) [2024-03-21 03:04:55,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 03:05:00,521][03784] Fps is (10 sec: 58983.4, 60 sec: 49152.0, 300 sec: 47763.5). Total num frames: 726564864. Throughput: 0: 45580.0. Samples: 727642200. Policy #0 lag: (min: 0.0, avg: 50.1, max: 106.0) [2024-03-21 03:05:00,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 03:05:02,074][04017] Updated weights for policy 0, policy_version 22177 (0.0019) [2024-03-21 03:05:05,521][03784] Fps is (10 sec: 42598.3, 60 sec: 48059.8, 300 sec: 47763.5). Total num frames: 726827008. Throughput: 0: 46166.6. Samples: 727954700. Policy #0 lag: (min: 0.0, avg: 50.1, max: 106.0) [2024-03-21 03:05:05,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 03:05:09,972][04017] Updated weights for policy 0, policy_version 22187 (0.0010) [2024-03-21 03:05:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48605.8, 300 sec: 47541.4). Total num frames: 727023616. Throughput: 0: 46437.8. Samples: 728246200. Policy #0 lag: (min: 0.0, avg: 50.1, max: 106.0) [2024-03-21 03:05:10,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 03:05:15,521][03784] Fps is (10 sec: 22937.6, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 727056384. Throughput: 0: 47448.9. Samples: 728408900. 
Policy #0 lag: (min: 0.0, avg: 40.5, max: 84.0) [2024-03-21 03:05:15,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 03:05:20,521][03784] Fps is (10 sec: 19660.8, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 727220224. Throughput: 0: 47255.6. Samples: 728705900. Policy #0 lag: (min: 0.0, avg: 40.5, max: 84.0) [2024-03-21 03:05:20,522][03784] Avg episode reward: [(0, '0.558')] [2024-03-21 03:05:22,917][04017] Updated weights for policy 0, policy_version 22197 (0.0011) [2024-03-21 03:05:25,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 727482368. Throughput: 0: 48744.8. Samples: 729011400. Policy #0 lag: (min: 0.0, avg: 40.5, max: 84.0) [2024-03-21 03:05:25,522][03784] Avg episode reward: [(0, '1.710')] [2024-03-21 03:05:25,522][03995] Saving new best policy, reward=1.710! [2024-03-21 03:05:30,392][04017] Updated weights for policy 0, policy_version 22207 (0.0017) [2024-03-21 03:05:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 727678976. Throughput: 0: 48706.7. Samples: 729161000. Policy #0 lag: (min: 0.0, avg: 40.5, max: 84.0) [2024-03-21 03:05:30,522][03784] Avg episode reward: [(0, '0.864')] [2024-03-21 03:05:33,787][04017] Updated weights for policy 0, policy_version 22217 (0.0017) [2024-03-21 03:05:35,521][03784] Fps is (10 sec: 62258.8, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 728104960. Throughput: 0: 48582.4. Samples: 729419000. Policy #0 lag: (min: 3.0, avg: 23.3, max: 93.0) [2024-03-21 03:05:35,522][03784] Avg episode reward: [(0, '0.933')] [2024-03-21 03:05:39,016][04017] Updated weights for policy 0, policy_version 22227 (0.0011) [2024-03-21 03:05:40,521][03784] Fps is (10 sec: 78642.1, 60 sec: 49698.2, 300 sec: 47652.4). Total num frames: 728465408. Throughput: 0: 48939.8. Samples: 729687700. 
Policy #0 lag: (min: 3.0, avg: 23.3, max: 93.0) [2024-03-21 03:05:40,522][03784] Avg episode reward: [(0, '1.024')] [2024-03-21 03:05:40,590][03995] Signal inference workers to stop experience collection... (14650 times) [2024-03-21 03:05:40,658][04017] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-03-21 03:05:40,723][03995] Signal inference workers to resume experience collection... (14650 times) [2024-03-21 03:05:40,723][04017] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-03-21 03:05:44,863][04017] Updated weights for policy 0, policy_version 22237 (0.0017) [2024-03-21 03:05:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 49698.2, 300 sec: 47763.5). Total num frames: 728727552. Throughput: 0: 48555.6. Samples: 729827200. Policy #0 lag: (min: 3.0, avg: 23.3, max: 93.0) [2024-03-21 03:05:45,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 03:05:50,521][03784] Fps is (10 sec: 49152.8, 60 sec: 49698.3, 300 sec: 47319.2). Total num frames: 728956928. Throughput: 0: 47651.1. Samples: 730099000. Policy #0 lag: (min: 1.0, avg: 49.8, max: 97.0) [2024-03-21 03:05:50,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 03:05:51,014][04017] Updated weights for policy 0, policy_version 22247 (0.0020) [2024-03-21 03:05:55,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 729186304. Throughput: 0: 47904.5. Samples: 730401900. Policy #0 lag: (min: 1.0, avg: 49.8, max: 97.0) [2024-03-21 03:05:55,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 03:06:00,521][03784] Fps is (10 sec: 32768.3, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 729284608. Throughput: 0: 47469.0. Samples: 730545000. Policy #0 lag: (min: 1.0, avg: 49.8, max: 97.0) [2024-03-21 03:06:00,522][03784] Avg episode reward: [(0, '1.267')] [2024-03-21 03:06:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022256_729284608.pth... 
[2024-03-21 03:06:00,670][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000021914_718077952.pth [2024-03-21 03:06:04,263][04017] Updated weights for policy 0, policy_version 22257 (0.0011) [2024-03-21 03:06:05,521][03784] Fps is (10 sec: 13107.2, 60 sec: 41506.1, 300 sec: 46430.6). Total num frames: 729317376. Throughput: 0: 47615.6. Samples: 730848600. Policy #0 lag: (min: 0.0, avg: 41.0, max: 124.0) [2024-03-21 03:06:05,522][03784] Avg episode reward: [(0, '0.873')] [2024-03-21 03:06:10,521][03784] Fps is (10 sec: 29490.9, 60 sec: 42598.4, 300 sec: 46652.7). Total num frames: 729579520. Throughput: 0: 46864.4. Samples: 731120300. Policy #0 lag: (min: 0.0, avg: 41.0, max: 124.0) [2024-03-21 03:06:10,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 03:06:11,273][04017] Updated weights for policy 0, policy_version 22267 (0.0017) [2024-03-21 03:06:15,488][04017] Updated weights for policy 0, policy_version 22277 (0.0015) [2024-03-21 03:06:15,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 729972736. Throughput: 0: 46515.5. Samples: 731254200. Policy #0 lag: (min: 0.0, avg: 41.0, max: 124.0) [2024-03-21 03:06:15,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 03:06:20,521][03784] Fps is (10 sec: 62259.7, 60 sec: 49698.2, 300 sec: 47541.4). Total num frames: 730202112. Throughput: 0: 47369.0. Samples: 731550600. Policy #0 lag: (min: 0.0, avg: 69.1, max: 120.0) [2024-03-21 03:06:20,522][03784] Avg episode reward: [(0, '1.530')] [2024-03-21 03:06:21,313][04017] Updated weights for policy 0, policy_version 22287 (0.0011) [2024-03-21 03:06:25,521][03784] Fps is (10 sec: 49152.6, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 730464256. Throughput: 0: 47464.6. Samples: 731823600. 
Policy #0 lag: (min: 0.0, avg: 69.1, max: 120.0) [2024-03-21 03:06:25,522][03784] Avg episode reward: [(0, '1.478')] [2024-03-21 03:06:28,292][04017] Updated weights for policy 0, policy_version 22297 (0.0016) [2024-03-21 03:06:30,521][03784] Fps is (10 sec: 65535.9, 60 sec: 52975.0, 300 sec: 47763.5). Total num frames: 730857472. Throughput: 0: 47364.5. Samples: 731958600. Policy #0 lag: (min: 0.0, avg: 69.1, max: 120.0) [2024-03-21 03:06:30,522][03784] Avg episode reward: [(0, '0.855')] [2024-03-21 03:06:33,878][04017] Updated weights for policy 0, policy_version 22307 (0.0021) [2024-03-21 03:06:35,035][03995] Signal inference workers to stop experience collection... (14700 times) [2024-03-21 03:06:35,035][03995] Signal inference workers to resume experience collection... (14700 times) [2024-03-21 03:06:35,099][04017] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-03-21 03:06:35,099][04017] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-03-21 03:06:35,521][03784] Fps is (10 sec: 55705.8, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 731021312. Throughput: 0: 47429.0. Samples: 732233300. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 03:06:35,522][03784] Avg episode reward: [(0, '1.074')] [2024-03-21 03:06:40,368][04017] Updated weights for policy 0, policy_version 22317 (0.0020) [2024-03-21 03:06:40,521][03784] Fps is (10 sec: 42598.0, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 731283456. Throughput: 0: 47111.0. Samples: 732521900. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 03:06:40,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 03:06:45,521][03784] Fps is (10 sec: 58981.7, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 731611136. Throughput: 0: 47031.0. Samples: 732661400. 
Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 03:06:45,522][03784] Avg episode reward: [(0, '0.473')] [2024-03-21 03:06:45,545][04017] Updated weights for policy 0, policy_version 22327 (0.0010) [2024-03-21 03:06:50,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 731742208. Throughput: 0: 47102.2. Samples: 732968200. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 03:06:50,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 03:06:55,521][03784] Fps is (10 sec: 26214.7, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 731873280. Throughput: 0: 47944.5. Samples: 733277800. Policy #0 lag: (min: 0.0, avg: 37.3, max: 78.0) [2024-03-21 03:06:55,522][03784] Avg episode reward: [(0, '0.932')] [2024-03-21 03:06:57,864][04017] Updated weights for policy 0, policy_version 22337 (0.0015) [2024-03-21 03:07:00,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45875.1, 300 sec: 46097.4). Total num frames: 732037120. Throughput: 0: 48166.7. Samples: 733421700. Policy #0 lag: (min: 0.0, avg: 37.3, max: 78.0) [2024-03-21 03:07:00,522][03784] Avg episode reward: [(0, '1.029')] [2024-03-21 03:07:03,369][04017] Updated weights for policy 0, policy_version 22347 (0.0029) [2024-03-21 03:07:05,521][03784] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 47097.1). Total num frames: 732364800. Throughput: 0: 47244.4. Samples: 733676600. Policy #0 lag: (min: 0.0, avg: 37.3, max: 78.0) [2024-03-21 03:07:05,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 03:07:09,079][04017] Updated weights for policy 0, policy_version 22357 (0.0018) [2024-03-21 03:07:10,521][03784] Fps is (10 sec: 55706.3, 60 sec: 50244.3, 300 sec: 47208.1). Total num frames: 732594176. Throughput: 0: 47588.9. Samples: 733965100. Policy #0 lag: (min: 1.0, avg: 36.4, max: 82.0) [2024-03-21 03:07:10,522][03784] Avg episode reward: [(0, '0.894')] [2024-03-21 03:07:15,521][03784] Fps is (10 sec: 49152.5, 60 sec: 48059.9, 300 sec: 46986.0). 
Total num frames: 732856320. Throughput: 0: 47473.4. Samples: 734094900. Policy #0 lag: (min: 1.0, avg: 36.4, max: 82.0) [2024-03-21 03:07:15,522][03784] Avg episode reward: [(0, '1.266')] [2024-03-21 03:07:16,552][04017] Updated weights for policy 0, policy_version 22367 (0.0012) [2024-03-21 03:07:20,521][03784] Fps is (10 sec: 62258.7, 60 sec: 50244.2, 300 sec: 47541.4). Total num frames: 733216768. Throughput: 0: 47306.6. Samples: 734362100. Policy #0 lag: (min: 1.0, avg: 36.4, max: 82.0) [2024-03-21 03:07:20,522][03784] Avg episode reward: [(0, '1.148')] [2024-03-21 03:07:20,585][04017] Updated weights for policy 0, policy_version 22377 (0.0023) [2024-03-21 03:07:25,521][03784] Fps is (10 sec: 55705.5, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 733413376. Throughput: 0: 47620.1. Samples: 734664800. Policy #0 lag: (min: 0.0, avg: 48.4, max: 103.0) [2024-03-21 03:07:25,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 03:07:27,747][04017] Updated weights for policy 0, policy_version 22387 (0.0028) [2024-03-21 03:07:30,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 733675520. Throughput: 0: 47697.8. Samples: 734807800. Policy #0 lag: (min: 0.0, avg: 48.4, max: 103.0) [2024-03-21 03:07:30,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 03:07:33,595][03995] Signal inference workers to stop experience collection... (14750 times) [2024-03-21 03:07:33,595][03995] Signal inference workers to resume experience collection... (14750 times) [2024-03-21 03:07:33,642][04017] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-03-21 03:07:33,642][04017] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-03-21 03:07:35,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 733741056. Throughput: 0: 47157.8. Samples: 735090300. 
Policy #0 lag: (min: 0.0, avg: 48.4, max: 103.0) [2024-03-21 03:07:35,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 03:07:40,521][03784] Fps is (10 sec: 19660.9, 60 sec: 43144.6, 300 sec: 46097.3). Total num frames: 733872128. Throughput: 0: 46564.4. Samples: 735373200. Policy #0 lag: (min: 0.0, avg: 34.1, max: 108.0) [2024-03-21 03:07:40,522][03784] Avg episode reward: [(0, '0.989')] [2024-03-21 03:07:42,774][04017] Updated weights for policy 0, policy_version 22397 (0.0011) [2024-03-21 03:07:45,521][03784] Fps is (10 sec: 26214.6, 60 sec: 39867.8, 300 sec: 45875.2). Total num frames: 734003200. Throughput: 0: 46351.3. Samples: 735507500. Policy #0 lag: (min: 0.0, avg: 34.1, max: 108.0) [2024-03-21 03:07:45,521][03784] Avg episode reward: [(0, '1.241')] [2024-03-21 03:07:47,583][04017] Updated weights for policy 0, policy_version 22407 (0.0020) [2024-03-21 03:07:50,521][03784] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 46986.0). Total num frames: 734330880. Throughput: 0: 46462.2. Samples: 735767400. Policy #0 lag: (min: 0.0, avg: 34.1, max: 108.0) [2024-03-21 03:07:50,522][03784] Avg episode reward: [(0, '0.540')] [2024-03-21 03:07:55,087][04017] Updated weights for policy 0, policy_version 22417 (0.0015) [2024-03-21 03:07:55,521][03784] Fps is (10 sec: 55705.0, 60 sec: 44782.9, 300 sec: 47097.1). Total num frames: 734560256. Throughput: 0: 46595.5. Samples: 736061900. Policy #0 lag: (min: 0.0, avg: 46.9, max: 105.0) [2024-03-21 03:07:55,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 03:07:59,345][04017] Updated weights for policy 0, policy_version 22427 (0.0018) [2024-03-21 03:08:00,521][03784] Fps is (10 sec: 65536.0, 60 sec: 49152.0, 300 sec: 47763.5). Total num frames: 734986240. Throughput: 0: 46793.2. Samples: 736200600. 
Policy #0 lag: (min: 0.0, avg: 46.9, max: 105.0) [2024-03-21 03:08:00,522][03784] Avg episode reward: [(0, '0.847')] [2024-03-21 03:08:00,823][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022432_735051776.pth... [2024-03-21 03:08:00,920][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022084_723648512.pth [2024-03-21 03:08:02,604][04017] Updated weights for policy 0, policy_version 22437 (0.0012) [2024-03-21 03:08:05,521][03784] Fps is (10 sec: 85196.6, 60 sec: 50790.4, 300 sec: 47985.7). Total num frames: 735412224. Throughput: 0: 46255.6. Samples: 736443600. Policy #0 lag: (min: 0.0, avg: 46.9, max: 105.0) [2024-03-21 03:08:05,522][03784] Avg episode reward: [(0, '0.847')] [2024-03-21 03:08:10,521][03784] Fps is (10 sec: 52429.4, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 735510528. Throughput: 0: 46062.2. Samples: 736737600. Policy #0 lag: (min: 0.0, avg: 46.9, max: 105.0) [2024-03-21 03:08:10,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 03:08:12,482][04017] Updated weights for policy 0, policy_version 22447 (0.0019) [2024-03-21 03:08:15,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 735674368. Throughput: 0: 45948.9. Samples: 736875500. Policy #0 lag: (min: 0.0, avg: 53.3, max: 108.0) [2024-03-21 03:08:15,522][03784] Avg episode reward: [(0, '1.256')] [2024-03-21 03:08:20,417][04017] Updated weights for policy 0, policy_version 22457 (0.0011) [2024-03-21 03:08:20,521][03784] Fps is (10 sec: 36044.0, 60 sec: 44236.7, 300 sec: 46763.8). Total num frames: 735870976. Throughput: 0: 46026.5. Samples: 737161500. Policy #0 lag: (min: 0.0, avg: 53.3, max: 108.0) [2024-03-21 03:08:20,522][03784] Avg episode reward: [(0, '0.476')] [2024-03-21 03:08:24,583][03995] Signal inference workers to stop experience collection... 
(14800 times) [2024-03-21 03:08:24,647][04017] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-03-21 03:08:24,805][03995] Signal inference workers to resume experience collection... (14800 times) [2024-03-21 03:08:24,805][04017] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-03-21 03:08:25,357][04017] Updated weights for policy 0, policy_version 22467 (0.0011) [2024-03-21 03:08:25,521][03784] Fps is (10 sec: 52429.3, 60 sec: 46421.3, 300 sec: 47097.1). Total num frames: 736198656. Throughput: 0: 46142.3. Samples: 737449600. Policy #0 lag: (min: 0.0, avg: 53.3, max: 108.0) [2024-03-21 03:08:25,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 03:08:30,521][03784] Fps is (10 sec: 45876.1, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 736329728. Throughput: 0: 46726.6. Samples: 737610200. Policy #0 lag: (min: 0.0, avg: 27.4, max: 70.0) [2024-03-21 03:08:30,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 03:08:34,263][04017] Updated weights for policy 0, policy_version 22477 (0.0014) [2024-03-21 03:08:35,521][03784] Fps is (10 sec: 39321.3, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 736591872. Throughput: 0: 47224.5. Samples: 737892500. Policy #0 lag: (min: 0.0, avg: 27.4, max: 70.0) [2024-03-21 03:08:35,522][03784] Avg episode reward: [(0, '1.243')] [2024-03-21 03:08:39,878][04017] Updated weights for policy 0, policy_version 22487 (0.0015) [2024-03-21 03:08:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 736854016. Throughput: 0: 47093.4. Samples: 738181100. Policy #0 lag: (min: 0.0, avg: 27.4, max: 70.0) [2024-03-21 03:08:40,522][03784] Avg episode reward: [(0, '1.243')] [2024-03-21 03:08:45,521][03784] Fps is (10 sec: 49152.2, 60 sec: 51336.5, 300 sec: 47319.2). Total num frames: 737083392. Throughput: 0: 47469.0. Samples: 738336700. 
Policy #0 lag: (min: 2.0, avg: 32.5, max: 70.0) [2024-03-21 03:08:45,522][03784] Avg episode reward: [(0, '0.986')] [2024-03-21 03:08:46,614][04017] Updated weights for policy 0, policy_version 22497 (0.0011) [2024-03-21 03:08:50,521][03784] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 47541.4). Total num frames: 737378304. Throughput: 0: 48333.3. Samples: 738618600. Policy #0 lag: (min: 2.0, avg: 32.5, max: 70.0) [2024-03-21 03:08:50,522][03784] Avg episode reward: [(0, '0.986')] [2024-03-21 03:08:54,026][04017] Updated weights for policy 0, policy_version 22507 (0.0013) [2024-03-21 03:08:55,521][03784] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 47319.2). Total num frames: 737574912. Throughput: 0: 47886.6. Samples: 738892500. Policy #0 lag: (min: 2.0, avg: 32.5, max: 70.0) [2024-03-21 03:08:55,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 03:09:00,521][03784] Fps is (10 sec: 32767.8, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 737705984. Throughput: 0: 47668.8. Samples: 739020600. Policy #0 lag: (min: 0.0, avg: 29.3, max: 64.0) [2024-03-21 03:09:00,522][03784] Avg episode reward: [(0, '1.180')] [2024-03-21 03:09:03,044][04017] Updated weights for policy 0, policy_version 22517 (0.0020) [2024-03-21 03:09:05,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 46986.0). Total num frames: 737968128. Throughput: 0: 46809.1. Samples: 739267900. Policy #0 lag: (min: 0.0, avg: 29.3, max: 64.0) [2024-03-21 03:09:05,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 03:09:10,521][03784] Fps is (10 sec: 32768.1, 60 sec: 42052.2, 300 sec: 46319.5). Total num frames: 738033664. Throughput: 0: 46915.4. Samples: 739560800. Policy #0 lag: (min: 0.0, avg: 29.3, max: 64.0) [2024-03-21 03:09:10,522][03784] Avg episode reward: [(0, '1.358')] [2024-03-21 03:09:13,532][04017] Updated weights for policy 0, policy_version 22527 (0.0020) [2024-03-21 03:09:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.9, 300 sec: 47208.1). 
Total num frames: 738328576. Throughput: 0: 46380.0. Samples: 739697300. Policy #0 lag: (min: 0.0, avg: 29.3, max: 64.0) [2024-03-21 03:09:15,521][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 03:09:18,051][04017] Updated weights for policy 0, policy_version 22537 (0.0018) [2024-03-21 03:09:20,521][03784] Fps is (10 sec: 65536.7, 60 sec: 46967.6, 300 sec: 47874.6). Total num frames: 738689024. Throughput: 0: 45593.4. Samples: 739944200. Policy #0 lag: (min: 3.0, avg: 56.4, max: 111.0) [2024-03-21 03:09:20,522][03784] Avg episode reward: [(0, '1.191')] [2024-03-21 03:09:24,073][04017] Updated weights for policy 0, policy_version 22547 (0.0012) [2024-03-21 03:09:24,072][03995] Signal inference workers to stop experience collection... (14850 times) [2024-03-21 03:09:24,079][03995] Signal inference workers to resume experience collection... (14850 times) [2024-03-21 03:09:24,153][04017] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-03-21 03:09:24,153][04017] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-03-21 03:09:25,521][03784] Fps is (10 sec: 62258.3, 60 sec: 45875.1, 300 sec: 47652.4). Total num frames: 738951168. Throughput: 0: 45411.0. Samples: 740224600. Policy #0 lag: (min: 3.0, avg: 56.4, max: 111.0) [2024-03-21 03:09:25,522][03784] Avg episode reward: [(0, '0.866')] [2024-03-21 03:09:29,334][04017] Updated weights for policy 0, policy_version 22557 (0.0010) [2024-03-21 03:09:30,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 739147776. Throughput: 0: 45184.4. Samples: 740370000. Policy #0 lag: (min: 3.0, avg: 56.4, max: 111.0) [2024-03-21 03:09:30,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 03:09:35,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 739311616. Throughput: 0: 45368.9. Samples: 740660200. 
Policy #0 lag: (min: 0.0, avg: 41.4, max: 73.0) [2024-03-21 03:09:35,522][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 03:09:38,031][04017] Updated weights for policy 0, policy_version 22567 (0.0028) [2024-03-21 03:09:40,521][03784] Fps is (10 sec: 49152.6, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 739639296. Throughput: 0: 45609.0. Samples: 740944900. Policy #0 lag: (min: 0.0, avg: 41.4, max: 73.0) [2024-03-21 03:09:40,522][03784] Avg episode reward: [(0, '0.424')] [2024-03-21 03:09:42,241][04017] Updated weights for policy 0, policy_version 22577 (0.0011) [2024-03-21 03:09:45,521][03784] Fps is (10 sec: 72089.5, 60 sec: 49152.0, 300 sec: 47652.5). Total num frames: 740032512. Throughput: 0: 45740.1. Samples: 741078900. Policy #0 lag: (min: 0.0, avg: 41.4, max: 73.0) [2024-03-21 03:09:45,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 03:09:46,525][04017] Updated weights for policy 0, policy_version 22587 (0.0012) [2024-03-21 03:09:50,521][03784] Fps is (10 sec: 65535.6, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 740294656. Throughput: 0: 46855.5. Samples: 741376400. Policy #0 lag: (min: 1.0, avg: 54.7, max: 119.0) [2024-03-21 03:09:50,522][03784] Avg episode reward: [(0, '1.007')] [2024-03-21 03:09:55,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 740392960. Throughput: 0: 47217.8. Samples: 741685600. Policy #0 lag: (min: 1.0, avg: 54.7, max: 119.0) [2024-03-21 03:09:55,522][03784] Avg episode reward: [(0, '1.007')] [2024-03-21 03:09:57,778][04017] Updated weights for policy 0, policy_version 22597 (0.0017) [2024-03-21 03:10:00,521][03784] Fps is (10 sec: 19660.7, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 740491264. Throughput: 0: 47231.0. Samples: 741822700. 
Policy #0 lag: (min: 1.0, avg: 54.7, max: 119.0) [2024-03-21 03:10:00,522][03784] Avg episode reward: [(0, '1.331')] [2024-03-21 03:10:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022598_740491264.pth... [2024-03-21 03:10:00,663][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022256_729284608.pth [2024-03-21 03:10:04,806][04017] Updated weights for policy 0, policy_version 22607 (0.0016) [2024-03-21 03:10:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48059.6, 300 sec: 46874.9). Total num frames: 740851712. Throughput: 0: 48475.4. Samples: 742125600. Policy #0 lag: (min: 3.0, avg: 30.0, max: 70.0) [2024-03-21 03:10:05,522][03784] Avg episode reward: [(0, '1.214')] [2024-03-21 03:10:10,521][03784] Fps is (10 sec: 55705.7, 60 sec: 50244.3, 300 sec: 47430.3). Total num frames: 741048320. Throughput: 0: 48284.5. Samples: 742397400. Policy #0 lag: (min: 3.0, avg: 30.0, max: 70.0) [2024-03-21 03:10:10,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 03:10:11,550][04017] Updated weights for policy 0, policy_version 22617 (0.0018) [2024-03-21 03:10:15,521][03784] Fps is (10 sec: 29491.4, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 741146624. Throughput: 0: 48308.9. Samples: 742543900. Policy #0 lag: (min: 3.0, avg: 30.0, max: 70.0) [2024-03-21 03:10:15,522][03784] Avg episode reward: [(0, '1.405')] [2024-03-21 03:10:15,760][03995] Signal inference workers to stop experience collection... (14900 times) [2024-03-21 03:10:15,775][03995] Signal inference workers to resume experience collection... (14900 times) [2024-03-21 03:10:15,814][04017] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-03-21 03:10:15,814][04017] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-03-21 03:10:20,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 741408768. Throughput: 0: 47866.7. Samples: 742814200. 
Policy #0 lag: (min: 3.0, avg: 30.0, max: 70.0) [2024-03-21 03:10:20,522][03784] Avg episode reward: [(0, '1.199')] [2024-03-21 03:10:21,004][04017] Updated weights for policy 0, policy_version 22627 (0.0017) [2024-03-21 03:10:24,311][04017] Updated weights for policy 0, policy_version 22637 (0.0028) [2024-03-21 03:10:25,521][03784] Fps is (10 sec: 75366.2, 60 sec: 49152.0, 300 sec: 48207.8). Total num frames: 741900288. Throughput: 0: 46919.9. Samples: 743056300. Policy #0 lag: (min: 0.0, avg: 38.4, max: 85.0) [2024-03-21 03:10:25,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 03:10:30,521][03784] Fps is (10 sec: 65536.5, 60 sec: 48606.0, 300 sec: 47319.2). Total num frames: 742064128. Throughput: 0: 47382.3. Samples: 743211100. Policy #0 lag: (min: 0.0, avg: 38.4, max: 85.0) [2024-03-21 03:10:30,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 03:10:31,655][04017] Updated weights for policy 0, policy_version 22647 (0.0015) [2024-03-21 03:10:35,521][03784] Fps is (10 sec: 49152.2, 60 sec: 51336.5, 300 sec: 47208.2). Total num frames: 742391808. Throughput: 0: 47042.2. Samples: 743493300. Policy #0 lag: (min: 0.0, avg: 38.4, max: 85.0) [2024-03-21 03:10:35,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 03:10:37,378][04017] Updated weights for policy 0, policy_version 22657 (0.0015) [2024-03-21 03:10:40,521][03784] Fps is (10 sec: 45874.7, 60 sec: 48059.6, 300 sec: 46763.8). Total num frames: 742522880. Throughput: 0: 46531.1. Samples: 743779500. Policy #0 lag: (min: 0.0, avg: 54.0, max: 110.0) [2024-03-21 03:10:40,522][03784] Avg episode reward: [(0, '1.195')] [2024-03-21 03:10:45,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 742686720. Throughput: 0: 46864.5. Samples: 743931600. 
Policy #0 lag: (min: 0.0, avg: 54.0, max: 110.0) [2024-03-21 03:10:45,522][03784] Avg episode reward: [(0, '1.372')] [2024-03-21 03:10:47,530][04017] Updated weights for policy 0, policy_version 22667 (0.0011) [2024-03-21 03:10:50,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45328.9, 300 sec: 46874.9). Total num frames: 743014400. Throughput: 0: 46542.2. Samples: 744220000. Policy #0 lag: (min: 0.0, avg: 54.0, max: 110.0) [2024-03-21 03:10:50,522][03784] Avg episode reward: [(0, '1.044')] [2024-03-21 03:10:54,390][04017] Updated weights for policy 0, policy_version 22677 (0.0016) [2024-03-21 03:10:55,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45329.2, 300 sec: 46874.9). Total num frames: 743112704. Throughput: 0: 46533.4. Samples: 744491400. Policy #0 lag: (min: 0.0, avg: 38.1, max: 81.0) [2024-03-21 03:10:55,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 03:11:00,521][03784] Fps is (10 sec: 36045.2, 60 sec: 48059.7, 300 sec: 47652.4). Total num frames: 743374848. Throughput: 0: 46537.8. Samples: 744638100. Policy #0 lag: (min: 0.0, avg: 38.1, max: 81.0) [2024-03-21 03:11:00,522][03784] Avg episode reward: [(0, '0.738')] [2024-03-21 03:11:00,690][04017] Updated weights for policy 0, policy_version 22687 (0.0011) [2024-03-21 03:11:01,684][03995] Signal inference workers to stop experience collection... (14950 times) [2024-03-21 03:11:01,685][03995] Signal inference workers to resume experience collection... (14950 times) [2024-03-21 03:11:01,756][04017] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-03-21 03:11:01,756][04017] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-03-21 03:11:05,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.9, 300 sec: 47208.1). Total num frames: 743505920. Throughput: 0: 47162.3. Samples: 744936500. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 81.0) [2024-03-21 03:11:05,522][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 03:11:08,871][04017] Updated weights for policy 0, policy_version 22697 (0.0014) [2024-03-21 03:11:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45875.3, 300 sec: 46874.9). Total num frames: 743800832. Throughput: 0: 48104.5. Samples: 745221000. Policy #0 lag: (min: 0.0, avg: 27.0, max: 56.0) [2024-03-21 03:11:10,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 03:11:15,521][03784] Fps is (10 sec: 49151.7, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 743997440. Throughput: 0: 47911.0. Samples: 745367100. Policy #0 lag: (min: 0.0, avg: 27.0, max: 56.0) [2024-03-21 03:11:15,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 03:11:16,122][04017] Updated weights for policy 0, policy_version 22707 (0.0014) [2024-03-21 03:11:20,371][04017] Updated weights for policy 0, policy_version 22717 (0.0012) [2024-03-21 03:11:20,521][03784] Fps is (10 sec: 58981.6, 60 sec: 49698.1, 300 sec: 47208.1). Total num frames: 744390656. Throughput: 0: 48117.7. Samples: 745658600. Policy #0 lag: (min: 0.0, avg: 27.0, max: 56.0) [2024-03-21 03:11:20,522][03784] Avg episode reward: [(0, '0.908')] [2024-03-21 03:11:25,521][03784] Fps is (10 sec: 68813.6, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 744685568. Throughput: 0: 48017.9. Samples: 745940300. Policy #0 lag: (min: 0.0, avg: 27.0, max: 56.0) [2024-03-21 03:11:25,521][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 03:11:25,698][04017] Updated weights for policy 0, policy_version 22727 (0.0018) [2024-03-21 03:11:29,771][04017] Updated weights for policy 0, policy_version 22737 (0.0016) [2024-03-21 03:11:30,521][03784] Fps is (10 sec: 72090.0, 60 sec: 50790.3, 300 sec: 47763.5). Total num frames: 745111552. Throughput: 0: 47451.1. Samples: 746066900. 
Policy #0 lag: (min: 2.0, avg: 34.9, max: 61.0) [2024-03-21 03:11:30,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 03:11:35,521][03784] Fps is (10 sec: 55704.8, 60 sec: 47513.5, 300 sec: 47319.2). Total num frames: 745242624. Throughput: 0: 47402.3. Samples: 746353100. Policy #0 lag: (min: 2.0, avg: 34.9, max: 61.0) [2024-03-21 03:11:35,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 03:11:40,521][03784] Fps is (10 sec: 22937.7, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 745340928. Throughput: 0: 47817.7. Samples: 746643200. Policy #0 lag: (min: 2.0, avg: 34.9, max: 61.0) [2024-03-21 03:11:40,522][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 03:11:41,650][04017] Updated weights for policy 0, policy_version 22747 (0.0015) [2024-03-21 03:11:45,521][03784] Fps is (10 sec: 29491.2, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 745537536. Throughput: 0: 47731.1. Samples: 746786000. Policy #0 lag: (min: 0.0, avg: 39.1, max: 90.0) [2024-03-21 03:11:45,522][03784] Avg episode reward: [(0, '1.108')] [2024-03-21 03:11:48,455][04017] Updated weights for policy 0, policy_version 22757 (0.0023) [2024-03-21 03:11:50,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44783.0, 300 sec: 46874.9). Total num frames: 745701376. Throughput: 0: 47848.8. Samples: 747089700. Policy #0 lag: (min: 0.0, avg: 39.1, max: 90.0) [2024-03-21 03:11:50,522][03784] Avg episode reward: [(0, '1.267')] [2024-03-21 03:11:50,811][03995] Signal inference workers to stop experience collection... (15000 times) [2024-03-21 03:11:50,811][03995] Signal inference workers to resume experience collection... 
(15000 times) [2024-03-21 03:11:50,869][04017] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-03-21 03:11:50,869][04017] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-03-21 03:11:54,026][04017] Updated weights for policy 0, policy_version 22767 (0.0015) [2024-03-21 03:11:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49151.9, 300 sec: 47541.4). Total num frames: 746061824. Throughput: 0: 48099.9. Samples: 747385500. Policy #0 lag: (min: 0.0, avg: 39.1, max: 90.0) [2024-03-21 03:11:55,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 03:12:00,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 746258432. Throughput: 0: 48122.2. Samples: 747532600. Policy #0 lag: (min: 0.0, avg: 29.4, max: 81.0) [2024-03-21 03:12:00,522][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 03:12:00,813][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022775_746291200.pth... [2024-03-21 03:12:00,898][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022432_735051776.pth [2024-03-21 03:12:03,296][04017] Updated weights for policy 0, policy_version 22777 (0.0014) [2024-03-21 03:12:05,521][03784] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 47208.1). Total num frames: 746520576. Throughput: 0: 48609.0. Samples: 747846000. Policy #0 lag: (min: 0.0, avg: 29.4, max: 81.0) [2024-03-21 03:12:05,522][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 03:12:07,067][04017] Updated weights for policy 0, policy_version 22787 (0.0013) [2024-03-21 03:12:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 47097.0). Total num frames: 746749952. Throughput: 0: 48195.4. Samples: 748109100. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 81.0) [2024-03-21 03:12:10,522][03784] Avg episode reward: [(0, '1.647')] [2024-03-21 03:12:14,001][04017] Updated weights for policy 0, policy_version 22797 (0.0018) [2024-03-21 03:12:15,521][03784] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 46874.9). Total num frames: 747044864. Throughput: 0: 47888.9. Samples: 748221900. Policy #0 lag: (min: 3.0, avg: 38.6, max: 105.0) [2024-03-21 03:12:15,522][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 03:12:20,521][03784] Fps is (10 sec: 49152.0, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 747241472. Throughput: 0: 47551.1. Samples: 748492900. Policy #0 lag: (min: 3.0, avg: 38.6, max: 105.0) [2024-03-21 03:12:20,523][03784] Avg episode reward: [(0, '0.650')] [2024-03-21 03:12:21,953][04017] Updated weights for policy 0, policy_version 22807 (0.0011) [2024-03-21 03:12:25,521][03784] Fps is (10 sec: 29491.6, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 747339776. Throughput: 0: 47484.6. Samples: 748780000. Policy #0 lag: (min: 3.0, avg: 38.6, max: 105.0) [2024-03-21 03:12:25,522][03784] Avg episode reward: [(0, '0.729')] [2024-03-21 03:12:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 39867.7, 300 sec: 46652.7). Total num frames: 747503616. Throughput: 0: 47173.3. Samples: 748908800. Policy #0 lag: (min: 1.0, avg: 27.5, max: 107.0) [2024-03-21 03:12:30,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 03:12:32,972][04017] Updated weights for policy 0, policy_version 22817 (0.0029) [2024-03-21 03:12:35,521][03784] Fps is (10 sec: 55705.0, 60 sec: 44236.8, 300 sec: 47541.4). Total num frames: 747896832. Throughput: 0: 45715.6. Samples: 749146900. 
Policy #0 lag: (min: 1.0, avg: 27.5, max: 107.0) [2024-03-21 03:12:35,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 03:12:36,362][04017] Updated weights for policy 0, policy_version 22827 (0.0012) [2024-03-21 03:12:40,521][03784] Fps is (10 sec: 65536.2, 60 sec: 46967.4, 300 sec: 47985.7). Total num frames: 748158976. Throughput: 0: 45288.8. Samples: 749423500. Policy #0 lag: (min: 1.0, avg: 27.5, max: 107.0) [2024-03-21 03:12:40,522][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 03:12:43,320][03995] Signal inference workers to stop experience collection... (15050 times) [2024-03-21 03:12:43,321][03995] Signal inference workers to resume experience collection... (15050 times) [2024-03-21 03:12:43,389][04017] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-03-21 03:12:43,390][04017] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-03-21 03:12:43,689][04017] Updated weights for policy 0, policy_version 22837 (0.0016) [2024-03-21 03:12:45,521][03784] Fps is (10 sec: 49152.4, 60 sec: 47513.7, 300 sec: 47652.5). Total num frames: 748388352. Throughput: 0: 45300.1. Samples: 749571100. Policy #0 lag: (min: 1.0, avg: 27.5, max: 107.0) [2024-03-21 03:12:45,522][03784] Avg episode reward: [(0, '1.395')] [2024-03-21 03:12:50,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48606.0, 300 sec: 47652.5). Total num frames: 748617728. Throughput: 0: 45166.7. Samples: 749878500. Policy #0 lag: (min: 0.0, avg: 40.6, max: 73.0) [2024-03-21 03:12:50,522][03784] Avg episode reward: [(0, '1.395')] [2024-03-21 03:12:51,097][04017] Updated weights for policy 0, policy_version 22847 (0.0011) [2024-03-21 03:12:54,485][04017] Updated weights for policy 0, policy_version 22857 (0.0021) [2024-03-21 03:12:55,521][03784] Fps is (10 sec: 65535.4, 60 sec: 49698.1, 300 sec: 47652.5). Total num frames: 749043712. Throughput: 0: 45371.2. Samples: 750150800. 
Policy #0 lag: (min: 0.0, avg: 40.6, max: 73.0) [2024-03-21 03:12:55,522][03784] Avg episode reward: [(0, '1.192')] [2024-03-21 03:13:00,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 749076480. Throughput: 0: 46477.8. Samples: 750313400. Policy #0 lag: (min: 0.0, avg: 40.6, max: 73.0) [2024-03-21 03:13:00,522][03784] Avg episode reward: [(0, '1.192')] [2024-03-21 03:13:02,633][04017] Updated weights for policy 0, policy_version 22867 (0.0015) [2024-03-21 03:13:05,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 749436928. Throughput: 0: 46889.0. Samples: 750602900. Policy #0 lag: (min: 1.0, avg: 40.2, max: 90.0) [2024-03-21 03:13:05,522][03784] Avg episode reward: [(0, '0.697')] [2024-03-21 03:13:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.6, 300 sec: 47097.1). Total num frames: 749568000. Throughput: 0: 46744.4. Samples: 750883500. Policy #0 lag: (min: 1.0, avg: 40.2, max: 90.0) [2024-03-21 03:13:10,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 03:13:11,455][04017] Updated weights for policy 0, policy_version 22877 (0.0014) [2024-03-21 03:13:15,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45329.1, 300 sec: 47097.1). Total num frames: 749764608. Throughput: 0: 47186.8. Samples: 751032200. Policy #0 lag: (min: 1.0, avg: 40.2, max: 90.0) [2024-03-21 03:13:15,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 03:13:20,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43144.7, 300 sec: 46208.4). Total num frames: 749830144. Throughput: 0: 48940.1. Samples: 751349200. Policy #0 lag: (min: 0.0, avg: 29.1, max: 67.0) [2024-03-21 03:13:20,521][03784] Avg episode reward: [(0, '0.484')] [2024-03-21 03:13:21,595][04017] Updated weights for policy 0, policy_version 22887 (0.0012) [2024-03-21 03:13:25,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48059.6, 300 sec: 47097.0). Total num frames: 750223360. Throughput: 0: 47746.7. Samples: 751572100. 
Policy #0 lag: (min: 0.0, avg: 29.1, max: 67.0) [2024-03-21 03:13:25,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 03:13:27,010][04017] Updated weights for policy 0, policy_version 22897 (0.0018) [2024-03-21 03:13:30,521][03784] Fps is (10 sec: 55705.4, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 750387200. Throughput: 0: 47020.0. Samples: 751687000. Policy #0 lag: (min: 0.0, avg: 29.1, max: 67.0) [2024-03-21 03:13:30,522][03784] Avg episode reward: [(0, '1.459')] [2024-03-21 03:13:35,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.7, 300 sec: 46319.5). Total num frames: 750518272. Throughput: 0: 46691.1. Samples: 751979600. Policy #0 lag: (min: 0.0, avg: 29.1, max: 67.0) [2024-03-21 03:13:35,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 03:13:36,327][04017] Updated weights for policy 0, policy_version 22907 (0.0014) [2024-03-21 03:13:39,387][03995] Signal inference workers to stop experience collection... (15100 times) [2024-03-21 03:13:39,414][04017] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-03-21 03:13:39,624][03995] Signal inference workers to resume experience collection... (15100 times) [2024-03-21 03:13:39,624][04017] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-03-21 03:13:40,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 46541.7). Total num frames: 750813184. Throughput: 0: 46464.5. Samples: 752241700. Policy #0 lag: (min: 1.0, avg: 28.1, max: 64.0) [2024-03-21 03:13:40,522][03784] Avg episode reward: [(0, '1.187')] [2024-03-21 03:13:42,021][04017] Updated weights for policy 0, policy_version 22917 (0.0017) [2024-03-21 03:13:45,521][03784] Fps is (10 sec: 72088.5, 60 sec: 47513.4, 300 sec: 46986.0). Total num frames: 751239168. Throughput: 0: 45930.9. Samples: 752380300. 
Policy #0 lag: (min: 1.0, avg: 28.1, max: 64.0) [2024-03-21 03:13:45,523][03784] Avg episode reward: [(0, '1.357')] [2024-03-21 03:13:45,852][04017] Updated weights for policy 0, policy_version 22927 (0.0015) [2024-03-21 03:13:50,521][03784] Fps is (10 sec: 72088.9, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 751534080. Throughput: 0: 45237.7. Samples: 752638600. Policy #0 lag: (min: 1.0, avg: 28.1, max: 64.0) [2024-03-21 03:13:50,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 03:13:52,056][04017] Updated weights for policy 0, policy_version 22937 (0.0011) [2024-03-21 03:13:55,521][03784] Fps is (10 sec: 42599.3, 60 sec: 43690.7, 300 sec: 47319.2). Total num frames: 751665152. Throughput: 0: 45677.8. Samples: 752939000. Policy #0 lag: (min: 0.0, avg: 42.1, max: 76.0) [2024-03-21 03:13:55,522][03784] Avg episode reward: [(0, '0.469')] [2024-03-21 03:14:00,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 751894528. Throughput: 0: 45037.7. Samples: 753058900. Policy #0 lag: (min: 0.0, avg: 42.1, max: 76.0) [2024-03-21 03:14:00,522][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 03:14:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022946_751894528.pth... [2024-03-21 03:14:00,676][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022598_740491264.pth [2024-03-21 03:14:01,682][04017] Updated weights for policy 0, policy_version 22947 (0.0014) [2024-03-21 03:14:05,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43144.5, 300 sec: 47430.3). Total num frames: 752025600. Throughput: 0: 44968.8. Samples: 753372800. Policy #0 lag: (min: 0.0, avg: 42.1, max: 76.0) [2024-03-21 03:14:05,522][03784] Avg episode reward: [(0, '1.193')] [2024-03-21 03:14:07,915][04017] Updated weights for policy 0, policy_version 22957 (0.0015) [2024-03-21 03:14:10,521][03784] Fps is (10 sec: 52429.1, 60 sec: 47513.6, 300 sec: 47763.5). 
Total num frames: 752418816. Throughput: 0: 46033.4. Samples: 753643600. Policy #0 lag: (min: 2.0, avg: 45.7, max: 104.0) [2024-03-21 03:14:10,522][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 03:14:15,521][03784] Fps is (10 sec: 52429.3, 60 sec: 46421.4, 300 sec: 46986.0). Total num frames: 752549888. Throughput: 0: 46882.3. Samples: 753796700. Policy #0 lag: (min: 2.0, avg: 45.7, max: 104.0) [2024-03-21 03:14:15,522][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 03:14:15,868][04017] Updated weights for policy 0, policy_version 22967 (0.0015) [2024-03-21 03:14:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 50790.3, 300 sec: 47208.2). Total num frames: 752877568. Throughput: 0: 46388.9. Samples: 754067100. Policy #0 lag: (min: 2.0, avg: 45.7, max: 104.0) [2024-03-21 03:14:20,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 03:14:22,217][04017] Updated weights for policy 0, policy_version 22977 (0.0009) [2024-03-21 03:14:25,521][03784] Fps is (10 sec: 42597.6, 60 sec: 45875.1, 300 sec: 46874.9). Total num frames: 752975872. Throughput: 0: 47728.7. Samples: 754389500. Policy #0 lag: (min: 2.0, avg: 45.7, max: 104.0) [2024-03-21 03:14:25,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 03:14:25,620][03995] Signal inference workers to stop experience collection... (15150 times) [2024-03-21 03:14:25,677][04017] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-03-21 03:14:25,933][03995] Signal inference workers to resume experience collection... (15150 times) [2024-03-21 03:14:25,934][04017] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-03-21 03:14:30,521][03784] Fps is (10 sec: 32767.7, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 753205248. Throughput: 0: 47704.5. Samples: 754527000. 
Policy #0 lag: (min: 0.0, avg: 33.9, max: 78.0) [2024-03-21 03:14:30,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 03:14:30,851][04017] Updated weights for policy 0, policy_version 22987 (0.0011) [2024-03-21 03:14:35,521][03784] Fps is (10 sec: 55706.5, 60 sec: 50244.3, 300 sec: 47097.1). Total num frames: 753532928. Throughput: 0: 48257.9. Samples: 754810200. Policy #0 lag: (min: 0.0, avg: 33.9, max: 78.0) [2024-03-21 03:14:35,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 03:14:35,892][04017] Updated weights for policy 0, policy_version 22997 (0.0015) [2024-03-21 03:14:40,521][03784] Fps is (10 sec: 58982.6, 60 sec: 49698.1, 300 sec: 46652.7). Total num frames: 753795072. Throughput: 0: 47875.5. Samples: 755093400. Policy #0 lag: (min: 0.0, avg: 33.9, max: 78.0) [2024-03-21 03:14:40,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 03:14:41,703][04017] Updated weights for policy 0, policy_version 23007 (0.0011) [2024-03-21 03:14:45,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46967.6, 300 sec: 46652.7). Total num frames: 754057216. Throughput: 0: 48497.9. Samples: 755241300. Policy #0 lag: (min: 0.0, avg: 38.6, max: 90.0) [2024-03-21 03:14:45,522][03784] Avg episode reward: [(0, '1.102')] [2024-03-21 03:14:49,942][04017] Updated weights for policy 0, policy_version 23017 (0.0010) [2024-03-21 03:14:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 754253824. Throughput: 0: 47897.8. Samples: 755528200. Policy #0 lag: (min: 0.0, avg: 38.6, max: 90.0) [2024-03-21 03:14:50,522][03784] Avg episode reward: [(0, '1.102')] [2024-03-21 03:14:55,121][04017] Updated weights for policy 0, policy_version 23027 (0.0015) [2024-03-21 03:14:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48605.9, 300 sec: 47763.5). Total num frames: 754581504. Throughput: 0: 47717.8. Samples: 755790900. 
Policy #0 lag: (min: 0.0, avg: 38.6, max: 90.0) [2024-03-21 03:14:55,522][03784] Avg episode reward: [(0, '0.758')] [2024-03-21 03:15:00,521][03784] Fps is (10 sec: 52428.1, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 754778112. Throughput: 0: 47695.4. Samples: 755943000. Policy #0 lag: (min: 0.0, avg: 34.5, max: 91.0) [2024-03-21 03:15:00,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 03:15:03,042][04017] Updated weights for policy 0, policy_version 23037 (0.0016) [2024-03-21 03:15:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 754974720. Throughput: 0: 48122.3. Samples: 756232600. Policy #0 lag: (min: 0.0, avg: 34.5, max: 91.0) [2024-03-21 03:15:05,522][03784] Avg episode reward: [(0, '1.257')] [2024-03-21 03:15:09,757][04017] Updated weights for policy 0, policy_version 23047 (0.0011) [2024-03-21 03:15:10,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 755236864. Throughput: 0: 47662.4. Samples: 756534300. Policy #0 lag: (min: 0.0, avg: 34.5, max: 91.0) [2024-03-21 03:15:10,522][03784] Avg episode reward: [(0, '1.056')] [2024-03-21 03:15:15,521][03784] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 47763.5). Total num frames: 755499008. Throughput: 0: 47824.6. Samples: 756679100. Policy #0 lag: (min: 1.0, avg: 48.9, max: 97.0) [2024-03-21 03:15:15,522][03784] Avg episode reward: [(0, '0.893')] [2024-03-21 03:15:18,159][04017] Updated weights for policy 0, policy_version 23057 (0.0018) [2024-03-21 03:15:19,539][03995] Signal inference workers to stop experience collection... (15200 times) [2024-03-21 03:15:19,603][04017] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-03-21 03:15:19,623][03995] Signal inference workers to resume experience collection... 
(15200 times) [2024-03-21 03:15:19,647][04017] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-03-21 03:15:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 755695616. Throughput: 0: 47888.9. Samples: 756965200. Policy #0 lag: (min: 1.0, avg: 48.9, max: 97.0) [2024-03-21 03:15:20,522][03784] Avg episode reward: [(0, '1.212')] [2024-03-21 03:15:22,910][04017] Updated weights for policy 0, policy_version 23067 (0.0024) [2024-03-21 03:15:25,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 755859456. Throughput: 0: 47960.1. Samples: 757251600. Policy #0 lag: (min: 1.0, avg: 48.9, max: 97.0) [2024-03-21 03:15:25,522][03784] Avg episode reward: [(0, '0.552')] [2024-03-21 03:15:28,900][04017] Updated weights for policy 0, policy_version 23077 (0.0014) [2024-03-21 03:15:30,521][03784] Fps is (10 sec: 55704.2, 60 sec: 50790.3, 300 sec: 46985.9). Total num frames: 756252672. Throughput: 0: 47690.8. Samples: 757387400. Policy #0 lag: (min: 4.0, avg: 38.7, max: 83.0) [2024-03-21 03:15:30,523][03784] Avg episode reward: [(0, '1.450')] [2024-03-21 03:15:35,521][03784] Fps is (10 sec: 58982.1, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 756449280. Throughput: 0: 47311.0. Samples: 757657200. Policy #0 lag: (min: 4.0, avg: 38.7, max: 83.0) [2024-03-21 03:15:35,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 03:15:38,909][04017] Updated weights for policy 0, policy_version 23087 (0.0010) [2024-03-21 03:15:40,521][03784] Fps is (10 sec: 32768.6, 60 sec: 46421.3, 300 sec: 47097.0). Total num frames: 756580352. Throughput: 0: 47808.8. Samples: 757942300. Policy #0 lag: (min: 4.0, avg: 38.7, max: 83.0) [2024-03-21 03:15:40,522][03784] Avg episode reward: [(0, '1.135')] [2024-03-21 03:15:45,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45329.0, 300 sec: 46652.8). Total num frames: 756776960. Throughput: 0: 47848.9. Samples: 758096200. 
Policy #0 lag: (min: 0.0, avg: 40.0, max: 92.0) [2024-03-21 03:15:45,522][03784] Avg episode reward: [(0, '1.239')] [2024-03-21 03:15:46,396][04017] Updated weights for policy 0, policy_version 23097 (0.0010) [2024-03-21 03:15:50,527][03784] Fps is (10 sec: 45848.1, 60 sec: 46416.7, 300 sec: 47207.2). Total num frames: 757039104. Throughput: 0: 47287.0. Samples: 758360800. Policy #0 lag: (min: 0.0, avg: 40.0, max: 92.0) [2024-03-21 03:15:50,528][03784] Avg episode reward: [(0, '0.713')] [2024-03-21 03:15:52,616][04017] Updated weights for policy 0, policy_version 23107 (0.0017) [2024-03-21 03:15:55,521][03784] Fps is (10 sec: 49151.8, 60 sec: 44782.8, 300 sec: 47097.0). Total num frames: 757268480. Throughput: 0: 47342.1. Samples: 758664700. Policy #0 lag: (min: 0.0, avg: 40.0, max: 92.0) [2024-03-21 03:15:55,522][03784] Avg episode reward: [(0, '1.320')] [2024-03-21 03:15:59,913][04017] Updated weights for policy 0, policy_version 23117 (0.0017) [2024-03-21 03:16:00,521][03784] Fps is (10 sec: 49181.2, 60 sec: 45875.3, 300 sec: 47541.4). Total num frames: 757530624. Throughput: 0: 47426.6. Samples: 758813300. Policy #0 lag: (min: 0.0, avg: 40.0, max: 92.0) [2024-03-21 03:16:00,522][03784] Avg episode reward: [(0, '0.406')] [2024-03-21 03:16:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023118_757530624.pth... [2024-03-21 03:16:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022775_746291200.pth [2024-03-21 03:16:05,494][04017] Updated weights for policy 0, policy_version 23127 (0.0011) [2024-03-21 03:16:05,521][03784] Fps is (10 sec: 55706.5, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 757825536. Throughput: 0: 47008.9. Samples: 759080600. Policy #0 lag: (min: 0.0, avg: 31.1, max: 83.0) [2024-03-21 03:16:05,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 03:16:10,521][03784] Fps is (10 sec: 55706.2, 60 sec: 47513.7, 300 sec: 47763.5). 
Total num frames: 758087680. Throughput: 0: 46549.0. Samples: 759346300. Policy #0 lag: (min: 0.0, avg: 31.1, max: 83.0) [2024-03-21 03:16:10,522][03784] Avg episode reward: [(0, '0.804')] [2024-03-21 03:16:12,746][04017] Updated weights for policy 0, policy_version 23137 (0.0009) [2024-03-21 03:16:12,908][03995] Signal inference workers to stop experience collection... (15250 times) [2024-03-21 03:16:12,996][04017] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-03-21 03:16:13,041][03995] Signal inference workers to resume experience collection... (15250 times) [2024-03-21 03:16:13,115][04017] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-03-21 03:16:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 758349824. Throughput: 0: 46949.2. Samples: 759500100. Policy #0 lag: (min: 0.0, avg: 31.1, max: 83.0) [2024-03-21 03:16:15,522][03784] Avg episode reward: [(0, '0.811')] [2024-03-21 03:16:17,592][04017] Updated weights for policy 0, policy_version 23147 (0.0014) [2024-03-21 03:16:20,521][03784] Fps is (10 sec: 58981.6, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 758677504. Throughput: 0: 47131.1. Samples: 759778100. Policy #0 lag: (min: 1.0, avg: 35.5, max: 63.0) [2024-03-21 03:16:20,522][03784] Avg episode reward: [(0, '0.471')] [2024-03-21 03:16:24,727][04017] Updated weights for policy 0, policy_version 23157 (0.0010) [2024-03-21 03:16:25,521][03784] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 46541.7). Total num frames: 758841344. Throughput: 0: 47306.8. Samples: 760071100. Policy #0 lag: (min: 1.0, avg: 35.5, max: 63.0) [2024-03-21 03:16:25,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 03:16:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.7, 300 sec: 46874.9). Total num frames: 759070720. Throughput: 0: 47297.8. Samples: 760224600. 
Policy #0 lag: (min: 1.0, avg: 35.5, max: 63.0) [2024-03-21 03:16:30,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 03:16:31,321][04017] Updated weights for policy 0, policy_version 23167 (0.0015) [2024-03-21 03:16:35,521][03784] Fps is (10 sec: 39321.1, 60 sec: 46421.3, 300 sec: 47097.0). Total num frames: 759234560. Throughput: 0: 47312.8. Samples: 760489600. Policy #0 lag: (min: 2.0, avg: 58.1, max: 111.0) [2024-03-21 03:16:35,522][03784] Avg episode reward: [(0, '0.762')] [2024-03-21 03:16:40,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 759332864. Throughput: 0: 46666.7. Samples: 760764700. Policy #0 lag: (min: 2.0, avg: 58.1, max: 111.0) [2024-03-21 03:16:40,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 03:16:44,527][04017] Updated weights for policy 0, policy_version 23177 (0.0010) [2024-03-21 03:16:45,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 759496704. Throughput: 0: 46131.1. Samples: 760889200. Policy #0 lag: (min: 2.0, avg: 58.1, max: 111.0) [2024-03-21 03:16:45,522][03784] Avg episode reward: [(0, '0.697')] [2024-03-21 03:16:50,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45333.5, 300 sec: 46430.6). Total num frames: 759758848. Throughput: 0: 45942.1. Samples: 761148000. Policy #0 lag: (min: 1.0, avg: 21.7, max: 49.0) [2024-03-21 03:16:50,522][03784] Avg episode reward: [(0, '1.241')] [2024-03-21 03:16:50,869][04017] Updated weights for policy 0, policy_version 23187 (0.0010) [2024-03-21 03:16:55,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 46319.5). Total num frames: 759922688. Throughput: 0: 46308.8. Samples: 761430200. Policy #0 lag: (min: 1.0, avg: 21.7, max: 49.0) [2024-03-21 03:16:55,522][03784] Avg episode reward: [(0, '1.241')] [2024-03-21 03:16:59,078][04017] Updated weights for policy 0, policy_version 23197 (0.0010) [2024-03-21 03:17:00,521][03784] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 46541.7). 
Total num frames: 760250368. Throughput: 0: 45779.9. Samples: 761560200. Policy #0 lag: (min: 1.0, avg: 21.7, max: 49.0) [2024-03-21 03:17:00,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 03:17:03,867][04017] Updated weights for policy 0, policy_version 23207 (0.0015) [2024-03-21 03:17:05,521][03784] Fps is (10 sec: 58981.6, 60 sec: 44782.8, 300 sec: 46652.7). Total num frames: 760512512. Throughput: 0: 45566.6. Samples: 761828600. Policy #0 lag: (min: 0.0, avg: 32.5, max: 67.0) [2024-03-21 03:17:05,522][03784] Avg episode reward: [(0, '1.488')] [2024-03-21 03:17:09,549][04017] Updated weights for policy 0, policy_version 23217 (0.0012) [2024-03-21 03:17:10,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 760807424. Throughput: 0: 45148.8. Samples: 762102800. Policy #0 lag: (min: 0.0, avg: 32.5, max: 67.0) [2024-03-21 03:17:10,522][03784] Avg episode reward: [(0, '1.488')] [2024-03-21 03:17:13,452][03995] Signal inference workers to stop experience collection... (15300 times) [2024-03-21 03:17:13,453][03995] Signal inference workers to resume experience collection... (15300 times) [2024-03-21 03:17:13,562][04017] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-03-21 03:17:13,562][04017] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-03-21 03:17:14,437][04017] Updated weights for policy 0, policy_version 23227 (0.0011) [2024-03-21 03:17:15,521][03784] Fps is (10 sec: 62259.9, 60 sec: 46421.3, 300 sec: 47097.1). Total num frames: 761135104. Throughput: 0: 45037.8. Samples: 762251300. Policy #0 lag: (min: 0.0, avg: 32.5, max: 67.0) [2024-03-21 03:17:15,522][03784] Avg episode reward: [(0, '1.565')] [2024-03-21 03:17:20,521][03784] Fps is (10 sec: 55706.0, 60 sec: 44783.0, 300 sec: 47541.4). Total num frames: 761364480. Throughput: 0: 45593.5. Samples: 762541300. 
Policy #0 lag: (min: 0.0, avg: 32.5, max: 67.0) [2024-03-21 03:17:20,522][03784] Avg episode reward: [(0, '1.473')] [2024-03-21 03:17:22,670][04017] Updated weights for policy 0, policy_version 23237 (0.0010) [2024-03-21 03:17:25,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45875.1, 300 sec: 47763.5). Total num frames: 761593856. Throughput: 0: 45900.0. Samples: 762830200. Policy #0 lag: (min: 0.0, avg: 50.6, max: 90.0) [2024-03-21 03:17:25,523][03784] Avg episode reward: [(0, '0.773')] [2024-03-21 03:17:30,112][04017] Updated weights for policy 0, policy_version 23247 (0.0016) [2024-03-21 03:17:30,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45329.1, 300 sec: 47097.1). Total num frames: 761790464. Throughput: 0: 46326.6. Samples: 762973900. Policy #0 lag: (min: 0.0, avg: 50.6, max: 90.0) [2024-03-21 03:17:30,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 03:17:35,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 762052608. Throughput: 0: 47002.3. Samples: 763263100. Policy #0 lag: (min: 0.0, avg: 50.6, max: 90.0) [2024-03-21 03:17:35,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 03:17:35,649][04017] Updated weights for policy 0, policy_version 23257 (0.0015) [2024-03-21 03:17:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 762249216. Throughput: 0: 46722.2. Samples: 763532700. Policy #0 lag: (min: 0.0, avg: 45.3, max: 109.0) [2024-03-21 03:17:40,522][03784] Avg episode reward: [(0, '1.187')] [2024-03-21 03:17:44,560][04017] Updated weights for policy 0, policy_version 23267 (0.0011) [2024-03-21 03:17:45,521][03784] Fps is (10 sec: 36045.1, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 762413056. Throughput: 0: 46995.6. Samples: 763675000. 
Policy #0 lag: (min: 0.0, avg: 45.3, max: 109.0) [2024-03-21 03:17:45,522][03784] Avg episode reward: [(0, '0.821')] [2024-03-21 03:17:50,481][04017] Updated weights for policy 0, policy_version 23277 (0.0011) [2024-03-21 03:17:50,521][03784] Fps is (10 sec: 49151.6, 60 sec: 49698.2, 300 sec: 46430.6). Total num frames: 762740736. Throughput: 0: 47675.6. Samples: 763974000. Policy #0 lag: (min: 0.0, avg: 45.3, max: 109.0) [2024-03-21 03:17:50,522][03784] Avg episode reward: [(0, '0.821')] [2024-03-21 03:17:55,521][03784] Fps is (10 sec: 55705.4, 60 sec: 50790.4, 300 sec: 47097.1). Total num frames: 762970112. Throughput: 0: 47351.1. Samples: 764233600. Policy #0 lag: (min: 0.0, avg: 38.8, max: 80.0) [2024-03-21 03:17:55,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 03:17:57,230][04017] Updated weights for policy 0, policy_version 23287 (0.0017) [2024-03-21 03:18:00,521][03784] Fps is (10 sec: 49152.6, 60 sec: 49698.2, 300 sec: 46763.8). Total num frames: 763232256. Throughput: 0: 47017.8. Samples: 764367100. Policy #0 lag: (min: 0.0, avg: 38.8, max: 80.0) [2024-03-21 03:18:00,522][03784] Avg episode reward: [(0, '1.346')] [2024-03-21 03:18:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023292_763232256.pth... [2024-03-21 03:18:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000022946_751894528.pth [2024-03-21 03:18:05,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.3, 300 sec: 46430.6). Total num frames: 763265024. Throughput: 0: 46613.3. Samples: 764638900. Policy #0 lag: (min: 0.0, avg: 38.8, max: 80.0) [2024-03-21 03:18:05,522][03784] Avg episode reward: [(0, '0.709')] [2024-03-21 03:18:07,893][04017] Updated weights for policy 0, policy_version 23297 (0.0015) [2024-03-21 03:18:09,255][03995] Signal inference workers to stop experience collection... (15350 times) [2024-03-21 03:18:09,256][03995] Signal inference workers to resume experience collection... 
(15350 times) [2024-03-21 03:18:09,328][04017] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-03-21 03:18:09,328][04017] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-03-21 03:18:10,521][03784] Fps is (10 sec: 36044.4, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 763592704. Throughput: 0: 46160.0. Samples: 764907400. Policy #0 lag: (min: 0.0, avg: 38.8, max: 80.0) [2024-03-21 03:18:10,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 03:18:14,260][04017] Updated weights for policy 0, policy_version 23307 (0.0012) [2024-03-21 03:18:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 47319.2). Total num frames: 763789312. Throughput: 0: 46524.4. Samples: 765067500. Policy #0 lag: (min: 0.0, avg: 25.6, max: 72.0) [2024-03-21 03:18:15,522][03784] Avg episode reward: [(0, '0.459')] [2024-03-21 03:18:19,077][04017] Updated weights for policy 0, policy_version 23317 (0.0011) [2024-03-21 03:18:20,521][03784] Fps is (10 sec: 49152.5, 60 sec: 45329.0, 300 sec: 46986.0). Total num frames: 764084224. Throughput: 0: 46231.2. Samples: 765343500. Policy #0 lag: (min: 0.0, avg: 25.6, max: 72.0) [2024-03-21 03:18:20,522][03784] Avg episode reward: [(0, '1.031')] [2024-03-21 03:18:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 46874.9). Total num frames: 764215296. Throughput: 0: 47004.4. Samples: 765647900. Policy #0 lag: (min: 0.0, avg: 25.6, max: 72.0) [2024-03-21 03:18:25,522][03784] Avg episode reward: [(0, '1.592')] [2024-03-21 03:18:29,240][04017] Updated weights for policy 0, policy_version 23327 (0.0019) [2024-03-21 03:18:30,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 47208.1). Total num frames: 764444672. Throughput: 0: 46706.7. Samples: 765776800. 
Policy #0 lag: (min: 0.0, avg: 30.9, max: 66.0) [2024-03-21 03:18:30,522][03784] Avg episode reward: [(0, '0.698')] [2024-03-21 03:18:34,099][04017] Updated weights for policy 0, policy_version 23337 (0.0017) [2024-03-21 03:18:35,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 47097.1). Total num frames: 764706816. Throughput: 0: 46431.2. Samples: 766063400. Policy #0 lag: (min: 0.0, avg: 30.9, max: 66.0) [2024-03-21 03:18:35,522][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 03:18:40,521][03784] Fps is (10 sec: 55704.8, 60 sec: 45875.1, 300 sec: 46652.8). Total num frames: 765001728. Throughput: 0: 47051.0. Samples: 766350900. Policy #0 lag: (min: 0.0, avg: 30.9, max: 66.0) [2024-03-21 03:18:40,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 03:18:40,767][04017] Updated weights for policy 0, policy_version 23347 (0.0015) [2024-03-21 03:18:45,235][04017] Updated weights for policy 0, policy_version 23357 (0.0010) [2024-03-21 03:18:45,521][03784] Fps is (10 sec: 65535.2, 60 sec: 49151.9, 300 sec: 46874.9). Total num frames: 765362176. Throughput: 0: 47108.7. Samples: 766487000. Policy #0 lag: (min: 1.0, avg: 53.2, max: 112.0) [2024-03-21 03:18:45,522][03784] Avg episode reward: [(0, '1.085')] [2024-03-21 03:18:50,486][04017] Updated weights for policy 0, policy_version 23367 (0.0013) [2024-03-21 03:18:50,521][03784] Fps is (10 sec: 68812.6, 60 sec: 49152.0, 300 sec: 47541.3). Total num frames: 765689856. Throughput: 0: 47228.8. Samples: 766764200. Policy #0 lag: (min: 1.0, avg: 53.2, max: 112.0) [2024-03-21 03:18:50,522][03784] Avg episode reward: [(0, '0.773')] [2024-03-21 03:18:50,694][03995] Signal inference workers to stop experience collection... (15400 times) [2024-03-21 03:18:50,763][03995] Signal inference workers to resume experience collection... 
(15400 times) [2024-03-21 03:18:50,823][04017] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-03-21 03:18:50,863][04017] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-03-21 03:18:55,521][03784] Fps is (10 sec: 52429.8, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 765886464. Throughput: 0: 47362.4. Samples: 767038700. Policy #0 lag: (min: 1.0, avg: 53.2, max: 112.0) [2024-03-21 03:18:55,522][03784] Avg episode reward: [(0, '0.675')] [2024-03-21 03:19:00,521][03784] Fps is (10 sec: 29491.7, 60 sec: 45875.2, 300 sec: 47319.2). Total num frames: 765984768. Throughput: 0: 47271.2. Samples: 767194700. Policy #0 lag: (min: 1.0, avg: 53.2, max: 112.0) [2024-03-21 03:19:00,521][03784] Avg episode reward: [(0, '1.268')] [2024-03-21 03:19:00,594][04017] Updated weights for policy 0, policy_version 23377 (0.0015) [2024-03-21 03:19:05,521][03784] Fps is (10 sec: 45875.2, 60 sec: 51336.6, 300 sec: 47208.1). Total num frames: 766345216. Throughput: 0: 47577.8. Samples: 767484500. Policy #0 lag: (min: 1.0, avg: 46.7, max: 114.0) [2024-03-21 03:19:05,522][03784] Avg episode reward: [(0, '0.980')] [2024-03-21 03:19:05,526][04017] Updated weights for policy 0, policy_version 23387 (0.0011) [2024-03-21 03:19:10,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48606.0, 300 sec: 47319.2). Total num frames: 766509056. Throughput: 0: 47673.4. Samples: 767793200. Policy #0 lag: (min: 1.0, avg: 46.7, max: 114.0) [2024-03-21 03:19:10,522][03784] Avg episode reward: [(0, '0.981')] [2024-03-21 03:19:15,521][03784] Fps is (10 sec: 29491.0, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 766640128. Throughput: 0: 48431.1. Samples: 767956200. 
Policy #0 lag: (min: 1.0, avg: 46.7, max: 114.0) [2024-03-21 03:19:15,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 03:19:16,551][04017] Updated weights for policy 0, policy_version 23397 (0.0010) [2024-03-21 03:19:20,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 766771200. Throughput: 0: 48640.0. Samples: 768252200. Policy #0 lag: (min: 0.0, avg: 23.7, max: 89.0) [2024-03-21 03:19:20,522][03784] Avg episode reward: [(0, '1.414')] [2024-03-21 03:19:24,618][04017] Updated weights for policy 0, policy_version 23407 (0.0016) [2024-03-21 03:19:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 767066112. Throughput: 0: 48286.7. Samples: 768523800. Policy #0 lag: (min: 0.0, avg: 23.7, max: 89.0) [2024-03-21 03:19:25,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 03:19:30,102][04017] Updated weights for policy 0, policy_version 23417 (0.0012) [2024-03-21 03:19:30,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 767361024. Throughput: 0: 47975.7. Samples: 768645900. Policy #0 lag: (min: 0.0, avg: 23.7, max: 89.0) [2024-03-21 03:19:30,522][03784] Avg episode reward: [(0, '0.950')] [2024-03-21 03:19:35,521][03784] Fps is (10 sec: 42598.8, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 767492096. Throughput: 0: 47853.5. Samples: 768917600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 03:19:35,522][03784] Avg episode reward: [(0, '1.433')] [2024-03-21 03:19:37,054][04017] Updated weights for policy 0, policy_version 23427 (0.0019) [2024-03-21 03:19:40,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48606.0, 300 sec: 46986.0). Total num frames: 767918080. Throughput: 0: 47602.2. Samples: 769180800. 
Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 03:19:40,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-21 03:19:41,113][04017] Updated weights for policy 0, policy_version 23437 (0.0010) [2024-03-21 03:19:42,537][03995] Signal inference workers to stop experience collection... (15450 times) [2024-03-21 03:19:42,595][04017] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-03-21 03:19:42,770][03995] Signal inference workers to resume experience collection... (15450 times) [2024-03-21 03:19:42,770][04017] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-03-21 03:19:45,521][03784] Fps is (10 sec: 78642.8, 60 sec: 48606.0, 300 sec: 47541.4). Total num frames: 768278528. Throughput: 0: 46953.3. Samples: 769307600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 03:19:45,522][03784] Avg episode reward: [(0, '0.727')] [2024-03-21 03:19:48,236][04017] Updated weights for policy 0, policy_version 23447 (0.0013) [2024-03-21 03:19:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 768442368. Throughput: 0: 47206.6. Samples: 769608800. Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 03:19:50,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 03:19:54,174][04017] Updated weights for policy 0, policy_version 23457 (0.0011) [2024-03-21 03:19:55,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.4, 300 sec: 47208.2). Total num frames: 768704512. Throughput: 0: 46535.5. Samples: 769887300. Policy #0 lag: (min: 0.0, avg: 47.8, max: 117.0) [2024-03-21 03:19:55,522][03784] Avg episode reward: [(0, '1.295')] [2024-03-21 03:20:00,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 768802816. Throughput: 0: 46248.9. Samples: 770037400. 
Policy #0 lag: (min: 0.0, avg: 47.8, max: 117.0) [2024-03-21 03:20:00,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 03:20:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023462_768802816.pth... [2024-03-21 03:20:00,697][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023118_757530624.pth [2024-03-21 03:20:05,124][04017] Updated weights for policy 0, policy_version 23467 (0.0016) [2024-03-21 03:20:05,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44236.7, 300 sec: 46652.7). Total num frames: 768999424. Throughput: 0: 46593.3. Samples: 770348900. Policy #0 lag: (min: 0.0, avg: 47.8, max: 117.0) [2024-03-21 03:20:05,522][03784] Avg episode reward: [(0, '0.881')] [2024-03-21 03:20:10,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45875.1, 300 sec: 46652.7). Total num frames: 769261568. Throughput: 0: 46980.0. Samples: 770637900. Policy #0 lag: (min: 7.0, avg: 33.3, max: 75.0) [2024-03-21 03:20:10,522][03784] Avg episode reward: [(0, '0.647')] [2024-03-21 03:20:11,127][04017] Updated weights for policy 0, policy_version 23477 (0.0013) [2024-03-21 03:20:15,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 769523712. Throughput: 0: 47035.6. Samples: 770762500. Policy #0 lag: (min: 7.0, avg: 33.3, max: 75.0) [2024-03-21 03:20:15,522][03784] Avg episode reward: [(0, '0.647')] [2024-03-21 03:20:20,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 769589248. Throughput: 0: 47622.2. Samples: 771060600. Policy #0 lag: (min: 7.0, avg: 33.3, max: 75.0) [2024-03-21 03:20:20,522][03784] Avg episode reward: [(0, '1.420')] [2024-03-21 03:20:20,808][04017] Updated weights for policy 0, policy_version 23487 (0.0021) [2024-03-21 03:20:25,521][03784] Fps is (10 sec: 39320.9, 60 sec: 47513.5, 300 sec: 46319.5). Total num frames: 769916928. Throughput: 0: 47555.4. Samples: 771320800. 
Policy #0 lag: (min: 1.0, avg: 53.5, max: 111.0) [2024-03-21 03:20:25,522][03784] Avg episode reward: [(0, '0.885')] [2024-03-21 03:20:25,751][04017] Updated weights for policy 0, policy_version 23497 (0.0013) [2024-03-21 03:20:30,521][03784] Fps is (10 sec: 58981.7, 60 sec: 46967.4, 300 sec: 46541.7). Total num frames: 770179072. Throughput: 0: 47742.1. Samples: 771456000. Policy #0 lag: (min: 1.0, avg: 53.5, max: 111.0) [2024-03-21 03:20:30,522][03784] Avg episode reward: [(0, '1.268')] [2024-03-21 03:20:32,854][04017] Updated weights for policy 0, policy_version 23507 (0.0011) [2024-03-21 03:20:34,617][03995] Signal inference workers to stop experience collection... (15500 times) [2024-03-21 03:20:34,698][04017] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-03-21 03:20:34,848][03995] Signal inference workers to resume experience collection... (15500 times) [2024-03-21 03:20:34,848][04017] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-03-21 03:20:35,521][03784] Fps is (10 sec: 58982.6, 60 sec: 50244.1, 300 sec: 47208.1). Total num frames: 770506752. Throughput: 0: 47633.2. Samples: 771752300. Policy #0 lag: (min: 1.0, avg: 53.5, max: 111.0) [2024-03-21 03:20:35,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 03:20:36,427][04017] Updated weights for policy 0, policy_version 23517 (0.0020) [2024-03-21 03:20:40,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 770736128. Throughput: 0: 47828.7. Samples: 772039600. Policy #0 lag: (min: 0.0, avg: 45.8, max: 96.0) [2024-03-21 03:20:40,522][03784] Avg episode reward: [(0, '0.745')] [2024-03-21 03:20:44,874][04017] Updated weights for policy 0, policy_version 23527 (0.0011) [2024-03-21 03:20:45,521][03784] Fps is (10 sec: 45875.7, 60 sec: 44782.9, 300 sec: 47209.1). Total num frames: 770965504. Throughput: 0: 47831.1. Samples: 772189800. 
Policy #0 lag: (min: 0.0, avg: 45.8, max: 96.0) [2024-03-21 03:20:45,522][03784] Avg episode reward: [(0, '0.959')] [2024-03-21 03:20:49,521][04017] Updated weights for policy 0, policy_version 23537 (0.0011) [2024-03-21 03:20:50,521][03784] Fps is (10 sec: 62259.6, 60 sec: 48605.8, 300 sec: 47763.5). Total num frames: 771358720. Throughput: 0: 47062.2. Samples: 772466700. Policy #0 lag: (min: 0.0, avg: 45.8, max: 96.0) [2024-03-21 03:20:50,522][03784] Avg episode reward: [(0, '1.243')] [2024-03-21 03:20:55,521][03784] Fps is (10 sec: 55705.7, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 771522560. Throughput: 0: 47371.2. Samples: 772769600. Policy #0 lag: (min: 0.0, avg: 45.8, max: 96.0) [2024-03-21 03:20:55,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 03:20:56,735][04017] Updated weights for policy 0, policy_version 23547 (0.0015) [2024-03-21 03:21:00,521][03784] Fps is (10 sec: 45874.8, 60 sec: 50244.1, 300 sec: 47430.3). Total num frames: 771817472. Throughput: 0: 47642.1. Samples: 772906400. Policy #0 lag: (min: 0.0, avg: 50.1, max: 95.0) [2024-03-21 03:21:00,522][03784] Avg episode reward: [(0, '1.143')] [2024-03-21 03:21:01,909][04017] Updated weights for policy 0, policy_version 23557 (0.0011) [2024-03-21 03:21:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 49698.2, 300 sec: 47097.0). Total num frames: 771981312. Throughput: 0: 47457.7. Samples: 773196200. Policy #0 lag: (min: 0.0, avg: 50.1, max: 95.0) [2024-03-21 03:21:05,522][03784] Avg episode reward: [(0, '0.921')] [2024-03-21 03:21:10,521][03784] Fps is (10 sec: 26214.6, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 772079616. Throughput: 0: 47489.0. Samples: 773457800. Policy #0 lag: (min: 0.0, avg: 50.1, max: 95.0) [2024-03-21 03:21:10,522][03784] Avg episode reward: [(0, '0.829')] [2024-03-21 03:21:13,674][04017] Updated weights for policy 0, policy_version 23567 (0.0017) [2024-03-21 03:21:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.1, 300 sec: 46097.4). 
Total num frames: 772276224. Throughput: 0: 47062.3. Samples: 773573800. Policy #0 lag: (min: 1.0, avg: 54.5, max: 110.0) [2024-03-21 03:21:15,522][03784] Avg episode reward: [(0, '0.611')] [2024-03-21 03:21:20,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.6, 300 sec: 46097.3). Total num frames: 772440064. Throughput: 0: 47106.8. Samples: 773872100. Policy #0 lag: (min: 1.0, avg: 54.5, max: 110.0) [2024-03-21 03:21:20,522][03784] Avg episode reward: [(0, '1.336')] [2024-03-21 03:21:22,729][04017] Updated weights for policy 0, policy_version 23577 (0.0011) [2024-03-21 03:21:25,165][03995] Signal inference workers to stop experience collection... (15550 times) [2024-03-21 03:21:25,235][03995] Signal inference workers to resume experience collection... (15550 times) [2024-03-21 03:21:25,249][04017] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-03-21 03:21:25,302][04017] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-03-21 03:21:25,521][03784] Fps is (10 sec: 55706.0, 60 sec: 48606.0, 300 sec: 46652.8). Total num frames: 772833280. Throughput: 0: 46411.3. Samples: 774128100. Policy #0 lag: (min: 1.0, avg: 54.5, max: 110.0) [2024-03-21 03:21:25,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 03:21:25,928][04017] Updated weights for policy 0, policy_version 23587 (0.0015) [2024-03-21 03:21:30,521][03784] Fps is (10 sec: 62259.5, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 773062656. Throughput: 0: 46166.7. Samples: 774267300. Policy #0 lag: (min: 0.0, avg: 41.9, max: 100.0) [2024-03-21 03:21:30,522][03784] Avg episode reward: [(0, '0.534')] [2024-03-21 03:21:32,980][04017] Updated weights for policy 0, policy_version 23597 (0.0020) [2024-03-21 03:21:35,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.6, 300 sec: 47430.3). Total num frames: 773324800. Throughput: 0: 45889.0. Samples: 774531700. 
Policy #0 lag: (min: 0.0, avg: 41.9, max: 100.0) [2024-03-21 03:21:35,522][03784] Avg episode reward: [(0, '0.665')] [2024-03-21 03:21:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 47208.1). Total num frames: 773423104. Throughput: 0: 45924.4. Samples: 774836200. Policy #0 lag: (min: 0.0, avg: 41.9, max: 100.0) [2024-03-21 03:21:40,522][03784] Avg episode reward: [(0, '1.346')] [2024-03-21 03:21:42,711][04017] Updated weights for policy 0, policy_version 23607 (0.0028) [2024-03-21 03:21:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 47319.2). Total num frames: 773718016. Throughput: 0: 46082.4. Samples: 774980100. Policy #0 lag: (min: 0.0, avg: 41.9, max: 100.0) [2024-03-21 03:21:45,522][03784] Avg episode reward: [(0, '0.927')] [2024-03-21 03:21:47,142][04017] Updated weights for policy 0, policy_version 23617 (0.0020) [2024-03-21 03:21:50,521][03784] Fps is (10 sec: 68812.6, 60 sec: 45875.2, 300 sec: 48096.8). Total num frames: 774111232. Throughput: 0: 45642.2. Samples: 775250100. Policy #0 lag: (min: 1.0, avg: 45.8, max: 116.0) [2024-03-21 03:21:50,522][03784] Avg episode reward: [(0, '0.645')] [2024-03-21 03:21:53,879][04017] Updated weights for policy 0, policy_version 23627 (0.0010) [2024-03-21 03:21:55,521][03784] Fps is (10 sec: 55705.6, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 774275072. Throughput: 0: 46373.4. Samples: 775544600. Policy #0 lag: (min: 1.0, avg: 45.8, max: 116.0) [2024-03-21 03:21:55,522][03784] Avg episode reward: [(0, '0.777')] [2024-03-21 03:22:00,521][03784] Fps is (10 sec: 29491.0, 60 sec: 43144.6, 300 sec: 47097.1). Total num frames: 774406144. Throughput: 0: 47222.2. Samples: 775698800. Policy #0 lag: (min: 1.0, avg: 45.8, max: 116.0) [2024-03-21 03:22:00,522][03784] Avg episode reward: [(0, '0.777')] [2024-03-21 03:22:00,780][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023634_774438912.pth... 
[2024-03-21 03:22:00,920][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023292_763232256.pth [2024-03-21 03:22:01,995][04017] Updated weights for policy 0, policy_version 23637 (0.0017) [2024-03-21 03:22:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 774635520. Throughput: 0: 46553.4. Samples: 775967000. Policy #0 lag: (min: 1.0, avg: 36.0, max: 75.0) [2024-03-21 03:22:05,522][03784] Avg episode reward: [(0, '0.505')] [2024-03-21 03:22:10,402][04017] Updated weights for policy 0, policy_version 23647 (0.0011) [2024-03-21 03:22:10,521][03784] Fps is (10 sec: 45875.8, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 774864896. Throughput: 0: 47377.8. Samples: 776260100. Policy #0 lag: (min: 1.0, avg: 36.0, max: 75.0) [2024-03-21 03:22:10,522][03784] Avg episode reward: [(0, '1.215')] [2024-03-21 03:22:13,698][03995] Signal inference workers to stop experience collection... (15600 times) [2024-03-21 03:22:13,747][04017] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-03-21 03:22:13,771][03995] Signal inference workers to resume experience collection... (15600 times) [2024-03-21 03:22:13,799][04017] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-03-21 03:22:13,802][04017] Updated weights for policy 0, policy_version 23657 (0.0017) [2024-03-21 03:22:15,521][03784] Fps is (10 sec: 68813.3, 60 sec: 50790.5, 300 sec: 47319.2). Total num frames: 775323648. Throughput: 0: 47329.0. Samples: 776397100. Policy #0 lag: (min: 1.0, avg: 36.0, max: 75.0) [2024-03-21 03:22:15,521][03784] Avg episode reward: [(0, '0.816')] [2024-03-21 03:22:17,204][04017] Updated weights for policy 0, policy_version 23667 (0.0011) [2024-03-21 03:22:20,521][03784] Fps is (10 sec: 75366.3, 60 sec: 52975.0, 300 sec: 47541.4). Total num frames: 775618560. Throughput: 0: 47437.8. Samples: 776666400. 
Policy #0 lag: (min: 0.0, avg: 42.6, max: 85.0) [2024-03-21 03:22:20,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 03:22:25,521][03784] Fps is (10 sec: 39320.9, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 775716864. Throughput: 0: 47231.0. Samples: 776961600. Policy #0 lag: (min: 0.0, avg: 42.6, max: 85.0) [2024-03-21 03:22:25,522][03784] Avg episode reward: [(0, '0.712')] [2024-03-21 03:22:30,309][04017] Updated weights for policy 0, policy_version 23677 (0.0014) [2024-03-21 03:22:30,521][03784] Fps is (10 sec: 22937.5, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 775847936. Throughput: 0: 47400.0. Samples: 777113100. Policy #0 lag: (min: 0.0, avg: 42.6, max: 85.0) [2024-03-21 03:22:30,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 03:22:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 776142848. Throughput: 0: 47097.7. Samples: 777369500. Policy #0 lag: (min: 1.0, avg: 39.4, max: 80.0) [2024-03-21 03:22:35,522][03784] Avg episode reward: [(0, '1.517')] [2024-03-21 03:22:35,623][04017] Updated weights for policy 0, policy_version 23687 (0.0010) [2024-03-21 03:22:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 776372224. Throughput: 0: 46931.1. Samples: 777656500. Policy #0 lag: (min: 1.0, avg: 39.4, max: 80.0) [2024-03-21 03:22:40,522][03784] Avg episode reward: [(0, '1.619')] [2024-03-21 03:22:45,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 776470528. Throughput: 0: 46553.4. Samples: 777793700. Policy #0 lag: (min: 1.0, avg: 39.4, max: 80.0) [2024-03-21 03:22:45,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 03:22:45,998][04017] Updated weights for policy 0, policy_version 23697 (0.0011) [2024-03-21 03:22:50,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 46874.9). Total num frames: 776798208. Throughput: 0: 46513.3. Samples: 778060100. 
Policy #0 lag: (min: 1.0, avg: 39.4, max: 80.0) [2024-03-21 03:22:50,522][03784] Avg episode reward: [(0, '0.475')] [2024-03-21 03:22:51,557][04017] Updated weights for policy 0, policy_version 23707 (0.0018) [2024-03-21 03:22:55,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44782.9, 300 sec: 46541.7). Total num frames: 776962048. Throughput: 0: 46242.2. Samples: 778341000. Policy #0 lag: (min: 0.0, avg: 52.2, max: 111.0) [2024-03-21 03:22:55,522][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 03:23:00,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 777060352. Throughput: 0: 46326.5. Samples: 778481800. Policy #0 lag: (min: 0.0, avg: 52.2, max: 111.0) [2024-03-21 03:23:00,522][03784] Avg episode reward: [(0, '0.809')] [2024-03-21 03:23:02,588][04017] Updated weights for policy 0, policy_version 23717 (0.0011) [2024-03-21 03:23:05,521][03784] Fps is (10 sec: 36045.3, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 777322496. Throughput: 0: 46537.9. Samples: 778760600. Policy #0 lag: (min: 0.0, avg: 52.2, max: 111.0) [2024-03-21 03:23:05,522][03784] Avg episode reward: [(0, '1.093')] [2024-03-21 03:23:06,890][04017] Updated weights for policy 0, policy_version 23727 (0.0011) [2024-03-21 03:23:10,521][03784] Fps is (10 sec: 65536.7, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 777715712. Throughput: 0: 45760.1. Samples: 779020800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 113.0) [2024-03-21 03:23:10,522][03784] Avg episode reward: [(0, '0.690')] [2024-03-21 03:23:13,862][04017] Updated weights for policy 0, policy_version 23737 (0.0010) [2024-03-21 03:23:15,521][03784] Fps is (10 sec: 58981.4, 60 sec: 43144.4, 300 sec: 46874.9). Total num frames: 777912320. Throughput: 0: 46015.5. Samples: 779183800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 113.0) [2024-03-21 03:23:15,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 03:23:15,589][03995] Signal inference workers to stop experience collection... 
(15650 times) [2024-03-21 03:23:15,655][03995] Signal inference workers to resume experience collection... (15650 times) [2024-03-21 03:23:15,667][04017] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-03-21 03:23:15,709][04017] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-03-21 03:23:17,631][04017] Updated weights for policy 0, policy_version 23747 (0.0014) [2024-03-21 03:23:20,521][03784] Fps is (10 sec: 62258.7, 60 sec: 45329.0, 300 sec: 47874.6). Total num frames: 778338304. Throughput: 0: 45775.5. Samples: 779429400. Policy #0 lag: (min: 0.0, avg: 37.8, max: 113.0) [2024-03-21 03:23:20,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 03:23:25,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 47430.3). Total num frames: 778436608. Throughput: 0: 46508.9. Samples: 779749400. Policy #0 lag: (min: 0.0, avg: 37.8, max: 113.0) [2024-03-21 03:23:25,522][03784] Avg episode reward: [(0, '1.042')] [2024-03-21 03:23:25,912][04017] Updated weights for policy 0, policy_version 23757 (0.0011) [2024-03-21 03:23:30,521][03784] Fps is (10 sec: 36045.1, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 778698752. Throughput: 0: 46811.1. Samples: 779900200. Policy #0 lag: (min: 0.0, avg: 37.7, max: 71.0) [2024-03-21 03:23:30,522][03784] Avg episode reward: [(0, '1.498')] [2024-03-21 03:23:31,378][04017] Updated weights for policy 0, policy_version 23767 (0.0016) [2024-03-21 03:23:35,521][03784] Fps is (10 sec: 58982.5, 60 sec: 48059.8, 300 sec: 47541.4). Total num frames: 779026432. Throughput: 0: 47393.4. Samples: 780192800. Policy #0 lag: (min: 0.0, avg: 37.7, max: 71.0) [2024-03-21 03:23:35,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 03:23:36,326][04017] Updated weights for policy 0, policy_version 23777 (0.0022) [2024-03-21 03:23:40,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48605.9, 300 sec: 47208.2). Total num frames: 779288576. Throughput: 0: 47086.7. Samples: 780459900. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 71.0) [2024-03-21 03:23:40,522][03784] Avg episode reward: [(0, '1.450')] [2024-03-21 03:23:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48605.8, 300 sec: 46430.6). Total num frames: 779386880. Throughput: 0: 47340.0. Samples: 780612100. Policy #0 lag: (min: 0.0, avg: 60.5, max: 113.0) [2024-03-21 03:23:45,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 03:23:46,268][04017] Updated weights for policy 0, policy_version 23787 (0.0010) [2024-03-21 03:23:50,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 779616256. Throughput: 0: 47964.3. Samples: 780919000. Policy #0 lag: (min: 0.0, avg: 60.5, max: 113.0) [2024-03-21 03:23:50,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 03:23:54,269][04017] Updated weights for policy 0, policy_version 23797 (0.0014) [2024-03-21 03:23:55,521][03784] Fps is (10 sec: 42599.2, 60 sec: 47513.7, 300 sec: 46874.9). Total num frames: 779812864. Throughput: 0: 48920.1. Samples: 781222200. Policy #0 lag: (min: 0.0, avg: 60.5, max: 113.0) [2024-03-21 03:23:55,521][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 03:24:00,024][04017] Updated weights for policy 0, policy_version 23807 (0.0010) [2024-03-21 03:24:00,521][03784] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 46652.7). Total num frames: 780107776. Throughput: 0: 48084.5. Samples: 781347600. Policy #0 lag: (min: 3.0, avg: 29.3, max: 52.0) [2024-03-21 03:24:00,522][03784] Avg episode reward: [(0, '1.579')] [2024-03-21 03:24:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023807_780107776.pth... [2024-03-21 03:24:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023462_768802816.pth [2024-03-21 03:24:05,521][03784] Fps is (10 sec: 52428.2, 60 sec: 50244.2, 300 sec: 46874.9). Total num frames: 780337152. Throughput: 0: 49049.0. Samples: 781636600. 
Policy #0 lag: (min: 3.0, avg: 29.3, max: 52.0) [2024-03-21 03:24:05,522][03784] Avg episode reward: [(0, '0.418')] [2024-03-21 03:24:08,626][04017] Updated weights for policy 0, policy_version 23817 (0.0009) [2024-03-21 03:24:08,643][03995] Signal inference workers to stop experience collection... (15700 times) [2024-03-21 03:24:08,644][03995] Signal inference workers to resume experience collection... (15700 times) [2024-03-21 03:24:08,699][04017] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-03-21 03:24:08,699][04017] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-03-21 03:24:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 780566528. Throughput: 0: 47900.1. Samples: 781904900. Policy #0 lag: (min: 3.0, avg: 29.3, max: 52.0) [2024-03-21 03:24:10,522][03784] Avg episode reward: [(0, '1.170')] [2024-03-21 03:24:12,691][04017] Updated weights for policy 0, policy_version 23827 (0.0022) [2024-03-21 03:24:15,521][03784] Fps is (10 sec: 55705.6, 60 sec: 49698.2, 300 sec: 47874.6). Total num frames: 780894208. Throughput: 0: 47535.5. Samples: 782039300. Policy #0 lag: (min: 0.0, avg: 49.5, max: 94.0) [2024-03-21 03:24:15,522][03784] Avg episode reward: [(0, '1.222')] [2024-03-21 03:24:18,728][04017] Updated weights for policy 0, policy_version 23837 (0.0014) [2024-03-21 03:24:20,521][03784] Fps is (10 sec: 58981.9, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 781156352. Throughput: 0: 47060.0. Samples: 782310500. Policy #0 lag: (min: 0.0, avg: 49.5, max: 94.0) [2024-03-21 03:24:20,522][03784] Avg episode reward: [(0, '1.218')] [2024-03-21 03:24:25,521][03784] Fps is (10 sec: 49151.3, 60 sec: 49151.9, 300 sec: 47541.4). Total num frames: 781385728. Throughput: 0: 47164.3. Samples: 782582300. 
Policy #0 lag: (min: 0.0, avg: 49.5, max: 94.0) [2024-03-21 03:24:25,522][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 03:24:25,900][04017] Updated weights for policy 0, policy_version 23847 (0.0011) [2024-03-21 03:24:30,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.1, 300 sec: 47319.2). Total num frames: 781451264. Throughput: 0: 47117.8. Samples: 782732400. Policy #0 lag: (min: 0.0, avg: 49.5, max: 94.0) [2024-03-21 03:24:30,522][03784] Avg episode reward: [(0, '1.379')] [2024-03-21 03:24:35,143][04017] Updated weights for policy 0, policy_version 23857 (0.0011) [2024-03-21 03:24:35,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45875.1, 300 sec: 46986.0). Total num frames: 781778944. Throughput: 0: 46808.7. Samples: 783025400. Policy #0 lag: (min: 0.0, avg: 37.8, max: 84.0) [2024-03-21 03:24:35,522][03784] Avg episode reward: [(0, '0.890')] [2024-03-21 03:24:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 781975552. Throughput: 0: 46031.0. Samples: 783293600. Policy #0 lag: (min: 0.0, avg: 37.8, max: 84.0) [2024-03-21 03:24:40,522][03784] Avg episode reward: [(0, '1.525')] [2024-03-21 03:24:42,346][04017] Updated weights for policy 0, policy_version 23867 (0.0016) [2024-03-21 03:24:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 782237696. Throughput: 0: 46659.9. Samples: 783447300. Policy #0 lag: (min: 0.0, avg: 37.8, max: 84.0) [2024-03-21 03:24:45,522][03784] Avg episode reward: [(0, '0.519')] [2024-03-21 03:24:49,976][04017] Updated weights for policy 0, policy_version 23877 (0.0011) [2024-03-21 03:24:50,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 782434304. Throughput: 0: 46875.6. Samples: 783746000. 
Policy #0 lag: (min: 0.0, avg: 46.0, max: 114.0) [2024-03-21 03:24:50,521][03784] Avg episode reward: [(0, '0.519')] [2024-03-21 03:24:54,178][04017] Updated weights for policy 0, policy_version 23887 (0.0012) [2024-03-21 03:24:55,521][03784] Fps is (10 sec: 55706.1, 60 sec: 49698.0, 300 sec: 47430.3). Total num frames: 782794752. Throughput: 0: 47202.2. Samples: 784029000. Policy #0 lag: (min: 0.0, avg: 46.0, max: 114.0) [2024-03-21 03:24:55,522][03784] Avg episode reward: [(0, '1.393')] [2024-03-21 03:24:59,117][03995] Signal inference workers to stop experience collection... (15750 times) [2024-03-21 03:24:59,192][03995] Signal inference workers to resume experience collection... (15750 times) [2024-03-21 03:24:59,239][04017] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-03-21 03:24:59,240][04017] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-03-21 03:25:00,521][03784] Fps is (10 sec: 52427.7, 60 sec: 47513.5, 300 sec: 47319.2). Total num frames: 782958592. Throughput: 0: 47770.9. Samples: 784189000. Policy #0 lag: (min: 0.0, avg: 46.0, max: 114.0) [2024-03-21 03:25:00,523][03784] Avg episode reward: [(0, '0.684')] [2024-03-21 03:25:02,516][04017] Updated weights for policy 0, policy_version 23897 (0.0011) [2024-03-21 03:25:05,521][03784] Fps is (10 sec: 39321.7, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 783187968. Throughput: 0: 47715.6. Samples: 784457700. Policy #0 lag: (min: 2.0, avg: 39.2, max: 74.0) [2024-03-21 03:25:05,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 03:25:08,680][04017] Updated weights for policy 0, policy_version 23907 (0.0017) [2024-03-21 03:25:10,521][03784] Fps is (10 sec: 49152.8, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 783450112. Throughput: 0: 48160.1. Samples: 784749500. 
Policy #0 lag: (min: 2.0, avg: 39.2, max: 74.0) [2024-03-21 03:25:10,522][03784] Avg episode reward: [(0, '1.100')] [2024-03-21 03:25:14,618][04017] Updated weights for policy 0, policy_version 23917 (0.0011) [2024-03-21 03:25:15,521][03784] Fps is (10 sec: 58981.8, 60 sec: 48059.7, 300 sec: 48096.7). Total num frames: 783777792. Throughput: 0: 47828.8. Samples: 784884700. Policy #0 lag: (min: 2.0, avg: 39.2, max: 74.0) [2024-03-21 03:25:15,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 03:25:20,521][03784] Fps is (10 sec: 52428.4, 60 sec: 46967.4, 300 sec: 47652.5). Total num frames: 783974400. Throughput: 0: 47595.6. Samples: 785167200. Policy #0 lag: (min: 2.0, avg: 39.2, max: 74.0) [2024-03-21 03:25:20,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 03:25:23,900][04017] Updated weights for policy 0, policy_version 23927 (0.0019) [2024-03-21 03:25:25,521][03784] Fps is (10 sec: 32768.3, 60 sec: 45329.2, 300 sec: 47208.2). Total num frames: 784105472. Throughput: 0: 48588.9. Samples: 785480100. Policy #0 lag: (min: 0.0, avg: 31.1, max: 71.0) [2024-03-21 03:25:25,522][03784] Avg episode reward: [(0, '1.454')] [2024-03-21 03:25:30,521][03784] Fps is (10 sec: 36045.2, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 784334848. Throughput: 0: 48249.0. Samples: 785618500. Policy #0 lag: (min: 0.0, avg: 31.1, max: 71.0) [2024-03-21 03:25:30,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 03:25:30,615][04017] Updated weights for policy 0, policy_version 23937 (0.0014) [2024-03-21 03:25:35,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48059.9, 300 sec: 47208.2). Total num frames: 784662528. Throughput: 0: 47757.8. Samples: 785895100. 
Policy #0 lag: (min: 0.0, avg: 31.1, max: 71.0) [2024-03-21 03:25:35,522][03784] Avg episode reward: [(0, '0.922')] [2024-03-21 03:25:35,562][04017] Updated weights for policy 0, policy_version 23947 (0.0018) [2024-03-21 03:25:40,369][04017] Updated weights for policy 0, policy_version 23957 (0.0025) [2024-03-21 03:25:40,521][03784] Fps is (10 sec: 68812.4, 60 sec: 50790.4, 300 sec: 47652.4). Total num frames: 785022976. Throughput: 0: 47353.3. Samples: 786159900. Policy #0 lag: (min: 1.0, avg: 37.6, max: 76.0) [2024-03-21 03:25:40,522][03784] Avg episode reward: [(0, '1.536')] [2024-03-21 03:25:45,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.1, 300 sec: 46874.9). Total num frames: 785186816. Throughput: 0: 47033.5. Samples: 786305500. Policy #0 lag: (min: 1.0, avg: 37.6, max: 76.0) [2024-03-21 03:25:45,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 03:25:49,509][04017] Updated weights for policy 0, policy_version 23967 (0.0010) [2024-03-21 03:25:50,521][03784] Fps is (10 sec: 32768.0, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 785350656. Throughput: 0: 47655.5. Samples: 786602200. Policy #0 lag: (min: 1.0, avg: 37.6, max: 76.0) [2024-03-21 03:25:50,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 03:25:52,313][03995] Signal inference workers to stop experience collection... (15800 times) [2024-03-21 03:25:52,314][03995] Signal inference workers to resume experience collection... (15800 times) [2024-03-21 03:25:52,381][04017] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-03-21 03:25:52,381][04017] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-03-21 03:25:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 785612800. Throughput: 0: 47420.0. Samples: 786883400. 
Policy #0 lag: (min: 1.0, avg: 37.6, max: 76.0) [2024-03-21 03:25:55,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 03:25:56,984][04017] Updated weights for policy 0, policy_version 23977 (0.0016) [2024-03-21 03:26:00,521][03784] Fps is (10 sec: 58982.2, 60 sec: 49698.2, 300 sec: 47319.2). Total num frames: 785940480. Throughput: 0: 47308.9. Samples: 787013600. Policy #0 lag: (min: 0.0, avg: 27.6, max: 65.0) [2024-03-21 03:26:00,522][03784] Avg episode reward: [(0, '0.684')] [2024-03-21 03:26:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023985_785940480.pth... [2024-03-21 03:26:00,647][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023634_774438912.pth [2024-03-21 03:26:03,001][04017] Updated weights for policy 0, policy_version 23987 (0.0017) [2024-03-21 03:26:05,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.5, 300 sec: 47208.2). Total num frames: 786006016. Throughput: 0: 46713.5. Samples: 787269300. Policy #0 lag: (min: 0.0, avg: 27.6, max: 65.0) [2024-03-21 03:26:05,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 03:26:10,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45875.1, 300 sec: 47208.1). Total num frames: 786202624. Throughput: 0: 46031.0. Samples: 787551500. Policy #0 lag: (min: 0.0, avg: 27.6, max: 65.0) [2024-03-21 03:26:10,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 03:26:13,789][04017] Updated weights for policy 0, policy_version 23997 (0.0016) [2024-03-21 03:26:15,521][03784] Fps is (10 sec: 49151.4, 60 sec: 45329.1, 300 sec: 47652.4). Total num frames: 786497536. Throughput: 0: 46224.4. Samples: 787698600. Policy #0 lag: (min: 0.0, avg: 31.0, max: 72.0) [2024-03-21 03:26:15,522][03784] Avg episode reward: [(0, '1.279')] [2024-03-21 03:26:20,521][03784] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 46763.8). Total num frames: 786628608. Throughput: 0: 46468.8. Samples: 787986200. 
Policy #0 lag: (min: 0.0, avg: 31.0, max: 72.0) [2024-03-21 03:26:20,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 03:26:22,829][04017] Updated weights for policy 0, policy_version 24007 (0.0016) [2024-03-21 03:26:25,521][03784] Fps is (10 sec: 32768.4, 60 sec: 45329.1, 300 sec: 46652.8). Total num frames: 786825216. Throughput: 0: 47409.0. Samples: 788293300. Policy #0 lag: (min: 0.0, avg: 31.0, max: 72.0) [2024-03-21 03:26:25,521][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 03:26:27,937][04017] Updated weights for policy 0, policy_version 24017 (0.0012) [2024-03-21 03:26:30,521][03784] Fps is (10 sec: 55705.9, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 787185664. Throughput: 0: 46926.7. Samples: 788417200. Policy #0 lag: (min: 0.0, avg: 50.1, max: 112.0) [2024-03-21 03:26:30,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 03:26:32,176][04017] Updated weights for policy 0, policy_version 24027 (0.0012) [2024-03-21 03:26:35,521][03784] Fps is (10 sec: 62259.0, 60 sec: 46421.3, 300 sec: 47541.4). Total num frames: 787447808. Throughput: 0: 46322.3. Samples: 788686700. Policy #0 lag: (min: 0.0, avg: 50.1, max: 112.0) [2024-03-21 03:26:35,522][03784] Avg episode reward: [(0, '1.408')] [2024-03-21 03:26:39,380][04017] Updated weights for policy 0, policy_version 24037 (0.0010) [2024-03-21 03:26:40,521][03784] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 47430.3). Total num frames: 787709952. Throughput: 0: 46548.8. Samples: 788978100. Policy #0 lag: (min: 0.0, avg: 50.1, max: 112.0) [2024-03-21 03:26:40,522][03784] Avg episode reward: [(0, '0.650')] [2024-03-21 03:26:42,573][03995] Signal inference workers to stop experience collection... (15850 times) [2024-03-21 03:26:42,579][03995] Signal inference workers to resume experience collection... 
(15850 times) [2024-03-21 03:26:42,667][04017] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-03-21 03:26:42,667][04017] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-03-21 03:26:45,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 787906560. Throughput: 0: 46904.6. Samples: 789124300. Policy #0 lag: (min: 0.0, avg: 50.1, max: 112.0) [2024-03-21 03:26:45,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 03:26:46,433][04017] Updated weights for policy 0, policy_version 24047 (0.0019) [2024-03-21 03:26:50,521][03784] Fps is (10 sec: 55706.1, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 788267008. Throughput: 0: 47762.2. Samples: 789418600. Policy #0 lag: (min: 1.0, avg: 40.8, max: 79.0) [2024-03-21 03:26:50,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 03:26:52,139][04017] Updated weights for policy 0, policy_version 24057 (0.0021) [2024-03-21 03:26:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46421.4, 300 sec: 47430.3). Total num frames: 788398080. Throughput: 0: 48013.5. Samples: 789712100. Policy #0 lag: (min: 1.0, avg: 40.8, max: 79.0) [2024-03-21 03:26:55,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 03:26:57,940][04017] Updated weights for policy 0, policy_version 24067 (0.0010) [2024-03-21 03:27:00,521][03784] Fps is (10 sec: 62258.5, 60 sec: 49152.0, 300 sec: 48318.9). Total num frames: 788889600. Throughput: 0: 47644.4. Samples: 789842600. Policy #0 lag: (min: 1.0, avg: 40.8, max: 79.0) [2024-03-21 03:27:00,522][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 03:27:03,409][04017] Updated weights for policy 0, policy_version 24077 (0.0011) [2024-03-21 03:27:05,521][03784] Fps is (10 sec: 62259.1, 60 sec: 50244.2, 300 sec: 47985.7). Total num frames: 789020672. Throughput: 0: 47317.8. Samples: 790115500. 
Policy #0 lag: (min: 0.0, avg: 53.1, max: 116.0) [2024-03-21 03:27:05,522][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 03:27:10,521][03784] Fps is (10 sec: 22937.7, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 789118976. Throughput: 0: 46824.4. Samples: 790400400. Policy #0 lag: (min: 0.0, avg: 53.1, max: 116.0) [2024-03-21 03:27:10,522][03784] Avg episode reward: [(0, '0.710')] [2024-03-21 03:27:14,485][04017] Updated weights for policy 0, policy_version 24087 (0.0014) [2024-03-21 03:27:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 789348352. Throughput: 0: 47395.5. Samples: 790550000. Policy #0 lag: (min: 0.0, avg: 53.1, max: 116.0) [2024-03-21 03:27:15,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 03:27:20,521][03784] Fps is (10 sec: 42598.1, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 789544960. Throughput: 0: 47317.7. Samples: 790816000. Policy #0 lag: (min: 0.0, avg: 53.1, max: 116.0) [2024-03-21 03:27:20,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 03:27:23,264][04017] Updated weights for policy 0, policy_version 24097 (0.0011) [2024-03-21 03:27:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48059.6, 300 sec: 46986.0). Total num frames: 789708800. Throughput: 0: 46968.9. Samples: 791091700. Policy #0 lag: (min: 0.0, avg: 36.2, max: 87.0) [2024-03-21 03:27:25,522][03784] Avg episode reward: [(0, '0.942')] [2024-03-21 03:27:30,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 789905408. Throughput: 0: 46784.4. Samples: 791229600. Policy #0 lag: (min: 0.0, avg: 36.2, max: 87.0) [2024-03-21 03:27:30,522][03784] Avg episode reward: [(0, '0.711')] [2024-03-21 03:27:31,197][04017] Updated weights for policy 0, policy_version 24107 (0.0015) [2024-03-21 03:27:35,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 790167552. Throughput: 0: 45971.0. Samples: 791487300. 
Policy #0 lag: (min: 0.0, avg: 36.2, max: 87.0) [2024-03-21 03:27:35,522][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 03:27:38,637][04017] Updated weights for policy 0, policy_version 24117 (0.0011) [2024-03-21 03:27:40,371][03995] Signal inference workers to stop experience collection... (15900 times) [2024-03-21 03:27:40,442][03995] Signal inference workers to resume experience collection... (15900 times) [2024-03-21 03:27:40,451][04017] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-03-21 03:27:40,506][04017] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-03-21 03:27:40,521][03784] Fps is (10 sec: 45874.7, 60 sec: 44236.7, 300 sec: 47097.0). Total num frames: 790364160. Throughput: 0: 45848.7. Samples: 791775300. Policy #0 lag: (min: 1.0, avg: 27.0, max: 64.0) [2024-03-21 03:27:40,522][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 03:27:42,739][04017] Updated weights for policy 0, policy_version 24127 (0.0019) [2024-03-21 03:27:45,521][03784] Fps is (10 sec: 58982.4, 60 sec: 47513.5, 300 sec: 47319.2). Total num frames: 790757376. Throughput: 0: 45762.2. Samples: 791901900. Policy #0 lag: (min: 1.0, avg: 27.0, max: 64.0) [2024-03-21 03:27:45,522][03784] Avg episode reward: [(0, '1.424')] [2024-03-21 03:27:47,769][04017] Updated weights for policy 0, policy_version 24137 (0.0011) [2024-03-21 03:27:50,521][03784] Fps is (10 sec: 68813.7, 60 sec: 46421.3, 300 sec: 47763.5). Total num frames: 791052288. Throughput: 0: 45866.6. Samples: 792179500. Policy #0 lag: (min: 1.0, avg: 27.0, max: 64.0) [2024-03-21 03:27:50,522][03784] Avg episode reward: [(0, '1.456')] [2024-03-21 03:27:55,261][04017] Updated weights for policy 0, policy_version 24147 (0.0011) [2024-03-21 03:27:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 47513.5, 300 sec: 48096.8). Total num frames: 791248896. Throughput: 0: 46215.5. Samples: 792480100. 
Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:27:55,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 03:28:00,521][03784] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 47985.6). Total num frames: 791478272. Throughput: 0: 46186.6. Samples: 792628400. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:28:00,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 03:28:00,723][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024155_791511040.pth... [2024-03-21 03:28:00,855][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023807_780107776.pth [2024-03-21 03:28:02,613][04017] Updated weights for policy 0, policy_version 24157 (0.0020) [2024-03-21 03:28:05,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42598.4, 300 sec: 46986.0). Total num frames: 791576576. Throughput: 0: 46797.8. Samples: 792921900. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:28:05,522][03784] Avg episode reward: [(0, '0.889')] [2024-03-21 03:28:10,188][04017] Updated weights for policy 0, policy_version 24167 (0.0016) [2024-03-21 03:28:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 791937024. Throughput: 0: 46755.5. Samples: 793195700. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:28:10,522][03784] Avg episode reward: [(0, '0.659')] [2024-03-21 03:28:15,521][03784] Fps is (10 sec: 52429.3, 60 sec: 45875.3, 300 sec: 46652.8). Total num frames: 792100864. Throughput: 0: 47084.5. Samples: 793348400. Policy #0 lag: (min: 0.0, avg: 53.9, max: 115.0) [2024-03-21 03:28:15,522][03784] Avg episode reward: [(0, '1.102')] [2024-03-21 03:28:16,708][04017] Updated weights for policy 0, policy_version 24177 (0.0016) [2024-03-21 03:28:20,521][03784] Fps is (10 sec: 45875.7, 60 sec: 47513.7, 300 sec: 47319.2). Total num frames: 792395776. Throughput: 0: 47611.2. Samples: 793629800. 
Policy #0 lag: (min: 0.0, avg: 53.9, max: 115.0) [2024-03-21 03:28:20,522][03784] Avg episode reward: [(0, '1.152')] [2024-03-21 03:28:25,449][04017] Updated weights for policy 0, policy_version 24187 (0.0012) [2024-03-21 03:28:25,521][03784] Fps is (10 sec: 45874.6, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 792559616. Throughput: 0: 47584.5. Samples: 793916600. Policy #0 lag: (min: 0.0, avg: 53.9, max: 115.0) [2024-03-21 03:28:25,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 03:28:28,336][03995] Signal inference workers to stop experience collection... (15950 times) [2024-03-21 03:28:28,336][03995] Signal inference workers to resume experience collection... (15950 times) [2024-03-21 03:28:28,413][04017] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-03-21 03:28:28,413][04017] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-03-21 03:28:30,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49152.0, 300 sec: 46874.9). Total num frames: 792854528. Throughput: 0: 47771.1. Samples: 794051600. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 03:28:30,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 03:28:30,807][04017] Updated weights for policy 0, policy_version 24197 (0.0013) [2024-03-21 03:28:35,521][03784] Fps is (10 sec: 52429.0, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 793083904. Throughput: 0: 47913.3. Samples: 794335600. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 03:28:35,522][03784] Avg episode reward: [(0, '1.540')] [2024-03-21 03:28:36,675][04017] Updated weights for policy 0, policy_version 24207 (0.0018) [2024-03-21 03:28:40,495][04017] Updated weights for policy 0, policy_version 24217 (0.0014) [2024-03-21 03:28:40,521][03784] Fps is (10 sec: 68812.9, 60 sec: 52975.0, 300 sec: 47985.7). Total num frames: 793542656. Throughput: 0: 46504.5. Samples: 794572800. 
Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 03:28:40,522][03784] Avg episode reward: [(0, '0.756')] [2024-03-21 03:28:45,521][03784] Fps is (10 sec: 58982.6, 60 sec: 48605.9, 300 sec: 47652.5). Total num frames: 793673728. Throughput: 0: 46575.7. Samples: 794724300. Policy #0 lag: (min: 1.0, avg: 46.3, max: 100.0) [2024-03-21 03:28:45,522][03784] Avg episode reward: [(0, '0.956')] [2024-03-21 03:28:50,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 793804800. Throughput: 0: 46486.7. Samples: 795013800. Policy #0 lag: (min: 1.0, avg: 46.3, max: 100.0) [2024-03-21 03:28:50,522][03784] Avg episode reward: [(0, '0.596')] [2024-03-21 03:28:54,696][04017] Updated weights for policy 0, policy_version 24227 (0.0010) [2024-03-21 03:28:55,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 793903104. Throughput: 0: 46871.2. Samples: 795304900. Policy #0 lag: (min: 1.0, avg: 46.3, max: 100.0) [2024-03-21 03:28:55,522][03784] Avg episode reward: [(0, '0.611')] [2024-03-21 03:29:00,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.7, 300 sec: 46652.7). Total num frames: 794099712. Throughput: 0: 46782.1. Samples: 795453600. Policy #0 lag: (min: 1.0, avg: 46.3, max: 100.0) [2024-03-21 03:29:00,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 03:29:02,100][04017] Updated weights for policy 0, policy_version 24237 (0.0019) [2024-03-21 03:29:05,521][03784] Fps is (10 sec: 49151.2, 60 sec: 46967.3, 300 sec: 46874.9). Total num frames: 794394624. Throughput: 0: 47033.1. Samples: 795746300. Policy #0 lag: (min: 0.0, avg: 30.1, max: 76.0) [2024-03-21 03:29:05,522][03784] Avg episode reward: [(0, '0.850')] [2024-03-21 03:29:06,818][04017] Updated weights for policy 0, policy_version 24247 (0.0011) [2024-03-21 03:29:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 794624000. Throughput: 0: 47251.1. Samples: 796042900. 
Policy #0 lag: (min: 0.0, avg: 30.1, max: 76.0) [2024-03-21 03:29:10,522][03784] Avg episode reward: [(0, '0.850')] [2024-03-21 03:29:13,913][04017] Updated weights for policy 0, policy_version 24257 (0.0010) [2024-03-21 03:29:15,423][03995] Signal inference workers to stop experience collection... (16000 times) [2024-03-21 03:29:15,488][04017] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-03-21 03:29:15,521][03784] Fps is (10 sec: 62259.9, 60 sec: 48605.7, 300 sec: 46986.0). Total num frames: 795017216. Throughput: 0: 47297.7. Samples: 796180000. Policy #0 lag: (min: 0.0, avg: 30.1, max: 76.0) [2024-03-21 03:29:15,522][03784] Avg episode reward: [(0, '1.351')] [2024-03-21 03:29:15,687][03995] Signal inference workers to resume experience collection... (16000 times) [2024-03-21 03:29:15,687][04017] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-03-21 03:29:17,785][04017] Updated weights for policy 0, policy_version 24267 (0.0017) [2024-03-21 03:29:20,521][03784] Fps is (10 sec: 65535.7, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 795279360. Throughput: 0: 47017.7. Samples: 796451400. Policy #0 lag: (min: 1.0, avg: 36.5, max: 69.0) [2024-03-21 03:29:20,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 03:29:23,146][04017] Updated weights for policy 0, policy_version 24277 (0.0015) [2024-03-21 03:29:25,521][03784] Fps is (10 sec: 62259.5, 60 sec: 51336.6, 300 sec: 48096.8). Total num frames: 795639808. Throughput: 0: 47902.2. Samples: 796728400. Policy #0 lag: (min: 1.0, avg: 36.5, max: 69.0) [2024-03-21 03:29:25,522][03784] Avg episode reward: [(0, '1.015')] [2024-03-21 03:29:30,521][03784] Fps is (10 sec: 45874.4, 60 sec: 48059.6, 300 sec: 47319.2). Total num frames: 795738112. Throughput: 0: 47799.7. Samples: 796875300. 
Policy #0 lag: (min: 1.0, avg: 36.5, max: 69.0) [2024-03-21 03:29:30,522][03784] Avg episode reward: [(0, '1.159')] [2024-03-21 03:29:31,487][04017] Updated weights for policy 0, policy_version 24287 (0.0016) [2024-03-21 03:29:35,521][03784] Fps is (10 sec: 32767.8, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 795967488. Throughput: 0: 47875.5. Samples: 797168200. Policy #0 lag: (min: 0.0, avg: 43.4, max: 86.0) [2024-03-21 03:29:35,522][03784] Avg episode reward: [(0, '0.746')] [2024-03-21 03:29:39,123][04017] Updated weights for policy 0, policy_version 24297 (0.0011) [2024-03-21 03:29:40,521][03784] Fps is (10 sec: 45876.1, 60 sec: 44236.8, 300 sec: 47319.2). Total num frames: 796196864. Throughput: 0: 47782.2. Samples: 797455100. Policy #0 lag: (min: 0.0, avg: 43.4, max: 86.0) [2024-03-21 03:29:40,522][03784] Avg episode reward: [(0, '1.282')] [2024-03-21 03:29:45,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43690.6, 300 sec: 46986.0). Total num frames: 796295168. Throughput: 0: 47775.5. Samples: 797603500. Policy #0 lag: (min: 0.0, avg: 43.4, max: 86.0) [2024-03-21 03:29:45,522][03784] Avg episode reward: [(0, '1.470')] [2024-03-21 03:29:50,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.7, 300 sec: 46319.5). Total num frames: 796459008. Throughput: 0: 48209.0. Samples: 797915700. Policy #0 lag: (min: 0.0, avg: 43.4, max: 86.0) [2024-03-21 03:29:50,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 03:29:51,199][04017] Updated weights for policy 0, policy_version 24307 (0.0013) [2024-03-21 03:29:55,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45875.1, 300 sec: 46430.6). Total num frames: 796655616. Throughput: 0: 48368.8. Samples: 798219500. Policy #0 lag: (min: 0.0, avg: 32.3, max: 75.0) [2024-03-21 03:29:55,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 03:29:56,972][04017] Updated weights for policy 0, policy_version 24317 (0.0011) [2024-03-21 03:30:00,521][03784] Fps is (10 sec: 55705.2, 60 sec: 48605.8, 300 sec: 46874.9). 
Total num frames: 797016064. Throughput: 0: 48175.5. Samples: 798347900. Policy #0 lag: (min: 0.0, avg: 32.3, max: 75.0) [2024-03-21 03:30:00,522][03784] Avg episode reward: [(0, '1.486')] [2024-03-21 03:30:00,914][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024324_797048832.pth... [2024-03-21 03:30:00,970][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000023985_785940480.pth [2024-03-21 03:30:01,855][04017] Updated weights for policy 0, policy_version 24327 (0.0017) [2024-03-21 03:30:05,521][03784] Fps is (10 sec: 75367.0, 60 sec: 50244.4, 300 sec: 47319.2). Total num frames: 797409280. Throughput: 0: 47742.2. Samples: 798599800. Policy #0 lag: (min: 0.0, avg: 32.3, max: 75.0) [2024-03-21 03:30:05,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 03:30:06,124][04017] Updated weights for policy 0, policy_version 24337 (0.0011) [2024-03-21 03:30:06,165][03995] Signal inference workers to stop experience collection... (16050 times) [2024-03-21 03:30:06,246][04017] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-03-21 03:30:06,380][03995] Signal inference workers to resume experience collection... (16050 times) [2024-03-21 03:30:06,380][04017] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-03-21 03:30:10,288][04017] Updated weights for policy 0, policy_version 24347 (0.0021) [2024-03-21 03:30:10,521][03784] Fps is (10 sec: 78643.6, 60 sec: 52974.9, 300 sec: 47541.4). Total num frames: 797802496. Throughput: 0: 47246.6. Samples: 798854500. Policy #0 lag: (min: 3.0, avg: 41.7, max: 74.0) [2024-03-21 03:30:10,522][03784] Avg episode reward: [(0, '0.977')] [2024-03-21 03:30:15,521][03784] Fps is (10 sec: 58982.7, 60 sec: 49698.2, 300 sec: 47541.4). Total num frames: 797999104. Throughput: 0: 47238.0. Samples: 799001000. 
Policy #0 lag: (min: 3.0, avg: 41.7, max: 74.0) [2024-03-21 03:30:15,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 03:30:20,311][04017] Updated weights for policy 0, policy_version 24357 (0.0011) [2024-03-21 03:30:20,521][03784] Fps is (10 sec: 32768.2, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 798130176. Throughput: 0: 47484.5. Samples: 799305000. Policy #0 lag: (min: 3.0, avg: 41.7, max: 74.0) [2024-03-21 03:30:20,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 03:30:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.2, 300 sec: 47652.5). Total num frames: 798392320. Throughput: 0: 46949.0. Samples: 799567800. Policy #0 lag: (min: 0.0, avg: 42.8, max: 82.0) [2024-03-21 03:30:25,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 03:30:27,245][04017] Updated weights for policy 0, policy_version 24367 (0.0011) [2024-03-21 03:30:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.6, 300 sec: 47097.0). Total num frames: 798556160. Throughput: 0: 46733.3. Samples: 799706500. Policy #0 lag: (min: 0.0, avg: 42.8, max: 82.0) [2024-03-21 03:30:30,522][03784] Avg episode reward: [(0, '1.053')] [2024-03-21 03:30:35,521][03784] Fps is (10 sec: 19660.7, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 798588928. Throughput: 0: 45897.8. Samples: 799981100. Policy #0 lag: (min: 0.0, avg: 42.8, max: 82.0) [2024-03-21 03:30:35,522][03784] Avg episode reward: [(0, '1.222')] [2024-03-21 03:30:38,793][04017] Updated weights for policy 0, policy_version 24377 (0.0016) [2024-03-21 03:30:40,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 798818304. Throughput: 0: 44929.0. Samples: 800241300. Policy #0 lag: (min: 0.0, avg: 42.8, max: 82.0) [2024-03-21 03:30:40,522][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 03:30:44,996][04017] Updated weights for policy 0, policy_version 24387 (0.0020) [2024-03-21 03:30:45,521][03784] Fps is (10 sec: 55705.5, 60 sec: 47513.6, 300 sec: 46763.8). 
Total num frames: 799145984. Throughput: 0: 45164.6. Samples: 800380300. Policy #0 lag: (min: 0.0, avg: 39.3, max: 89.0) [2024-03-21 03:30:45,523][03784] Avg episode reward: [(0, '0.627')] [2024-03-21 03:30:50,495][04017] Updated weights for policy 0, policy_version 24397 (0.0019) [2024-03-21 03:30:50,521][03784] Fps is (10 sec: 62258.5, 60 sec: 49698.1, 300 sec: 46874.9). Total num frames: 799440896. Throughput: 0: 45639.9. Samples: 800653600. Policy #0 lag: (min: 0.0, avg: 39.3, max: 89.0) [2024-03-21 03:30:50,522][03784] Avg episode reward: [(0, '1.561')] [2024-03-21 03:30:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 50244.4, 300 sec: 46541.7). Total num frames: 799670272. Throughput: 0: 45877.9. Samples: 800919000. Policy #0 lag: (min: 0.0, avg: 39.3, max: 89.0) [2024-03-21 03:30:55,522][03784] Avg episode reward: [(0, '0.737')] [2024-03-21 03:30:56,365][04017] Updated weights for policy 0, policy_version 24407 (0.0012) [2024-03-21 03:31:00,305][03995] Signal inference workers to stop experience collection... (16100 times) [2024-03-21 03:31:00,323][03995] Signal inference workers to resume experience collection... (16100 times) [2024-03-21 03:31:00,389][04017] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-03-21 03:31:00,389][04017] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-03-21 03:31:00,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 799932416. Throughput: 0: 45471.0. Samples: 801047200. Policy #0 lag: (min: 0.0, avg: 41.2, max: 100.0) [2024-03-21 03:31:00,522][03784] Avg episode reward: [(0, '0.929')] [2024-03-21 03:31:02,785][04017] Updated weights for policy 0, policy_version 24417 (0.0020) [2024-03-21 03:31:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 800129024. Throughput: 0: 45008.9. Samples: 801330400. 
Policy #0 lag: (min: 0.0, avg: 41.2, max: 100.0) [2024-03-21 03:31:05,522][03784] Avg episode reward: [(0, '0.603')] [2024-03-21 03:31:10,521][03784] Fps is (10 sec: 36044.9, 60 sec: 41506.2, 300 sec: 46763.8). Total num frames: 800292864. Throughput: 0: 45753.3. Samples: 801626700. Policy #0 lag: (min: 0.0, avg: 41.2, max: 100.0) [2024-03-21 03:31:10,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 03:31:13,750][04017] Updated weights for policy 0, policy_version 24427 (0.0017) [2024-03-21 03:31:15,521][03784] Fps is (10 sec: 36045.1, 60 sec: 41506.2, 300 sec: 46986.0). Total num frames: 800489472. Throughput: 0: 45909.0. Samples: 801772400. Policy #0 lag: (min: 0.0, avg: 41.2, max: 100.0) [2024-03-21 03:31:15,530][03784] Avg episode reward: [(0, '0.739')] [2024-03-21 03:31:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 46986.0). Total num frames: 800686080. Throughput: 0: 46464.4. Samples: 802072000. Policy #0 lag: (min: 1.0, avg: 30.4, max: 64.0) [2024-03-21 03:31:20,531][03784] Avg episode reward: [(0, '1.710')] [2024-03-21 03:31:22,403][04017] Updated weights for policy 0, policy_version 24437 (0.0015) [2024-03-21 03:31:25,521][03784] Fps is (10 sec: 55705.1, 60 sec: 44236.7, 300 sec: 46986.0). Total num frames: 801046528. Throughput: 0: 46073.3. Samples: 802314600. Policy #0 lag: (min: 1.0, avg: 30.4, max: 64.0) [2024-03-21 03:31:25,522][03784] Avg episode reward: [(0, '1.440')] [2024-03-21 03:31:25,998][04017] Updated weights for policy 0, policy_version 24447 (0.0019) [2024-03-21 03:31:30,521][03784] Fps is (10 sec: 55705.9, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 801243136. Throughput: 0: 46308.9. Samples: 802464200. Policy #0 lag: (min: 1.0, avg: 30.4, max: 64.0) [2024-03-21 03:31:30,522][03784] Avg episode reward: [(0, '1.392')] [2024-03-21 03:31:33,210][04017] Updated weights for policy 0, policy_version 24457 (0.0024) [2024-03-21 03:31:35,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 46763.8). 
Total num frames: 801505280. Throughput: 0: 46200.2. Samples: 802732600. Policy #0 lag: (min: 1.0, avg: 52.1, max: 105.0) [2024-03-21 03:31:35,522][03784] Avg episode reward: [(0, '0.722')] [2024-03-21 03:31:40,137][04017] Updated weights for policy 0, policy_version 24467 (0.0025) [2024-03-21 03:31:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 801767424. Throughput: 0: 46433.3. Samples: 803008500. Policy #0 lag: (min: 1.0, avg: 52.1, max: 105.0) [2024-03-21 03:31:40,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 03:31:45,521][03784] Fps is (10 sec: 49151.4, 60 sec: 47513.6, 300 sec: 46541.6). Total num frames: 801996800. Throughput: 0: 46675.6. Samples: 803147600. Policy #0 lag: (min: 1.0, avg: 52.1, max: 105.0) [2024-03-21 03:31:45,522][03784] Avg episode reward: [(0, '1.030')] [2024-03-21 03:31:46,123][04017] Updated weights for policy 0, policy_version 24477 (0.0012) [2024-03-21 03:31:50,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 802324480. Throughput: 0: 46433.3. Samples: 803419900. Policy #0 lag: (min: 1.0, avg: 52.1, max: 105.0) [2024-03-21 03:31:50,522][03784] Avg episode reward: [(0, '0.862')] [2024-03-21 03:31:51,556][04017] Updated weights for policy 0, policy_version 24487 (0.0015) [2024-03-21 03:31:52,040][03995] Signal inference workers to stop experience collection... (16150 times) [2024-03-21 03:31:52,047][03995] Signal inference workers to resume experience collection... (16150 times) [2024-03-21 03:31:52,131][04017] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-03-21 03:31:52,132][04017] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-03-21 03:31:55,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.4, 300 sec: 46097.4). Total num frames: 802488320. Throughput: 0: 46333.3. Samples: 803711700. 
Policy #0 lag: (min: 0.0, avg: 47.0, max: 92.0) [2024-03-21 03:31:55,522][03784] Avg episode reward: [(0, '0.815')] [2024-03-21 03:32:00,115][04017] Updated weights for policy 0, policy_version 24497 (0.0015) [2024-03-21 03:32:00,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 802750464. Throughput: 0: 46139.9. Samples: 803848700. Policy #0 lag: (min: 0.0, avg: 47.0, max: 92.0) [2024-03-21 03:32:00,522][03784] Avg episode reward: [(0, '1.184')] [2024-03-21 03:32:00,532][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024498_802750464.pth... [2024-03-21 03:32:00,648][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024155_791511040.pth [2024-03-21 03:32:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 802881536. Throughput: 0: 45833.3. Samples: 804134500. Policy #0 lag: (min: 0.0, avg: 47.0, max: 92.0) [2024-03-21 03:32:05,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 03:32:10,153][04017] Updated weights for policy 0, policy_version 24507 (0.0010) [2024-03-21 03:32:10,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 803045376. Throughput: 0: 46633.4. Samples: 804413100. Policy #0 lag: (min: 0.0, avg: 30.0, max: 85.0) [2024-03-21 03:32:10,522][03784] Avg episode reward: [(0, '1.315')] [2024-03-21 03:32:14,429][04017] Updated weights for policy 0, policy_version 24517 (0.0020) [2024-03-21 03:32:15,521][03784] Fps is (10 sec: 49152.4, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 803373056. Throughput: 0: 46162.3. Samples: 804541500. Policy #0 lag: (min: 0.0, avg: 30.0, max: 85.0) [2024-03-21 03:32:15,522][03784] Avg episode reward: [(0, '1.271')] [2024-03-21 03:32:20,521][03784] Fps is (10 sec: 58982.4, 60 sec: 49152.1, 300 sec: 47208.1). Total num frames: 803635200. Throughput: 0: 46668.9. Samples: 804832700. 
Policy #0 lag: (min: 0.0, avg: 30.0, max: 85.0) [2024-03-21 03:32:20,522][03784] Avg episode reward: [(0, '1.520')] [2024-03-21 03:32:25,300][04017] Updated weights for policy 0, policy_version 24527 (0.0013) [2024-03-21 03:32:25,521][03784] Fps is (10 sec: 32767.7, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 803700736. Throughput: 0: 47717.7. Samples: 805155800. Policy #0 lag: (min: 0.0, avg: 34.9, max: 91.0) [2024-03-21 03:32:25,522][03784] Avg episode reward: [(0, '1.520')] [2024-03-21 03:32:30,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 803962880. Throughput: 0: 47588.8. Samples: 805289100. Policy #0 lag: (min: 0.0, avg: 34.9, max: 91.0) [2024-03-21 03:32:30,522][03784] Avg episode reward: [(0, '0.537')] [2024-03-21 03:32:31,032][04017] Updated weights for policy 0, policy_version 24537 (0.0012) [2024-03-21 03:32:35,521][03784] Fps is (10 sec: 55706.4, 60 sec: 45875.2, 300 sec: 47097.1). Total num frames: 804257792. Throughput: 0: 48109.0. Samples: 805584800. Policy #0 lag: (min: 0.0, avg: 34.9, max: 91.0) [2024-03-21 03:32:35,522][03784] Avg episode reward: [(0, '1.346')] [2024-03-21 03:32:36,188][04017] Updated weights for policy 0, policy_version 24547 (0.0027) [2024-03-21 03:32:40,521][03784] Fps is (10 sec: 58983.1, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 804552704. Throughput: 0: 47522.3. Samples: 805850200. Policy #0 lag: (min: 0.0, avg: 34.9, max: 91.0) [2024-03-21 03:32:40,522][03784] Avg episode reward: [(0, '0.572')] [2024-03-21 03:32:43,222][04017] Updated weights for policy 0, policy_version 24557 (0.0011) [2024-03-21 03:32:45,521][03784] Fps is (10 sec: 49151.6, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 804749312. Throughput: 0: 47815.6. Samples: 806000400. 
Policy #0 lag: (min: 1.0, avg: 45.1, max: 91.0) [2024-03-21 03:32:45,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 03:32:49,886][04017] Updated weights for policy 0, policy_version 24567 (0.0011) [2024-03-21 03:32:50,521][03784] Fps is (10 sec: 49151.6, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 805044224. Throughput: 0: 48144.4. Samples: 806301000. Policy #0 lag: (min: 1.0, avg: 45.1, max: 91.0) [2024-03-21 03:32:50,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 03:32:53,192][03995] Signal inference workers to stop experience collection... (16200 times) [2024-03-21 03:32:53,249][04017] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-03-21 03:32:53,272][03995] Signal inference workers to resume experience collection... (16200 times) [2024-03-21 03:32:53,302][04017] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-03-21 03:32:54,743][04017] Updated weights for policy 0, policy_version 24577 (0.0015) [2024-03-21 03:32:55,521][03784] Fps is (10 sec: 62259.2, 60 sec: 48059.8, 300 sec: 47097.1). Total num frames: 805371904. Throughput: 0: 48542.2. Samples: 806597500. Policy #0 lag: (min: 1.0, avg: 45.1, max: 91.0) [2024-03-21 03:32:55,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 03:33:00,521][03784] Fps is (10 sec: 55705.7, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 805601280. Throughput: 0: 49051.0. Samples: 806748800. Policy #0 lag: (min: 3.0, avg: 42.5, max: 122.0) [2024-03-21 03:33:00,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 03:33:00,954][04017] Updated weights for policy 0, policy_version 24587 (0.0010) [2024-03-21 03:33:05,521][03784] Fps is (10 sec: 55706.0, 60 sec: 50790.5, 300 sec: 47430.3). Total num frames: 805928960. Throughput: 0: 48633.4. Samples: 807021200. 
Policy #0 lag: (min: 3.0, avg: 42.5, max: 122.0) [2024-03-21 03:33:05,522][03784] Avg episode reward: [(0, '0.524')] [2024-03-21 03:33:07,225][04017] Updated weights for policy 0, policy_version 24597 (0.0011) [2024-03-21 03:33:10,521][03784] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 47430.3). Total num frames: 806092800. Throughput: 0: 47722.3. Samples: 807303300. Policy #0 lag: (min: 3.0, avg: 42.5, max: 122.0) [2024-03-21 03:33:10,522][03784] Avg episode reward: [(0, '0.957')] [2024-03-21 03:33:15,521][03784] Fps is (10 sec: 29491.0, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 806223872. Throughput: 0: 48297.9. Samples: 807462500. Policy #0 lag: (min: 3.0, avg: 42.5, max: 122.0) [2024-03-21 03:33:15,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 03:33:18,724][04017] Updated weights for policy 0, policy_version 24607 (0.0011) [2024-03-21 03:33:20,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.3, 300 sec: 46986.0). Total num frames: 806420480. Throughput: 0: 48397.7. Samples: 807762700. Policy #0 lag: (min: 0.0, avg: 36.6, max: 84.0) [2024-03-21 03:33:20,522][03784] Avg episode reward: [(0, '1.479')] [2024-03-21 03:33:24,050][04017] Updated weights for policy 0, policy_version 24617 (0.0011) [2024-03-21 03:33:25,521][03784] Fps is (10 sec: 55704.8, 60 sec: 51336.5, 300 sec: 47208.1). Total num frames: 806780928. Throughput: 0: 48575.4. Samples: 808036100. Policy #0 lag: (min: 0.0, avg: 36.6, max: 84.0) [2024-03-21 03:33:25,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 03:33:28,502][04017] Updated weights for policy 0, policy_version 24627 (0.0010) [2024-03-21 03:33:30,521][03784] Fps is (10 sec: 58982.5, 60 sec: 50790.5, 300 sec: 47208.1). Total num frames: 807010304. Throughput: 0: 48351.1. Samples: 808176200. Policy #0 lag: (min: 0.0, avg: 36.6, max: 84.0) [2024-03-21 03:33:30,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 03:33:35,521][03784] Fps is (10 sec: 32768.6, 60 sec: 47513.6, 300 sec: 45986.3). 
Total num frames: 807108608. Throughput: 0: 47846.8. Samples: 808454100. Policy #0 lag: (min: 0.0, avg: 36.6, max: 84.0) [2024-03-21 03:33:35,522][03784] Avg episode reward: [(0, '0.587')] [2024-03-21 03:33:38,436][04017] Updated weights for policy 0, policy_version 24637 (0.0011) [2024-03-21 03:33:40,521][03784] Fps is (10 sec: 36044.3, 60 sec: 46967.4, 300 sec: 46430.6). Total num frames: 807370752. Throughput: 0: 46993.2. Samples: 808712200. Policy #0 lag: (min: 0.0, avg: 32.8, max: 80.0) [2024-03-21 03:33:40,522][03784] Avg episode reward: [(0, '0.758')] [2024-03-21 03:33:44,182][04017] Updated weights for policy 0, policy_version 24647 (0.0021) [2024-03-21 03:33:45,521][03784] Fps is (10 sec: 62258.3, 60 sec: 49698.1, 300 sec: 47208.1). Total num frames: 807731200. Throughput: 0: 46668.8. Samples: 808848900. Policy #0 lag: (min: 0.0, avg: 32.8, max: 80.0) [2024-03-21 03:33:45,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 03:33:45,586][03995] Signal inference workers to stop experience collection... (16250 times) [2024-03-21 03:33:45,672][04017] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-03-21 03:33:45,711][03995] Signal inference workers to resume experience collection... (16250 times) [2024-03-21 03:33:45,732][04017] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-03-21 03:33:47,542][04017] Updated weights for policy 0, policy_version 24657 (0.0014) [2024-03-21 03:33:50,521][03784] Fps is (10 sec: 68814.0, 60 sec: 50244.4, 300 sec: 47985.7). Total num frames: 808058880. Throughput: 0: 46655.5. Samples: 809120700. Policy #0 lag: (min: 0.0, avg: 32.8, max: 80.0) [2024-03-21 03:33:50,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 03:33:55,521][03784] Fps is (10 sec: 42599.2, 60 sec: 46421.4, 300 sec: 47652.5). Total num frames: 808157184. Throughput: 0: 47068.9. Samples: 809421400. 
Policy #0 lag: (min: 0.0, avg: 40.2, max: 84.0) [2024-03-21 03:33:55,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 03:33:58,005][04017] Updated weights for policy 0, policy_version 24667 (0.0015) [2024-03-21 03:34:00,521][03784] Fps is (10 sec: 26213.7, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 808321024. Throughput: 0: 46455.3. Samples: 809553000. Policy #0 lag: (min: 0.0, avg: 40.2, max: 84.0) [2024-03-21 03:34:00,522][03784] Avg episode reward: [(0, '0.572')] [2024-03-21 03:34:00,769][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024669_808353792.pth... [2024-03-21 03:34:00,902][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024324_797048832.pth [2024-03-21 03:34:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 47319.2). Total num frames: 808583168. Throughput: 0: 46233.4. Samples: 809843200. Policy #0 lag: (min: 0.0, avg: 40.2, max: 84.0) [2024-03-21 03:34:05,522][03784] Avg episode reward: [(0, '0.798')] [2024-03-21 03:34:06,228][04017] Updated weights for policy 0, policy_version 24677 (0.0015) [2024-03-21 03:34:10,521][03784] Fps is (10 sec: 52429.8, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 808845312. Throughput: 0: 46333.5. Samples: 810121100. Policy #0 lag: (min: 0.0, avg: 40.2, max: 84.0) [2024-03-21 03:34:10,522][03784] Avg episode reward: [(0, '1.531')] [2024-03-21 03:34:13,182][04017] Updated weights for policy 0, policy_version 24687 (0.0011) [2024-03-21 03:34:15,521][03784] Fps is (10 sec: 55705.2, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 809140224. Throughput: 0: 46493.3. Samples: 810268400. Policy #0 lag: (min: 0.0, avg: 37.7, max: 75.0) [2024-03-21 03:34:15,522][03784] Avg episode reward: [(0, '0.982')] [2024-03-21 03:34:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 809205760. Throughput: 0: 46564.4. Samples: 810549500. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 75.0) [2024-03-21 03:34:20,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 03:34:22,711][04017] Updated weights for policy 0, policy_version 24697 (0.0015) [2024-03-21 03:34:25,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.2, 300 sec: 46652.8). Total num frames: 809500672. Throughput: 0: 46880.2. Samples: 810821800. Policy #0 lag: (min: 0.0, avg: 37.7, max: 75.0) [2024-03-21 03:34:25,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 03:34:27,652][04017] Updated weights for policy 0, policy_version 24707 (0.0015) [2024-03-21 03:34:30,521][03784] Fps is (10 sec: 58982.4, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 809795584. Throughput: 0: 46831.2. Samples: 810956300. Policy #0 lag: (min: 0.0, avg: 34.6, max: 81.0) [2024-03-21 03:34:30,522][03784] Avg episode reward: [(0, '1.091')] [2024-03-21 03:34:33,665][04017] Updated weights for policy 0, policy_version 24717 (0.0017) [2024-03-21 03:34:35,521][03784] Fps is (10 sec: 55704.9, 60 sec: 49151.9, 300 sec: 46986.0). Total num frames: 810057728. Throughput: 0: 46737.6. Samples: 811223900. Policy #0 lag: (min: 0.0, avg: 34.6, max: 81.0) [2024-03-21 03:34:35,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 03:34:38,046][03995] Signal inference workers to stop experience collection... (16300 times) [2024-03-21 03:34:38,127][03995] Signal inference workers to resume experience collection... (16300 times) [2024-03-21 03:34:38,129][04017] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-03-21 03:34:38,181][04017] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-03-21 03:34:38,480][04017] Updated weights for policy 0, policy_version 24727 (0.0017) [2024-03-21 03:34:40,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49152.1, 300 sec: 47541.4). Total num frames: 810319872. Throughput: 0: 46404.3. Samples: 811509600. 
Policy #0 lag: (min: 0.0, avg: 34.6, max: 81.0) [2024-03-21 03:34:40,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 03:34:45,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45875.3, 300 sec: 47541.4). Total num frames: 810483712. Throughput: 0: 46578.0. Samples: 811649000. Policy #0 lag: (min: 0.0, avg: 34.6, max: 81.0) [2024-03-21 03:34:45,522][03784] Avg episode reward: [(0, '0.996')] [2024-03-21 03:34:47,336][04017] Updated weights for policy 0, policy_version 24737 (0.0011) [2024-03-21 03:34:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.6, 300 sec: 47541.4). Total num frames: 810680320. Throughput: 0: 45986.6. Samples: 811912600. Policy #0 lag: (min: 0.0, avg: 38.1, max: 86.0) [2024-03-21 03:34:50,522][03784] Avg episode reward: [(0, '0.879')] [2024-03-21 03:34:54,533][04017] Updated weights for policy 0, policy_version 24747 (0.0015) [2024-03-21 03:34:55,521][03784] Fps is (10 sec: 52428.7, 60 sec: 47513.5, 300 sec: 47430.3). Total num frames: 811008000. Throughput: 0: 46006.7. Samples: 812191400. Policy #0 lag: (min: 0.0, avg: 38.1, max: 86.0) [2024-03-21 03:34:55,522][03784] Avg episode reward: [(0, '1.086')] [2024-03-21 03:35:00,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48059.9, 300 sec: 46763.8). Total num frames: 811204608. Throughput: 0: 45931.1. Samples: 812335300. Policy #0 lag: (min: 0.0, avg: 38.1, max: 86.0) [2024-03-21 03:35:00,522][03784] Avg episode reward: [(0, '1.327')] [2024-03-21 03:35:01,194][04017] Updated weights for policy 0, policy_version 24757 (0.0012) [2024-03-21 03:35:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 47513.5, 300 sec: 46208.4). Total num frames: 811433984. Throughput: 0: 46275.5. Samples: 812631900. Policy #0 lag: (min: 0.0, avg: 43.7, max: 107.0) [2024-03-21 03:35:05,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 03:35:07,742][04017] Updated weights for policy 0, policy_version 24767 (0.0015) [2024-03-21 03:35:10,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46967.5, 300 sec: 46319.5). 
Total num frames: 811663360. Throughput: 0: 46606.7. Samples: 812919100. Policy #0 lag: (min: 0.0, avg: 43.7, max: 107.0) [2024-03-21 03:35:10,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 03:35:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.7, 300 sec: 46319.5). Total num frames: 811794432. Throughput: 0: 46611.1. Samples: 813053800. Policy #0 lag: (min: 0.0, avg: 43.7, max: 107.0) [2024-03-21 03:35:15,522][03784] Avg episode reward: [(0, '1.411')] [2024-03-21 03:35:17,402][04017] Updated weights for policy 0, policy_version 24777 (0.0010) [2024-03-21 03:35:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 812122112. Throughput: 0: 46844.5. Samples: 813331900. Policy #0 lag: (min: 1.0, avg: 31.5, max: 64.0) [2024-03-21 03:35:20,522][03784] Avg episode reward: [(0, '1.050')] [2024-03-21 03:35:21,404][04017] Updated weights for policy 0, policy_version 24787 (0.0011) [2024-03-21 03:35:25,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48059.6, 300 sec: 46874.9). Total num frames: 812384256. Throughput: 0: 46535.4. Samples: 813603700. Policy #0 lag: (min: 1.0, avg: 31.5, max: 64.0) [2024-03-21 03:35:25,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 03:35:28,247][04017] Updated weights for policy 0, policy_version 24797 (0.0020) [2024-03-21 03:35:30,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49152.0, 300 sec: 47985.7). Total num frames: 812744704. Throughput: 0: 46453.3. Samples: 813739400. Policy #0 lag: (min: 1.0, avg: 31.5, max: 64.0) [2024-03-21 03:35:30,522][03784] Avg episode reward: [(0, '1.218')] [2024-03-21 03:35:35,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46421.4, 300 sec: 47541.4). Total num frames: 812843008. Throughput: 0: 47222.2. Samples: 814037600. 
Policy #0 lag: (min: 1.0, avg: 31.5, max: 64.0) [2024-03-21 03:35:35,522][03784] Avg episode reward: [(0, '1.260')] [2024-03-21 03:35:35,915][04017] Updated weights for policy 0, policy_version 24807 (0.0016) [2024-03-21 03:35:40,493][03995] Signal inference workers to stop experience collection... (16350 times) [2024-03-21 03:35:40,521][03784] Fps is (10 sec: 22937.9, 60 sec: 44236.9, 300 sec: 46874.9). Total num frames: 812974080. Throughput: 0: 47497.9. Samples: 814328800. Policy #0 lag: (min: 1.0, avg: 31.4, max: 62.0) [2024-03-21 03:35:40,522][03784] Avg episode reward: [(0, '1.276')] [2024-03-21 03:35:40,564][03995] Signal inference workers to resume experience collection... (16350 times) [2024-03-21 03:35:40,578][04017] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-03-21 03:35:40,620][04017] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-03-21 03:35:43,255][04017] Updated weights for policy 0, policy_version 24817 (0.0013) [2024-03-21 03:35:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 813301760. Throughput: 0: 47486.7. Samples: 814472200. Policy #0 lag: (min: 1.0, avg: 31.4, max: 62.0) [2024-03-21 03:35:45,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 03:35:50,112][04017] Updated weights for policy 0, policy_version 24827 (0.0015) [2024-03-21 03:35:50,521][03784] Fps is (10 sec: 58981.1, 60 sec: 48059.6, 300 sec: 47097.0). Total num frames: 813563904. Throughput: 0: 47613.3. Samples: 814774500. Policy #0 lag: (min: 1.0, avg: 31.4, max: 62.0) [2024-03-21 03:35:50,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 03:35:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 813727744. Throughput: 0: 47695.5. Samples: 815065400. 
Policy #0 lag: (min: 0.0, avg: 51.7, max: 110.0) [2024-03-21 03:35:55,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 03:35:58,079][04017] Updated weights for policy 0, policy_version 24837 (0.0011) [2024-03-21 03:36:00,521][03784] Fps is (10 sec: 39322.2, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 813957120. Throughput: 0: 48037.9. Samples: 815215500. Policy #0 lag: (min: 0.0, avg: 51.7, max: 110.0) [2024-03-21 03:36:00,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 03:36:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024840_813957120.pth... [2024-03-21 03:36:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024498_802750464.pth [2024-03-21 03:36:04,743][04017] Updated weights for policy 0, policy_version 24847 (0.0010) [2024-03-21 03:36:05,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 814252032. Throughput: 0: 48462.2. Samples: 815512700. Policy #0 lag: (min: 0.0, avg: 51.7, max: 110.0) [2024-03-21 03:36:05,522][03784] Avg episode reward: [(0, '1.318')] [2024-03-21 03:36:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 814481408. Throughput: 0: 48409.1. Samples: 815782100. Policy #0 lag: (min: 0.0, avg: 51.7, max: 110.0) [2024-03-21 03:36:10,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 03:36:11,777][04017] Updated weights for policy 0, policy_version 24857 (0.0014) [2024-03-21 03:36:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 814678016. Throughput: 0: 48740.0. Samples: 815932700. Policy #0 lag: (min: 0.0, avg: 37.2, max: 115.0) [2024-03-21 03:36:15,522][03784] Avg episode reward: [(0, '1.043')] [2024-03-21 03:36:17,532][04017] Updated weights for policy 0, policy_version 24867 (0.0011) [2024-03-21 03:36:20,521][03784] Fps is (10 sec: 49151.8, 60 sec: 47513.6, 300 sec: 47208.1). 
Total num frames: 814972928. Throughput: 0: 48462.2. Samples: 816218400. Policy #0 lag: (min: 0.0, avg: 37.2, max: 115.0) [2024-03-21 03:36:20,522][03784] Avg episode reward: [(0, '0.714')] [2024-03-21 03:36:22,971][04017] Updated weights for policy 0, policy_version 24877 (0.0023) [2024-03-21 03:36:25,521][03784] Fps is (10 sec: 58982.2, 60 sec: 48059.8, 300 sec: 47541.4). Total num frames: 815267840. Throughput: 0: 48491.0. Samples: 816510900. Policy #0 lag: (min: 0.0, avg: 37.2, max: 115.0) [2024-03-21 03:36:25,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 03:36:29,294][04017] Updated weights for policy 0, policy_version 24887 (0.0015) [2024-03-21 03:36:30,521][03784] Fps is (10 sec: 52429.1, 60 sec: 45875.3, 300 sec: 47430.3). Total num frames: 815497216. Throughput: 0: 48537.8. Samples: 816656400. Policy #0 lag: (min: 0.0, avg: 37.2, max: 115.0) [2024-03-21 03:36:30,522][03784] Avg episode reward: [(0, '1.397')] [2024-03-21 03:36:35,521][03784] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 815759360. Throughput: 0: 48060.2. Samples: 816937200. Policy #0 lag: (min: 0.0, avg: 42.7, max: 90.0) [2024-03-21 03:36:35,522][03784] Avg episode reward: [(0, '1.397')] [2024-03-21 03:36:39,235][04017] Updated weights for policy 0, policy_version 24897 (0.0020) [2024-03-21 03:36:40,111][03995] Signal inference workers to stop experience collection... (16400 times) [2024-03-21 03:36:40,115][03995] Signal inference workers to resume experience collection... (16400 times) [2024-03-21 03:36:40,184][04017] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-03-21 03:36:40,184][04017] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-03-21 03:36:40,521][03784] Fps is (10 sec: 42598.3, 60 sec: 49151.9, 300 sec: 47208.1). Total num frames: 815923200. Throughput: 0: 48046.6. Samples: 817227500. 
Policy #0 lag: (min: 0.0, avg: 42.7, max: 90.0) [2024-03-21 03:36:40,522][03784] Avg episode reward: [(0, '1.576')] [2024-03-21 03:36:44,181][04017] Updated weights for policy 0, policy_version 24907 (0.0016) [2024-03-21 03:36:45,521][03784] Fps is (10 sec: 39321.7, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 816152576. Throughput: 0: 47397.8. Samples: 817348400. Policy #0 lag: (min: 0.0, avg: 42.7, max: 90.0) [2024-03-21 03:36:45,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 03:36:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 47513.7, 300 sec: 47208.1). Total num frames: 816414720. Throughput: 0: 46895.5. Samples: 817623000. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 03:36:50,522][03784] Avg episode reward: [(0, '0.887')] [2024-03-21 03:36:51,079][04017] Updated weights for policy 0, policy_version 24917 (0.0012) [2024-03-21 03:36:55,521][03784] Fps is (10 sec: 55705.0, 60 sec: 49698.0, 300 sec: 47319.2). Total num frames: 816709632. Throughput: 0: 46937.7. Samples: 817894300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 03:36:55,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 03:36:58,087][04017] Updated weights for policy 0, policy_version 24927 (0.0018) [2024-03-21 03:37:00,521][03784] Fps is (10 sec: 52428.4, 60 sec: 49698.1, 300 sec: 47652.4). Total num frames: 816939008. Throughput: 0: 46548.8. Samples: 818027400. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 03:37:00,522][03784] Avg episode reward: [(0, '1.437')] [2024-03-21 03:37:05,202][04017] Updated weights for policy 0, policy_version 24937 (0.0012) [2024-03-21 03:37:05,521][03784] Fps is (10 sec: 42598.9, 60 sec: 48059.7, 300 sec: 47763.5). Total num frames: 817135616. Throughput: 0: 46706.7. Samples: 818320200. Policy #0 lag: (min: 0.0, avg: 41.6, max: 94.0) [2024-03-21 03:37:05,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 03:37:10,521][03784] Fps is (10 sec: 39321.4, 60 sec: 47513.5, 300 sec: 47319.2). 
Total num frames: 817332224. Throughput: 0: 46148.8. Samples: 818587600. Policy #0 lag: (min: 0.0, avg: 41.6, max: 94.0) [2024-03-21 03:37:10,522][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 03:37:15,192][04017] Updated weights for policy 0, policy_version 24947 (0.0015) [2024-03-21 03:37:15,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 817463296. Throughput: 0: 46266.6. Samples: 818738400. Policy #0 lag: (min: 0.0, avg: 41.6, max: 94.0) [2024-03-21 03:37:15,522][03784] Avg episode reward: [(0, '0.953')] [2024-03-21 03:37:20,521][03784] Fps is (10 sec: 26214.8, 60 sec: 43690.7, 300 sec: 47097.1). Total num frames: 817594368. Throughput: 0: 46582.2. Samples: 819033400. Policy #0 lag: (min: 0.0, avg: 41.6, max: 94.0) [2024-03-21 03:37:20,522][03784] Avg episode reward: [(0, '1.565')] [2024-03-21 03:37:24,901][04017] Updated weights for policy 0, policy_version 24957 (0.0011) [2024-03-21 03:37:25,521][03784] Fps is (10 sec: 36044.9, 60 sec: 42598.5, 300 sec: 46986.0). Total num frames: 817823744. Throughput: 0: 46620.0. Samples: 819325400. Policy #0 lag: (min: 1.0, avg: 20.2, max: 52.0) [2024-03-21 03:37:25,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 03:37:29,560][04017] Updated weights for policy 0, policy_version 24967 (0.0015) [2024-03-21 03:37:30,521][03784] Fps is (10 sec: 55704.9, 60 sec: 44236.7, 300 sec: 47097.0). Total num frames: 818151424. Throughput: 0: 47046.6. Samples: 819465500. Policy #0 lag: (min: 1.0, avg: 20.2, max: 52.0) [2024-03-21 03:37:30,522][03784] Avg episode reward: [(0, '1.161')] [2024-03-21 03:37:33,027][03995] Signal inference workers to stop experience collection... (16450 times) [2024-03-21 03:37:33,027][03995] Signal inference workers to resume experience collection... 
(16450 times) [2024-03-21 03:37:33,081][04017] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-03-21 03:37:33,081][04017] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-03-21 03:37:34,878][04017] Updated weights for policy 0, policy_version 24977 (0.0011) [2024-03-21 03:37:35,521][03784] Fps is (10 sec: 65535.6, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 818479104. Throughput: 0: 46786.6. Samples: 819728400. Policy #0 lag: (min: 1.0, avg: 20.2, max: 52.0) [2024-03-21 03:37:35,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 03:37:40,521][03784] Fps is (10 sec: 58983.1, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 818741248. Throughput: 0: 46526.8. Samples: 819988000. Policy #0 lag: (min: 0.0, avg: 42.0, max: 105.0) [2024-03-21 03:37:40,522][03784] Avg episode reward: [(0, '0.519')] [2024-03-21 03:37:40,568][04017] Updated weights for policy 0, policy_version 24987 (0.0021) [2024-03-21 03:37:45,521][03784] Fps is (10 sec: 49152.2, 60 sec: 46967.5, 300 sec: 47208.2). Total num frames: 818970624. Throughput: 0: 46835.6. Samples: 820135000. Policy #0 lag: (min: 0.0, avg: 42.0, max: 105.0) [2024-03-21 03:37:45,522][03784] Avg episode reward: [(0, '1.357')] [2024-03-21 03:37:47,275][04017] Updated weights for policy 0, policy_version 24997 (0.0015) [2024-03-21 03:37:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 819265536. Throughput: 0: 46206.7. Samples: 820399500. Policy #0 lag: (min: 0.0, avg: 42.0, max: 105.0) [2024-03-21 03:37:50,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 03:37:52,088][04017] Updated weights for policy 0, policy_version 25007 (0.0016) [2024-03-21 03:37:55,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 819625984. Throughput: 0: 46097.8. Samples: 820662000. 
Policy #0 lag: (min: 0.0, avg: 42.0, max: 105.0) [2024-03-21 03:37:55,522][03784] Avg episode reward: [(0, '0.956')] [2024-03-21 03:37:59,937][04017] Updated weights for policy 0, policy_version 25017 (0.0011) [2024-03-21 03:38:00,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.6, 300 sec: 46874.9). Total num frames: 819757056. Throughput: 0: 45957.8. Samples: 820806500. Policy #0 lag: (min: 0.0, avg: 50.6, max: 93.0) [2024-03-21 03:38:00,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 03:38:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025017_819757056.pth... [2024-03-21 03:38:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024669_808353792.pth [2024-03-21 03:38:05,521][03784] Fps is (10 sec: 22937.7, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 819855360. Throughput: 0: 45664.4. Samples: 821088300. Policy #0 lag: (min: 0.0, avg: 50.6, max: 93.0) [2024-03-21 03:38:05,522][03784] Avg episode reward: [(0, '1.021')] [2024-03-21 03:38:10,297][04017] Updated weights for policy 0, policy_version 25027 (0.0015) [2024-03-21 03:38:10,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 820084736. Throughput: 0: 45231.1. Samples: 821360800. Policy #0 lag: (min: 0.0, avg: 50.6, max: 93.0) [2024-03-21 03:38:10,522][03784] Avg episode reward: [(0, '0.679')] [2024-03-21 03:38:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 820215808. Throughput: 0: 45544.6. Samples: 821515000. Policy #0 lag: (min: 0.0, avg: 34.5, max: 84.0) [2024-03-21 03:38:15,522][03784] Avg episode reward: [(0, '0.568')] [2024-03-21 03:38:19,488][04017] Updated weights for policy 0, policy_version 25037 (0.0019) [2024-03-21 03:38:20,521][03784] Fps is (10 sec: 42598.1, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 820510720. Throughput: 0: 46377.8. Samples: 821815400. 
Policy #0 lag: (min: 0.0, avg: 34.5, max: 84.0) [2024-03-21 03:38:20,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 03:38:25,521][03784] Fps is (10 sec: 49151.6, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 820707328. Throughput: 0: 46682.1. Samples: 822088700. Policy #0 lag: (min: 0.0, avg: 34.5, max: 84.0) [2024-03-21 03:38:25,522][03784] Avg episode reward: [(0, '0.768')] [2024-03-21 03:38:25,740][04017] Updated weights for policy 0, policy_version 25047 (0.0016) [2024-03-21 03:38:29,612][03995] Signal inference workers to stop experience collection... (16500 times) [2024-03-21 03:38:29,681][03995] Signal inference workers to resume experience collection... (16500 times) [2024-03-21 03:38:29,703][04017] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-03-21 03:38:29,761][04017] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-03-21 03:38:30,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 820969472. Throughput: 0: 46106.6. Samples: 822209800. Policy #0 lag: (min: 0.0, avg: 34.5, max: 84.0) [2024-03-21 03:38:30,522][03784] Avg episode reward: [(0, '1.578')] [2024-03-21 03:38:32,341][04017] Updated weights for policy 0, policy_version 25057 (0.0015) [2024-03-21 03:38:35,521][03784] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 821231616. Throughput: 0: 46804.4. Samples: 822505700. Policy #0 lag: (min: 0.0, avg: 40.4, max: 88.0) [2024-03-21 03:38:35,522][03784] Avg episode reward: [(0, '1.262')] [2024-03-21 03:38:37,567][04017] Updated weights for policy 0, policy_version 25067 (0.0011) [2024-03-21 03:38:40,521][03784] Fps is (10 sec: 68812.1, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 821657600. Throughput: 0: 47140.0. Samples: 822783300. 
Policy #0 lag: (min: 0.0, avg: 40.4, max: 88.0) [2024-03-21 03:38:40,522][03784] Avg episode reward: [(0, '0.776')] [2024-03-21 03:38:41,105][04017] Updated weights for policy 0, policy_version 25077 (0.0016) [2024-03-21 03:38:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 48059.7, 300 sec: 46763.8). Total num frames: 821854208. Throughput: 0: 47242.2. Samples: 822932400. Policy #0 lag: (min: 0.0, avg: 40.4, max: 88.0) [2024-03-21 03:38:45,522][03784] Avg episode reward: [(0, '0.776')] [2024-03-21 03:38:50,521][03784] Fps is (10 sec: 32768.5, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 821985280. Throughput: 0: 47684.5. Samples: 823234100. Policy #0 lag: (min: 0.0, avg: 41.1, max: 92.0) [2024-03-21 03:38:50,522][03784] Avg episode reward: [(0, '0.793')] [2024-03-21 03:38:51,186][04017] Updated weights for policy 0, policy_version 25087 (0.0010) [2024-03-21 03:38:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.6, 300 sec: 47097.1). Total num frames: 822214656. Throughput: 0: 48153.3. Samples: 823527700. Policy #0 lag: (min: 0.0, avg: 41.1, max: 92.0) [2024-03-21 03:38:55,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 03:38:59,210][04017] Updated weights for policy 0, policy_version 25097 (0.0014) [2024-03-21 03:39:00,521][03784] Fps is (10 sec: 42598.0, 60 sec: 44236.7, 300 sec: 46874.9). Total num frames: 822411264. Throughput: 0: 47982.1. Samples: 823674200. Policy #0 lag: (min: 0.0, avg: 41.1, max: 92.0) [2024-03-21 03:39:00,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 03:39:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 822607872. Throughput: 0: 47684.5. Samples: 823961200. Policy #0 lag: (min: 0.0, avg: 41.1, max: 92.0) [2024-03-21 03:39:05,522][03784] Avg episode reward: [(0, '1.520')] [2024-03-21 03:39:08,175][04017] Updated weights for policy 0, policy_version 25107 (0.0011) [2024-03-21 03:39:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 46430.6). 
Total num frames: 822837248. Throughput: 0: 47793.4. Samples: 824239400. Policy #0 lag: (min: 0.0, avg: 24.1, max: 59.0) [2024-03-21 03:39:10,522][03784] Avg episode reward: [(0, '0.610')] [2024-03-21 03:39:12,355][04017] Updated weights for policy 0, policy_version 25117 (0.0018) [2024-03-21 03:39:15,263][03995] Signal inference workers to stop experience collection... (16550 times) [2024-03-21 03:39:15,324][03995] Signal inference workers to resume experience collection... (16550 times) [2024-03-21 03:39:15,357][04017] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-03-21 03:39:15,412][04017] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-03-21 03:39:15,521][03784] Fps is (10 sec: 65536.3, 60 sec: 50790.4, 300 sec: 47652.5). Total num frames: 823263232. Throughput: 0: 47717.9. Samples: 824357100. Policy #0 lag: (min: 0.0, avg: 24.1, max: 59.0) [2024-03-21 03:39:15,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 03:39:16,243][04017] Updated weights for policy 0, policy_version 25127 (0.0012) [2024-03-21 03:39:20,521][03784] Fps is (10 sec: 68812.7, 60 sec: 50244.3, 300 sec: 47541.4). Total num frames: 823525376. Throughput: 0: 47744.5. Samples: 824654200. Policy #0 lag: (min: 0.0, avg: 24.1, max: 59.0) [2024-03-21 03:39:20,522][03784] Avg episode reward: [(0, '0.596')] [2024-03-21 03:39:24,007][04017] Updated weights for policy 0, policy_version 25137 (0.0020) [2024-03-21 03:39:25,521][03784] Fps is (10 sec: 52428.4, 60 sec: 51336.6, 300 sec: 47430.3). Total num frames: 823787520. Throughput: 0: 47791.2. Samples: 824933900. Policy #0 lag: (min: 0.0, avg: 42.9, max: 79.0) [2024-03-21 03:39:25,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 03:39:29,533][04017] Updated weights for policy 0, policy_version 25147 (0.0011) [2024-03-21 03:39:30,521][03784] Fps is (10 sec: 52428.4, 60 sec: 51336.5, 300 sec: 47430.3). Total num frames: 824049664. Throughput: 0: 47704.4. Samples: 825079100. 
Policy #0 lag: (min: 0.0, avg: 42.9, max: 79.0) [2024-03-21 03:39:30,522][03784] Avg episode reward: [(0, '1.073')] [2024-03-21 03:39:35,521][03784] Fps is (10 sec: 39321.6, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 824180736. Throughput: 0: 47013.3. Samples: 825349700. Policy #0 lag: (min: 0.0, avg: 42.9, max: 79.0) [2024-03-21 03:39:35,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 03:39:39,578][04017] Updated weights for policy 0, policy_version 25157 (0.0010) [2024-03-21 03:39:40,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44783.0, 300 sec: 46986.0). Total num frames: 824344576. Throughput: 0: 47399.9. Samples: 825660700. Policy #0 lag: (min: 0.0, avg: 42.9, max: 79.0) [2024-03-21 03:39:40,523][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 03:39:45,521][03784] Fps is (10 sec: 32767.9, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 824508416. Throughput: 0: 47517.8. Samples: 825812500. Policy #0 lag: (min: 0.0, avg: 37.9, max: 90.0) [2024-03-21 03:39:45,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 03:39:48,926][04017] Updated weights for policy 0, policy_version 25167 (0.0011) [2024-03-21 03:39:50,521][03784] Fps is (10 sec: 39322.0, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 824737792. Throughput: 0: 47475.6. Samples: 826097600. Policy #0 lag: (min: 0.0, avg: 37.9, max: 90.0) [2024-03-21 03:39:50,522][03784] Avg episode reward: [(0, '0.810')] [2024-03-21 03:39:54,069][04017] Updated weights for policy 0, policy_version 25177 (0.0015) [2024-03-21 03:39:55,521][03784] Fps is (10 sec: 55705.8, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 825065472. Throughput: 0: 47368.9. Samples: 826371000. Policy #0 lag: (min: 0.0, avg: 37.9, max: 90.0) [2024-03-21 03:39:55,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 03:40:00,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 825229312. Throughput: 0: 47475.5. Samples: 826493500. 
Policy #0 lag: (min: 0.0, avg: 37.9, max: 90.0) [2024-03-21 03:40:00,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 03:40:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025184_825229312.pth... [2024-03-21 03:40:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000024840_813957120.pth [2024-03-21 03:40:03,090][04017] Updated weights for policy 0, policy_version 25187 (0.0011) [2024-03-21 03:40:05,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 825524224. Throughput: 0: 46882.2. Samples: 826763900. Policy #0 lag: (min: 0.0, avg: 34.0, max: 77.0) [2024-03-21 03:40:05,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 03:40:06,305][03995] Signal inference workers to stop experience collection... (16600 times) [2024-03-21 03:40:06,377][04017] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-03-21 03:40:06,388][03995] Signal inference workers to resume experience collection... (16600 times) [2024-03-21 03:40:06,426][04017] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-03-21 03:40:07,185][04017] Updated weights for policy 0, policy_version 25197 (0.0022) [2024-03-21 03:40:10,521][03784] Fps is (10 sec: 62259.1, 60 sec: 50244.3, 300 sec: 47652.5). Total num frames: 825851904. Throughput: 0: 46297.8. Samples: 827017300. Policy #0 lag: (min: 0.0, avg: 34.0, max: 77.0) [2024-03-21 03:40:10,522][03784] Avg episode reward: [(0, '0.910')] [2024-03-21 03:40:13,890][04017] Updated weights for policy 0, policy_version 25207 (0.0010) [2024-03-21 03:40:15,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45875.1, 300 sec: 47097.1). Total num frames: 826015744. Throughput: 0: 46253.4. Samples: 827160500. 
Policy #0 lag: (min: 0.0, avg: 34.0, max: 77.0) [2024-03-21 03:40:15,522][03784] Avg episode reward: [(0, '1.081')] [2024-03-21 03:40:19,033][04017] Updated weights for policy 0, policy_version 25217 (0.0010) [2024-03-21 03:40:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.3, 300 sec: 47208.2). Total num frames: 826310656. Throughput: 0: 46088.9. Samples: 827423700. Policy #0 lag: (min: 0.0, avg: 34.0, max: 77.0) [2024-03-21 03:40:20,522][03784] Avg episode reward: [(0, '0.507')] [2024-03-21 03:40:25,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 826507264. Throughput: 0: 45635.5. Samples: 827714300. Policy #0 lag: (min: 0.0, avg: 36.3, max: 95.0) [2024-03-21 03:40:25,522][03784] Avg episode reward: [(0, '0.959')] [2024-03-21 03:40:28,253][04017] Updated weights for policy 0, policy_version 25227 (0.0010) [2024-03-21 03:40:30,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 46986.0). Total num frames: 826703872. Throughput: 0: 45442.3. Samples: 827857400. Policy #0 lag: (min: 0.0, avg: 36.3, max: 95.0) [2024-03-21 03:40:30,522][03784] Avg episode reward: [(0, '1.649')] [2024-03-21 03:40:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 47097.0). Total num frames: 826867712. Throughput: 0: 45311.0. Samples: 828136600. Policy #0 lag: (min: 0.0, avg: 36.3, max: 95.0) [2024-03-21 03:40:35,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 03:40:40,164][04017] Updated weights for policy 0, policy_version 25237 (0.0010) [2024-03-21 03:40:40,521][03784] Fps is (10 sec: 29491.3, 60 sec: 44236.9, 300 sec: 46430.6). Total num frames: 826998784. Throughput: 0: 45453.4. Samples: 828416400. Policy #0 lag: (min: 0.0, avg: 30.7, max: 85.0) [2024-03-21 03:40:40,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 03:40:44,143][04017] Updated weights for policy 0, policy_version 25247 (0.0018) [2024-03-21 03:40:45,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46967.5, 300 sec: 46652.8). 
Total num frames: 827326464. Throughput: 0: 45268.9. Samples: 828530600. Policy #0 lag: (min: 0.0, avg: 30.7, max: 85.0) [2024-03-21 03:40:45,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 03:40:50,521][03784] Fps is (10 sec: 42597.9, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 827424768. Throughput: 0: 45831.1. Samples: 828826300. Policy #0 lag: (min: 0.0, avg: 30.7, max: 85.0) [2024-03-21 03:40:50,522][03784] Avg episode reward: [(0, '1.044')] [2024-03-21 03:40:53,576][04017] Updated weights for policy 0, policy_version 25257 (0.0015) [2024-03-21 03:40:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 827752448. Throughput: 0: 46366.6. Samples: 829103800. Policy #0 lag: (min: 0.0, avg: 30.7, max: 85.0) [2024-03-21 03:40:55,522][03784] Avg episode reward: [(0, '0.818')] [2024-03-21 03:41:00,407][04017] Updated weights for policy 0, policy_version 25267 (0.0023) [2024-03-21 03:41:00,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 827949056. Throughput: 0: 46688.9. Samples: 829261500. Policy #0 lag: (min: 0.0, avg: 37.7, max: 82.0) [2024-03-21 03:41:00,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 03:41:02,156][03995] Signal inference workers to stop experience collection... (16650 times) [2024-03-21 03:41:02,219][04017] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-03-21 03:41:02,426][03995] Signal inference workers to resume experience collection... (16650 times) [2024-03-21 03:41:02,426][04017] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-03-21 03:41:03,685][04017] Updated weights for policy 0, policy_version 25277 (0.0020) [2024-03-21 03:41:05,521][03784] Fps is (10 sec: 65535.8, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 828407808. Throughput: 0: 46491.0. Samples: 829515800. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 82.0) [2024-03-21 03:41:05,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 03:41:08,589][04017] Updated weights for policy 0, policy_version 25287 (0.0015) [2024-03-21 03:41:10,521][03784] Fps is (10 sec: 75366.0, 60 sec: 47513.5, 300 sec: 47541.4). Total num frames: 828702720. Throughput: 0: 45908.8. Samples: 829780200. Policy #0 lag: (min: 0.0, avg: 37.7, max: 82.0) [2024-03-21 03:41:10,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 03:41:15,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 828768256. Throughput: 0: 46091.1. Samples: 829931500. Policy #0 lag: (min: 0.0, avg: 37.7, max: 82.0) [2024-03-21 03:41:15,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 03:41:20,521][03784] Fps is (10 sec: 19661.1, 60 sec: 43144.5, 300 sec: 46208.4). Total num frames: 828899328. Throughput: 0: 45584.5. Samples: 830187900. Policy #0 lag: (min: 0.0, avg: 36.8, max: 80.0) [2024-03-21 03:41:20,522][03784] Avg episode reward: [(0, '1.267')] [2024-03-21 03:41:22,108][04017] Updated weights for policy 0, policy_version 25297 (0.0011) [2024-03-21 03:41:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 46208.4). Total num frames: 829128704. Throughput: 0: 45037.7. Samples: 830443100. Policy #0 lag: (min: 0.0, avg: 36.8, max: 80.0) [2024-03-21 03:41:25,522][03784] Avg episode reward: [(0, '0.995')] [2024-03-21 03:41:26,765][04017] Updated weights for policy 0, policy_version 25307 (0.0011) [2024-03-21 03:41:30,521][03784] Fps is (10 sec: 62259.1, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 829521920. Throughput: 0: 45073.3. Samples: 830558900. Policy #0 lag: (min: 0.0, avg: 36.8, max: 80.0) [2024-03-21 03:41:30,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 03:41:30,972][04017] Updated weights for policy 0, policy_version 25317 (0.0019) [2024-03-21 03:41:35,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46967.5, 300 sec: 46652.7). 
Total num frames: 829685760. Throughput: 0: 45280.1. Samples: 830863900. Policy #0 lag: (min: 0.0, avg: 54.4, max: 90.0) [2024-03-21 03:41:35,522][03784] Avg episode reward: [(0, '0.594')] [2024-03-21 03:41:40,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48059.7, 300 sec: 46541.7). Total num frames: 829882368. Throughput: 0: 45715.6. Samples: 831161000. Policy #0 lag: (min: 0.0, avg: 54.4, max: 90.0) [2024-03-21 03:41:40,522][03784] Avg episode reward: [(0, '0.594')] [2024-03-21 03:41:41,592][04017] Updated weights for policy 0, policy_version 25327 (0.0019) [2024-03-21 03:41:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 830144512. Throughput: 0: 45255.6. Samples: 831298000. Policy #0 lag: (min: 0.0, avg: 54.4, max: 90.0) [2024-03-21 03:41:45,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 03:41:46,798][04017] Updated weights for policy 0, policy_version 25337 (0.0015) [2024-03-21 03:41:47,135][03995] Signal inference workers to stop experience collection... (16700 times) [2024-03-21 03:41:47,258][04017] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-03-21 03:41:47,331][03995] Signal inference workers to resume experience collection... (16700 times) [2024-03-21 03:41:47,332][04017] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-03-21 03:41:50,521][03784] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 46430.6). Total num frames: 830406656. Throughput: 0: 46000.1. Samples: 831585800. Policy #0 lag: (min: 0.0, avg: 54.4, max: 90.0) [2024-03-21 03:41:50,522][03784] Avg episode reward: [(0, '0.744')] [2024-03-21 03:41:55,521][03784] Fps is (10 sec: 36044.4, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 830504960. Throughput: 0: 47202.3. Samples: 831904300. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 80.0) [2024-03-21 03:41:55,522][03784] Avg episode reward: [(0, '0.744')] [2024-03-21 03:41:57,409][04017] Updated weights for policy 0, policy_version 25347 (0.0011) [2024-03-21 03:42:00,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 830701568. Throughput: 0: 47353.3. Samples: 832062400. Policy #0 lag: (min: 0.0, avg: 36.6, max: 80.0) [2024-03-21 03:42:00,522][03784] Avg episode reward: [(0, '0.744')] [2024-03-21 03:42:00,866][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025352_830734336.pth... [2024-03-21 03:42:00,929][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025017_819757056.pth [2024-03-21 03:42:03,675][04017] Updated weights for policy 0, policy_version 25357 (0.0010) [2024-03-21 03:42:05,521][03784] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 46319.5). Total num frames: 830996480. Throughput: 0: 47906.6. Samples: 832343700. Policy #0 lag: (min: 0.0, avg: 36.6, max: 80.0) [2024-03-21 03:42:05,522][03784] Avg episode reward: [(0, '0.797')] [2024-03-21 03:42:07,933][04017] Updated weights for policy 0, policy_version 25367 (0.0018) [2024-03-21 03:42:10,521][03784] Fps is (10 sec: 58982.7, 60 sec: 43144.6, 300 sec: 46874.9). Total num frames: 831291392. Throughput: 0: 48237.8. Samples: 832613800. Policy #0 lag: (min: 0.0, avg: 36.6, max: 80.0) [2024-03-21 03:42:10,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 03:42:15,022][04017] Updated weights for policy 0, policy_version 25377 (0.0014) [2024-03-21 03:42:15,521][03784] Fps is (10 sec: 58982.4, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 831586304. Throughput: 0: 48631.1. Samples: 832747300. Policy #0 lag: (min: 0.0, avg: 35.0, max: 70.0) [2024-03-21 03:42:15,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 03:42:20,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.4, 300 sec: 47097.1). 
Total num frames: 831717376. Throughput: 0: 47971.1. Samples: 833022600. Policy #0 lag: (min: 0.0, avg: 35.0, max: 70.0) [2024-03-21 03:42:20,522][03784] Avg episode reward: [(0, '0.832')] [2024-03-21 03:42:22,838][04017] Updated weights for policy 0, policy_version 25387 (0.0017) [2024-03-21 03:42:25,521][03784] Fps is (10 sec: 49152.0, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 832077824. Throughput: 0: 47393.3. Samples: 833293700. Policy #0 lag: (min: 0.0, avg: 35.0, max: 70.0) [2024-03-21 03:42:25,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 03:42:27,568][04017] Updated weights for policy 0, policy_version 25397 (0.0021) [2024-03-21 03:42:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 832274432. Throughput: 0: 47760.0. Samples: 833447200. Policy #0 lag: (min: 0.0, avg: 44.2, max: 112.0) [2024-03-21 03:42:30,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 03:42:33,548][04017] Updated weights for policy 0, policy_version 25407 (0.0011) [2024-03-21 03:42:35,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48606.0, 300 sec: 46986.0). Total num frames: 832602112. Throughput: 0: 47982.3. Samples: 833745000. Policy #0 lag: (min: 0.0, avg: 44.2, max: 112.0) [2024-03-21 03:42:35,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 03:42:38,321][03995] Signal inference workers to stop experience collection... (16750 times) [2024-03-21 03:42:38,373][04017] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-03-21 03:42:38,610][03995] Signal inference workers to resume experience collection... (16750 times) [2024-03-21 03:42:38,610][04017] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-03-21 03:42:40,521][03784] Fps is (10 sec: 52428.5, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 832798720. Throughput: 0: 47391.2. Samples: 834036900. 
Policy #0 lag: (min: 0.0, avg: 44.2, max: 112.0) [2024-03-21 03:42:40,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 03:42:41,697][04017] Updated weights for policy 0, policy_version 25417 (0.0010) [2024-03-21 03:42:45,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 832897024. Throughput: 0: 47202.3. Samples: 834186500. Policy #0 lag: (min: 0.0, avg: 44.2, max: 112.0) [2024-03-21 03:42:45,522][03784] Avg episode reward: [(0, '1.086')] [2024-03-21 03:42:50,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 833159168. Throughput: 0: 47302.3. Samples: 834472300. Policy #0 lag: (min: 2.0, avg: 33.9, max: 76.0) [2024-03-21 03:42:50,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-21 03:42:50,585][04017] Updated weights for policy 0, policy_version 25427 (0.0011) [2024-03-21 03:42:55,098][04017] Updated weights for policy 0, policy_version 25437 (0.0019) [2024-03-21 03:42:55,521][03784] Fps is (10 sec: 65536.1, 60 sec: 50790.5, 300 sec: 46763.8). Total num frames: 833552384. Throughput: 0: 47733.4. Samples: 834761800. Policy #0 lag: (min: 2.0, avg: 33.9, max: 76.0) [2024-03-21 03:42:55,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 03:42:58,716][04017] Updated weights for policy 0, policy_version 25447 (0.0013) [2024-03-21 03:43:00,521][03784] Fps is (10 sec: 72089.9, 60 sec: 52975.1, 300 sec: 47541.4). Total num frames: 833880064. Throughput: 0: 47691.2. Samples: 834893400. Policy #0 lag: (min: 2.0, avg: 33.9, max: 76.0) [2024-03-21 03:43:00,521][03784] Avg episode reward: [(0, '1.332')] [2024-03-21 03:43:05,521][03784] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 47319.2). Total num frames: 834043904. Throughput: 0: 47580.0. Samples: 835163700. 
Policy #0 lag: (min: 2.0, avg: 33.9, max: 76.0) [2024-03-21 03:43:05,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 03:43:09,180][04017] Updated weights for policy 0, policy_version 25457 (0.0014) [2024-03-21 03:43:10,521][03784] Fps is (10 sec: 29490.7, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 834174976. Throughput: 0: 48484.4. Samples: 835475500. Policy #0 lag: (min: 0.0, avg: 44.2, max: 92.0) [2024-03-21 03:43:10,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 03:43:15,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44783.0, 300 sec: 46652.8). Total num frames: 834273280. Throughput: 0: 48493.3. Samples: 835629400. Policy #0 lag: (min: 0.0, avg: 44.2, max: 92.0) [2024-03-21 03:43:15,522][03784] Avg episode reward: [(0, '0.559')] [2024-03-21 03:43:20,521][03784] Fps is (10 sec: 26214.7, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 834437120. Throughput: 0: 48553.3. Samples: 835929900. Policy #0 lag: (min: 0.0, avg: 44.2, max: 92.0) [2024-03-21 03:43:20,521][03784] Avg episode reward: [(0, '0.559')] [2024-03-21 03:43:21,020][04017] Updated weights for policy 0, policy_version 25467 (0.0013) [2024-03-21 03:43:25,014][04017] Updated weights for policy 0, policy_version 25477 (0.0016) [2024-03-21 03:43:25,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46421.3, 300 sec: 47097.1). Total num frames: 834863104. Throughput: 0: 47731.1. Samples: 836184800. Policy #0 lag: (min: 5.0, avg: 34.1, max: 75.0) [2024-03-21 03:43:25,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 03:43:28,274][03995] Signal inference workers to stop experience collection... (16800 times) [2024-03-21 03:43:28,353][04017] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-03-21 03:43:28,496][03995] Signal inference workers to resume experience collection... 
(16800 times) [2024-03-21 03:43:28,496][04017] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-03-21 03:43:29,405][04017] Updated weights for policy 0, policy_version 25487 (0.0012) [2024-03-21 03:43:30,521][03784] Fps is (10 sec: 72088.8, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 835158016. Throughput: 0: 47384.4. Samples: 836318800. Policy #0 lag: (min: 5.0, avg: 34.1, max: 75.0) [2024-03-21 03:43:30,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 03:43:35,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 835321856. Throughput: 0: 47419.9. Samples: 836606200. Policy #0 lag: (min: 5.0, avg: 34.1, max: 75.0) [2024-03-21 03:43:35,522][03784] Avg episode reward: [(0, '0.863')] [2024-03-21 03:43:37,662][04017] Updated weights for policy 0, policy_version 25497 (0.0011) [2024-03-21 03:43:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 835616768. Throughput: 0: 47077.7. Samples: 836880300. Policy #0 lag: (min: 5.0, avg: 34.1, max: 75.0) [2024-03-21 03:43:40,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 03:43:44,832][04017] Updated weights for policy 0, policy_version 25507 (0.0012) [2024-03-21 03:43:45,521][03784] Fps is (10 sec: 55706.0, 60 sec: 49698.1, 300 sec: 47097.1). Total num frames: 835878912. Throughput: 0: 47457.7. Samples: 837029000. Policy #0 lag: (min: 0.0, avg: 44.0, max: 109.0) [2024-03-21 03:43:45,522][03784] Avg episode reward: [(0, '0.958')] [2024-03-21 03:43:49,418][04017] Updated weights for policy 0, policy_version 25517 (0.0014) [2024-03-21 03:43:50,521][03784] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 47208.1). Total num frames: 836141056. Throughput: 0: 47071.2. Samples: 837281900. Policy #0 lag: (min: 0.0, avg: 44.0, max: 109.0) [2024-03-21 03:43:50,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 03:43:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46421.3, 300 sec: 47208.1). 
Total num frames: 836337664. Throughput: 0: 46306.7. Samples: 837559300. Policy #0 lag: (min: 0.0, avg: 44.0, max: 109.0) [2024-03-21 03:43:55,522][03784] Avg episode reward: [(0, '1.608')] [2024-03-21 03:43:59,615][04017] Updated weights for policy 0, policy_version 25527 (0.0025) [2024-03-21 03:44:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 44236.7, 300 sec: 47208.1). Total num frames: 836534272. Throughput: 0: 45902.1. Samples: 837695000. Policy #0 lag: (min: 0.0, avg: 46.0, max: 114.0) [2024-03-21 03:44:00,522][03784] Avg episode reward: [(0, '1.426')] [2024-03-21 03:44:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025529_836534272.pth... [2024-03-21 03:44:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025184_825229312.pth [2024-03-21 03:44:05,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 836763648. Throughput: 0: 45602.1. Samples: 837982000. Policy #0 lag: (min: 0.0, avg: 46.0, max: 114.0) [2024-03-21 03:44:05,522][03784] Avg episode reward: [(0, '1.474')] [2024-03-21 03:44:05,565][04017] Updated weights for policy 0, policy_version 25537 (0.0018) [2024-03-21 03:44:10,522][03784] Fps is (10 sec: 42596.9, 60 sec: 46421.1, 300 sec: 46430.5). Total num frames: 836960256. Throughput: 0: 45926.2. Samples: 838251500. Policy #0 lag: (min: 0.0, avg: 46.0, max: 114.0) [2024-03-21 03:44:10,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 03:44:13,783][04017] Updated weights for policy 0, policy_version 25547 (0.0020) [2024-03-21 03:44:15,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48605.8, 300 sec: 46319.5). Total num frames: 837189632. Throughput: 0: 46082.2. Samples: 838392500. Policy #0 lag: (min: 0.0, avg: 46.0, max: 114.0) [2024-03-21 03:44:15,531][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 03:44:20,521][03784] Fps is (10 sec: 45877.1, 60 sec: 49698.1, 300 sec: 46208.4). Total num frames: 837419008. 
Throughput: 0: 46391.2. Samples: 838693800. Policy #0 lag: (min: 0.0, avg: 27.5, max: 60.0) [2024-03-21 03:44:20,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 03:44:22,661][04017] Updated weights for policy 0, policy_version 25557 (0.0011) [2024-03-21 03:44:24,966][03995] Signal inference workers to stop experience collection... (16850 times) [2024-03-21 03:44:25,044][03995] Signal inference workers to resume experience collection... (16850 times) [2024-03-21 03:44:25,045][04017] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-03-21 03:44:25,090][04017] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-03-21 03:44:25,521][03784] Fps is (10 sec: 52428.9, 60 sec: 47513.5, 300 sec: 46319.5). Total num frames: 837713920. Throughput: 0: 46582.2. Samples: 838976500. Policy #0 lag: (min: 0.0, avg: 27.5, max: 60.0) [2024-03-21 03:44:25,522][03784] Avg episode reward: [(0, '1.291')] [2024-03-21 03:44:26,014][04017] Updated weights for policy 0, policy_version 25567 (0.0011) [2024-03-21 03:44:30,521][03784] Fps is (10 sec: 58982.5, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 838008832. Throughput: 0: 46482.2. Samples: 839120700. Policy #0 lag: (min: 0.0, avg: 27.5, max: 60.0) [2024-03-21 03:44:30,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 03:44:33,442][04017] Updated weights for policy 0, policy_version 25577 (0.0011) [2024-03-21 03:44:35,521][03784] Fps is (10 sec: 52429.1, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 838238208. Throughput: 0: 46891.1. Samples: 839392000. Policy #0 lag: (min: 3.0, avg: 41.5, max: 72.0) [2024-03-21 03:44:35,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 03:44:40,131][04017] Updated weights for policy 0, policy_version 25587 (0.0012) [2024-03-21 03:44:40,521][03784] Fps is (10 sec: 45874.9, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 838467584. Throughput: 0: 45915.5. Samples: 839625500. 
Policy #0 lag: (min: 3.0, avg: 41.5, max: 72.0) [2024-03-21 03:44:40,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 03:44:45,521][03784] Fps is (10 sec: 29491.3, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 838533120. Throughput: 0: 46335.6. Samples: 839780100. Policy #0 lag: (min: 3.0, avg: 41.5, max: 72.0) [2024-03-21 03:44:45,522][03784] Avg episode reward: [(0, '1.337')] [2024-03-21 03:44:50,521][03784] Fps is (10 sec: 22937.6, 60 sec: 42598.3, 300 sec: 46208.4). Total num frames: 838696960. Throughput: 0: 46344.5. Samples: 840067500. Policy #0 lag: (min: 3.0, avg: 41.5, max: 72.0) [2024-03-21 03:44:50,522][03784] Avg episode reward: [(0, '1.511')] [2024-03-21 03:44:51,080][04017] Updated weights for policy 0, policy_version 25597 (0.0014) [2024-03-21 03:44:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 839057408. Throughput: 0: 46516.0. Samples: 840344700. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 03:44:55,522][03784] Avg episode reward: [(0, '1.067')] [2024-03-21 03:44:57,366][04017] Updated weights for policy 0, policy_version 25607 (0.0011) [2024-03-21 03:45:00,521][03784] Fps is (10 sec: 58982.3, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 839286784. Throughput: 0: 46728.9. Samples: 840495300. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 03:45:00,522][03784] Avg episode reward: [(0, '0.446')] [2024-03-21 03:45:02,826][04017] Updated weights for policy 0, policy_version 25617 (0.0011) [2024-03-21 03:45:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46967.6, 300 sec: 46541.7). Total num frames: 839581696. Throughput: 0: 45984.5. Samples: 840763100. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 03:45:05,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 03:45:09,966][04017] Updated weights for policy 0, policy_version 25627 (0.0018) [2024-03-21 03:45:10,521][03784] Fps is (10 sec: 49151.1, 60 sec: 46967.6, 300 sec: 46652.7). 
Total num frames: 839778304. Throughput: 0: 46113.2. Samples: 841051600. Policy #0 lag: (min: 0.0, avg: 40.9, max: 92.0) [2024-03-21 03:45:10,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 03:45:13,495][03995] Signal inference workers to stop experience collection... (16900 times) [2024-03-21 03:45:13,496][03995] Signal inference workers to resume experience collection... (16900 times) [2024-03-21 03:45:13,543][04017] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-03-21 03:45:13,543][04017] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-03-21 03:45:14,142][04017] Updated weights for policy 0, policy_version 25637 (0.0012) [2024-03-21 03:45:15,521][03784] Fps is (10 sec: 49151.4, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 840073216. Throughput: 0: 45746.6. Samples: 841179300. Policy #0 lag: (min: 0.0, avg: 40.9, max: 92.0) [2024-03-21 03:45:15,522][03784] Avg episode reward: [(0, '0.664')] [2024-03-21 03:45:20,521][03784] Fps is (10 sec: 58983.8, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 840368128. Throughput: 0: 45748.9. Samples: 841450700. Policy #0 lag: (min: 0.0, avg: 40.9, max: 92.0) [2024-03-21 03:45:20,522][03784] Avg episode reward: [(0, '0.626')] [2024-03-21 03:45:21,595][04017] Updated weights for policy 0, policy_version 25647 (0.0019) [2024-03-21 03:45:25,521][03784] Fps is (10 sec: 42598.9, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 840499200. Throughput: 0: 47084.5. Samples: 841744300. Policy #0 lag: (min: 0.0, avg: 40.9, max: 92.0) [2024-03-21 03:45:25,522][03784] Avg episode reward: [(0, '0.629')] [2024-03-21 03:45:30,521][03784] Fps is (10 sec: 22937.5, 60 sec: 43144.5, 300 sec: 46541.7). Total num frames: 840597504. Throughput: 0: 47193.3. Samples: 841903800. 
Policy #0 lag: (min: 0.0, avg: 34.6, max: 66.0) [2024-03-21 03:45:30,522][03784] Avg episode reward: [(0, '0.629')] [2024-03-21 03:45:33,090][04017] Updated weights for policy 0, policy_version 25657 (0.0011) [2024-03-21 03:45:35,521][03784] Fps is (10 sec: 22937.9, 60 sec: 41506.3, 300 sec: 46541.7). Total num frames: 840728576. Throughput: 0: 47633.6. Samples: 842211000. Policy #0 lag: (min: 0.0, avg: 34.6, max: 66.0) [2024-03-21 03:45:35,521][03784] Avg episode reward: [(0, '1.414')] [2024-03-21 03:45:40,204][04017] Updated weights for policy 0, policy_version 25667 (0.0009) [2024-03-21 03:45:40,521][03784] Fps is (10 sec: 49152.5, 60 sec: 43690.8, 300 sec: 46652.8). Total num frames: 841089024. Throughput: 0: 47686.7. Samples: 842490600. Policy #0 lag: (min: 0.0, avg: 34.6, max: 66.0) [2024-03-21 03:45:40,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 03:45:44,995][04017] Updated weights for policy 0, policy_version 25677 (0.0020) [2024-03-21 03:45:45,521][03784] Fps is (10 sec: 68811.7, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 841416704. Throughput: 0: 47562.3. Samples: 842635600. Policy #0 lag: (min: 2.0, avg: 39.2, max: 77.0) [2024-03-21 03:45:45,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 03:45:50,144][04017] Updated weights for policy 0, policy_version 25687 (0.0022) [2024-03-21 03:45:50,521][03784] Fps is (10 sec: 62258.7, 60 sec: 50244.3, 300 sec: 47319.2). Total num frames: 841711616. Throughput: 0: 47564.4. Samples: 842903500. Policy #0 lag: (min: 2.0, avg: 39.2, max: 77.0) [2024-03-21 03:45:50,522][03784] Avg episode reward: [(0, '0.598')] [2024-03-21 03:45:55,521][03784] Fps is (10 sec: 49152.1, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 841908224. Throughput: 0: 47540.3. Samples: 843190900. 
Policy #0 lag: (min: 2.0, avg: 39.2, max: 77.0) [2024-03-21 03:45:55,522][03784] Avg episode reward: [(0, '1.294')] [2024-03-21 03:45:57,384][04017] Updated weights for policy 0, policy_version 25697 (0.0011) [2024-03-21 03:46:00,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 46874.9). Total num frames: 842235904. Throughput: 0: 47466.8. Samples: 843315300. Policy #0 lag: (min: 2.0, avg: 39.2, max: 77.0) [2024-03-21 03:46:00,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 03:46:00,656][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025704_842268672.pth... [2024-03-21 03:46:00,779][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025352_830734336.pth [2024-03-21 03:46:02,337][04017] Updated weights for policy 0, policy_version 25707 (0.0015) [2024-03-21 03:46:02,379][03995] Signal inference workers to stop experience collection... (16950 times) [2024-03-21 03:46:02,380][03995] Signal inference workers to resume experience collection... (16950 times) [2024-03-21 03:46:02,442][04017] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-03-21 03:46:02,442][04017] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-03-21 03:46:05,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48059.7, 300 sec: 46652.8). Total num frames: 842465280. Throughput: 0: 47544.5. Samples: 843590200. Policy #0 lag: (min: 0.0, avg: 43.4, max: 115.0) [2024-03-21 03:46:05,522][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 03:46:09,489][04017] Updated weights for policy 0, policy_version 25717 (0.0011) [2024-03-21 03:46:10,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48606.0, 300 sec: 47208.1). Total num frames: 842694656. Throughput: 0: 46824.4. Samples: 843851400. 
Policy #0 lag: (min: 0.0, avg: 43.4, max: 115.0) [2024-03-21 03:46:10,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 03:46:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.3, 300 sec: 47208.1). Total num frames: 842825728. Throughput: 0: 47077.9. Samples: 844022300. Policy #0 lag: (min: 0.0, avg: 43.4, max: 115.0) [2024-03-21 03:46:15,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 03:46:19,783][04017] Updated weights for policy 0, policy_version 25727 (0.0012) [2024-03-21 03:46:20,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44782.9, 300 sec: 47208.1). Total num frames: 843055104. Throughput: 0: 46933.1. Samples: 844323000. Policy #0 lag: (min: 0.0, avg: 44.6, max: 117.0) [2024-03-21 03:46:20,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 03:46:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.4, 300 sec: 46652.8). Total num frames: 843284480. Throughput: 0: 47191.1. Samples: 844614200. Policy #0 lag: (min: 0.0, avg: 44.6, max: 117.0) [2024-03-21 03:46:25,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 03:46:26,354][04017] Updated weights for policy 0, policy_version 25737 (0.0019) [2024-03-21 03:46:30,521][03784] Fps is (10 sec: 42598.9, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 843481088. Throughput: 0: 47037.8. Samples: 844752300. Policy #0 lag: (min: 0.0, avg: 44.6, max: 117.0) [2024-03-21 03:46:30,522][03784] Avg episode reward: [(0, '1.030')] [2024-03-21 03:46:34,357][04017] Updated weights for policy 0, policy_version 25747 (0.0015) [2024-03-21 03:46:35,521][03784] Fps is (10 sec: 42598.2, 60 sec: 49698.0, 300 sec: 46874.9). Total num frames: 843710464. Throughput: 0: 47331.1. Samples: 845033400. Policy #0 lag: (min: 0.0, avg: 44.6, max: 117.0) [2024-03-21 03:46:35,522][03784] Avg episode reward: [(0, '1.298')] [2024-03-21 03:46:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 843939840. Throughput: 0: 47166.7. Samples: 845313400. 
Policy #0 lag: (min: 0.0, avg: 39.6, max: 88.0) [2024-03-21 03:46:40,522][03784] Avg episode reward: [(0, '1.116')] [2024-03-21 03:46:41,051][04017] Updated weights for policy 0, policy_version 25757 (0.0011) [2024-03-21 03:46:45,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 844267520. Throughput: 0: 47386.7. Samples: 845447700. Policy #0 lag: (min: 0.0, avg: 39.6, max: 88.0) [2024-03-21 03:46:45,522][03784] Avg episode reward: [(0, '1.333')] [2024-03-21 03:46:45,963][04017] Updated weights for policy 0, policy_version 25767 (0.0012) [2024-03-21 03:46:50,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 844431360. Throughput: 0: 47839.9. Samples: 845743000. Policy #0 lag: (min: 0.0, avg: 39.6, max: 88.0) [2024-03-21 03:46:50,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 03:46:55,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 844627968. Throughput: 0: 48115.6. Samples: 846016600. Policy #0 lag: (min: 0.0, avg: 39.6, max: 88.0) [2024-03-21 03:46:55,522][03784] Avg episode reward: [(0, '1.093')] [2024-03-21 03:46:56,768][04017] Updated weights for policy 0, policy_version 25777 (0.0011) [2024-03-21 03:47:00,521][03784] Fps is (10 sec: 49152.3, 60 sec: 44783.0, 300 sec: 47208.1). Total num frames: 844922880. Throughput: 0: 47426.6. Samples: 846156500. Policy #0 lag: (min: 0.0, avg: 32.6, max: 67.0) [2024-03-21 03:47:00,522][03784] Avg episode reward: [(0, '0.860')] [2024-03-21 03:47:01,060][04017] Updated weights for policy 0, policy_version 25787 (0.0015) [2024-03-21 03:47:02,740][03995] Signal inference workers to stop experience collection... (17000 times) [2024-03-21 03:47:02,815][04017] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-03-21 03:47:03,001][03995] Signal inference workers to resume experience collection... 
(17000 times) [2024-03-21 03:47:03,001][04017] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-03-21 03:47:05,521][03784] Fps is (10 sec: 62259.5, 60 sec: 46421.3, 300 sec: 47319.2). Total num frames: 845250560. Throughput: 0: 46802.3. Samples: 846429100. Policy #0 lag: (min: 0.0, avg: 32.6, max: 67.0) [2024-03-21 03:47:05,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 03:47:09,642][04017] Updated weights for policy 0, policy_version 25797 (0.0016) [2024-03-21 03:47:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44236.9, 300 sec: 46652.7). Total num frames: 845348864. Throughput: 0: 47073.3. Samples: 846732500. Policy #0 lag: (min: 0.0, avg: 32.6, max: 67.0) [2024-03-21 03:47:10,522][03784] Avg episode reward: [(0, '0.980')] [2024-03-21 03:47:14,922][04017] Updated weights for policy 0, policy_version 25807 (0.0013) [2024-03-21 03:47:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 845676544. Throughput: 0: 46937.7. Samples: 846864500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 77.0) [2024-03-21 03:47:15,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 03:47:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 845807616. Throughput: 0: 46715.5. Samples: 847135600. Policy #0 lag: (min: 0.0, avg: 34.7, max: 77.0) [2024-03-21 03:47:20,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 03:47:24,366][04017] Updated weights for policy 0, policy_version 25817 (0.0011) [2024-03-21 03:47:25,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45329.0, 300 sec: 46541.7). Total num frames: 846004224. Throughput: 0: 46735.5. Samples: 847416500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 77.0) [2024-03-21 03:47:25,522][03784] Avg episode reward: [(0, '1.165')] [2024-03-21 03:47:29,643][04017] Updated weights for policy 0, policy_version 25827 (0.0012) [2024-03-21 03:47:30,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48059.7, 300 sec: 46652.7). 
Total num frames: 846364672. Throughput: 0: 46951.1. Samples: 847560500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 77.0) [2024-03-21 03:47:30,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 03:47:34,073][04017] Updated weights for policy 0, policy_version 25837 (0.0011) [2024-03-21 03:47:35,521][03784] Fps is (10 sec: 68813.4, 60 sec: 49698.2, 300 sec: 47097.1). Total num frames: 846692352. Throughput: 0: 46251.2. Samples: 847824300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 83.0) [2024-03-21 03:47:35,522][03784] Avg episode reward: [(0, '0.737')] [2024-03-21 03:47:40,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 846888960. Throughput: 0: 46482.3. Samples: 848108300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 83.0) [2024-03-21 03:47:40,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 03:47:41,955][04017] Updated weights for policy 0, policy_version 25847 (0.0015) [2024-03-21 03:47:45,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46421.3, 300 sec: 47097.0). Total num frames: 847052800. Throughput: 0: 46535.5. Samples: 848250600. Policy #0 lag: (min: 0.0, avg: 40.4, max: 83.0) [2024-03-21 03:47:45,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 03:47:50,521][03784] Fps is (10 sec: 19660.8, 60 sec: 44236.9, 300 sec: 45875.2). Total num frames: 847085568. Throughput: 0: 47311.1. Samples: 848558100. Policy #0 lag: (min: 0.0, avg: 40.4, max: 83.0) [2024-03-21 03:47:50,522][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 03:47:52,945][04017] Updated weights for policy 0, policy_version 25857 (0.0010) [2024-03-21 03:47:55,521][03784] Fps is (10 sec: 32768.4, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 847380480. Throughput: 0: 46855.6. Samples: 848841000. Policy #0 lag: (min: 0.0, avg: 32.0, max: 76.0) [2024-03-21 03:47:55,522][03784] Avg episode reward: [(0, '1.720')] [2024-03-21 03:47:55,522][03995] Saving new best policy, reward=1.720! 
[2024-03-21 03:47:58,018][04017] Updated weights for policy 0, policy_version 25867 (0.0014) [2024-03-21 03:48:00,521][03784] Fps is (10 sec: 62258.3, 60 sec: 46421.2, 300 sec: 46319.5). Total num frames: 847708160. Throughput: 0: 46744.3. Samples: 848968000. Policy #0 lag: (min: 0.0, avg: 32.0, max: 76.0) [2024-03-21 03:48:00,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 03:48:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025870_847708160.pth... [2024-03-21 03:48:00,644][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025529_836534272.pth [2024-03-21 03:48:02,134][03995] Signal inference workers to stop experience collection... (17050 times) [2024-03-21 03:48:02,186][04017] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-03-21 03:48:02,416][03995] Signal inference workers to resume experience collection... (17050 times) [2024-03-21 03:48:02,417][04017] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-03-21 03:48:04,935][04017] Updated weights for policy 0, policy_version 25877 (0.0010) [2024-03-21 03:48:05,521][03784] Fps is (10 sec: 58981.9, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 847970304. Throughput: 0: 47213.3. Samples: 849260200. Policy #0 lag: (min: 0.0, avg: 32.0, max: 76.0) [2024-03-21 03:48:05,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 03:48:10,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 848166912. Throughput: 0: 47364.5. Samples: 849547900. Policy #0 lag: (min: 0.0, avg: 29.9, max: 70.0) [2024-03-21 03:48:10,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 03:48:13,961][04017] Updated weights for policy 0, policy_version 25887 (0.0012) [2024-03-21 03:48:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 47097.0). Total num frames: 848330752. Throughput: 0: 47237.7. Samples: 849686200. 
Policy #0 lag: (min: 0.0, avg: 29.9, max: 70.0) [2024-03-21 03:48:15,522][03784] Avg episode reward: [(0, '0.826')] [2024-03-21 03:48:20,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44783.0, 300 sec: 46208.4). Total num frames: 848494592. Throughput: 0: 47415.6. Samples: 849958000. Policy #0 lag: (min: 0.0, avg: 29.9, max: 70.0) [2024-03-21 03:48:20,522][03784] Avg episode reward: [(0, '1.057')] [2024-03-21 03:48:21,243][04017] Updated weights for policy 0, policy_version 25897 (0.0018) [2024-03-21 03:48:24,389][04017] Updated weights for policy 0, policy_version 25907 (0.0015) [2024-03-21 03:48:25,521][03784] Fps is (10 sec: 68812.9, 60 sec: 50244.3, 300 sec: 46986.0). Total num frames: 849018880. Throughput: 0: 46762.2. Samples: 850212600. Policy #0 lag: (min: 0.0, avg: 29.9, max: 70.0) [2024-03-21 03:48:25,522][03784] Avg episode reward: [(0, '0.952')] [2024-03-21 03:48:30,027][04017] Updated weights for policy 0, policy_version 25917 (0.0012) [2024-03-21 03:48:30,521][03784] Fps is (10 sec: 78642.9, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 849281024. Throughput: 0: 46906.7. Samples: 850361400. Policy #0 lag: (min: 0.0, avg: 47.9, max: 119.0) [2024-03-21 03:48:30,522][03784] Avg episode reward: [(0, '0.952')] [2024-03-21 03:48:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44782.8, 300 sec: 46652.7). Total num frames: 849379328. Throughput: 0: 46555.4. Samples: 850653100. Policy #0 lag: (min: 0.0, avg: 47.9, max: 119.0) [2024-03-21 03:48:35,522][03784] Avg episode reward: [(0, '0.949')] [2024-03-21 03:48:40,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44236.7, 300 sec: 46319.5). Total num frames: 849543168. Throughput: 0: 46991.0. Samples: 850955600. Policy #0 lag: (min: 0.0, avg: 47.9, max: 119.0) [2024-03-21 03:48:40,522][03784] Avg episode reward: [(0, '0.920')] [2024-03-21 03:48:40,869][04017] Updated weights for policy 0, policy_version 25927 (0.0019) [2024-03-21 03:48:45,521][03784] Fps is (10 sec: 36045.0, 60 sec: 44782.9, 300 sec: 46097.3). 
Total num frames: 849739776. Throughput: 0: 47266.8. Samples: 851095000. Policy #0 lag: (min: 0.0, avg: 47.9, max: 119.0) [2024-03-21 03:48:45,523][03784] Avg episode reward: [(0, '1.206')] [2024-03-21 03:48:47,734][04017] Updated weights for policy 0, policy_version 25937 (0.0016) [2024-03-21 03:48:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49698.0, 300 sec: 46541.7). Total num frames: 850067456. Throughput: 0: 47126.6. Samples: 851380900. Policy #0 lag: (min: 0.0, avg: 31.0, max: 77.0) [2024-03-21 03:48:50,522][03784] Avg episode reward: [(0, '0.626')] [2024-03-21 03:48:51,496][03995] Signal inference workers to stop experience collection... (17100 times) [2024-03-21 03:48:51,497][03995] Signal inference workers to resume experience collection... (17100 times) [2024-03-21 03:48:51,568][04017] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-03-21 03:48:51,568][04017] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-03-21 03:48:55,325][04017] Updated weights for policy 0, policy_version 25947 (0.0016) [2024-03-21 03:48:55,521][03784] Fps is (10 sec: 49151.6, 60 sec: 47513.5, 300 sec: 46430.6). Total num frames: 850231296. Throughput: 0: 47624.4. Samples: 851691000. Policy #0 lag: (min: 0.0, avg: 31.0, max: 77.0) [2024-03-21 03:48:55,523][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 03:49:00,521][03784] Fps is (10 sec: 39321.9, 60 sec: 45875.3, 300 sec: 46430.6). Total num frames: 850460672. Throughput: 0: 47631.2. Samples: 851829600. Policy #0 lag: (min: 0.0, avg: 31.0, max: 77.0) [2024-03-21 03:49:00,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 03:49:02,327][04017] Updated weights for policy 0, policy_version 25957 (0.0015) [2024-03-21 03:49:05,521][03784] Fps is (10 sec: 49152.4, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 850722816. Throughput: 0: 47659.9. Samples: 852102700. 
Policy #0 lag: (min: 2.0, avg: 27.9, max: 67.0) [2024-03-21 03:49:05,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 03:49:08,871][04017] Updated weights for policy 0, policy_version 25967 (0.0010) [2024-03-21 03:49:10,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.3, 300 sec: 46652.8). Total num frames: 850952192. Throughput: 0: 47702.3. Samples: 852359200. Policy #0 lag: (min: 2.0, avg: 27.9, max: 67.0) [2024-03-21 03:49:10,522][03784] Avg episode reward: [(0, '0.722')] [2024-03-21 03:49:15,135][04017] Updated weights for policy 0, policy_version 25977 (0.0017) [2024-03-21 03:49:15,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 851247104. Throughput: 0: 47368.9. Samples: 852493000. Policy #0 lag: (min: 2.0, avg: 27.9, max: 67.0) [2024-03-21 03:49:15,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 03:49:18,677][04017] Updated weights for policy 0, policy_version 25987 (0.0015) [2024-03-21 03:49:20,521][03784] Fps is (10 sec: 58982.6, 60 sec: 50790.4, 300 sec: 46874.9). Total num frames: 851542016. Throughput: 0: 46440.1. Samples: 852742900. Policy #0 lag: (min: 2.0, avg: 27.9, max: 67.0) [2024-03-21 03:49:20,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 03:49:25,521][03784] Fps is (10 sec: 49152.3, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 851738624. Throughput: 0: 46782.3. Samples: 853060800. Policy #0 lag: (min: 0.0, avg: 33.7, max: 68.0) [2024-03-21 03:49:25,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 03:49:27,277][04017] Updated weights for policy 0, policy_version 25997 (0.0012) [2024-03-21 03:49:30,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 852066304. Throughput: 0: 47071.2. Samples: 853213200. Policy #0 lag: (min: 0.0, avg: 33.7, max: 68.0) [2024-03-21 03:49:30,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 03:49:35,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 46319.5). 
Total num frames: 852131840. Throughput: 0: 47033.4. Samples: 853497400. Policy #0 lag: (min: 0.0, avg: 33.7, max: 68.0) [2024-03-21 03:49:35,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 03:49:36,213][04017] Updated weights for policy 0, policy_version 26007 (0.0011) [2024-03-21 03:49:40,521][03784] Fps is (10 sec: 29490.8, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 852361216. Throughput: 0: 46304.4. Samples: 853774700. Policy #0 lag: (min: 0.0, avg: 33.7, max: 68.0) [2024-03-21 03:49:40,522][03784] Avg episode reward: [(0, '0.922')] [2024-03-21 03:49:42,725][04017] Updated weights for policy 0, policy_version 26017 (0.0009) [2024-03-21 03:49:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 852590592. Throughput: 0: 46137.7. Samples: 853905800. Policy #0 lag: (min: 0.0, avg: 52.3, max: 109.0) [2024-03-21 03:49:45,522][03784] Avg episode reward: [(0, '0.524')] [2024-03-21 03:49:45,592][03995] Signal inference workers to stop experience collection... (17150 times) [2024-03-21 03:49:45,592][03995] Signal inference workers to resume experience collection... (17150 times) [2024-03-21 03:49:45,656][04017] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-03-21 03:49:45,657][04017] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-03-21 03:49:49,278][04017] Updated weights for policy 0, policy_version 26027 (0.0011) [2024-03-21 03:49:50,521][03784] Fps is (10 sec: 52429.1, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 852885504. Throughput: 0: 46222.2. Samples: 854182700. Policy #0 lag: (min: 0.0, avg: 52.3, max: 109.0) [2024-03-21 03:49:50,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 03:49:55,521][03784] Fps is (10 sec: 52427.9, 60 sec: 48059.6, 300 sec: 46874.9). Total num frames: 853114880. Throughput: 0: 46933.1. Samples: 854471200. 
Policy #0 lag: (min: 0.0, avg: 52.3, max: 109.0) [2024-03-21 03:49:55,523][03784] Avg episode reward: [(0, '0.464')] [2024-03-21 03:49:56,146][04017] Updated weights for policy 0, policy_version 26037 (0.0011) [2024-03-21 03:50:00,521][03784] Fps is (10 sec: 55705.9, 60 sec: 49698.1, 300 sec: 46986.0). Total num frames: 853442560. Throughput: 0: 47033.3. Samples: 854609500. Policy #0 lag: (min: 2.0, avg: 48.4, max: 95.0) [2024-03-21 03:50:00,522][03784] Avg episode reward: [(0, '1.036')] [2024-03-21 03:50:00,752][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026046_853475328.pth... [2024-03-21 03:50:00,871][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025704_842268672.pth [2024-03-21 03:50:01,764][04017] Updated weights for policy 0, policy_version 26047 (0.0015) [2024-03-21 03:50:05,521][03784] Fps is (10 sec: 52429.8, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 853639168. Throughput: 0: 47497.7. Samples: 854880300. Policy #0 lag: (min: 2.0, avg: 48.4, max: 95.0) [2024-03-21 03:50:05,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 03:50:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 853803008. Throughput: 0: 46811.1. Samples: 855167300. Policy #0 lag: (min: 2.0, avg: 48.4, max: 95.0) [2024-03-21 03:50:10,522][03784] Avg episode reward: [(0, '0.911')] [2024-03-21 03:50:12,803][04017] Updated weights for policy 0, policy_version 26057 (0.0014) [2024-03-21 03:50:15,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 853901312. Throughput: 0: 46868.9. Samples: 855322300. Policy #0 lag: (min: 2.0, avg: 48.4, max: 95.0) [2024-03-21 03:50:15,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 03:50:20,521][03784] Fps is (10 sec: 32767.4, 60 sec: 43144.4, 300 sec: 46208.4). Total num frames: 854130688. Throughput: 0: 46895.4. Samples: 855607700. 
Policy #0 lag: (min: 0.0, avg: 33.5, max: 67.0) [2024-03-21 03:50:20,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 03:50:20,754][04017] Updated weights for policy 0, policy_version 26067 (0.0012) [2024-03-21 03:50:25,521][03784] Fps is (10 sec: 49151.4, 60 sec: 44236.7, 300 sec: 46763.8). Total num frames: 854392832. Throughput: 0: 46835.6. Samples: 855882300. Policy #0 lag: (min: 0.0, avg: 33.5, max: 67.0) [2024-03-21 03:50:25,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 03:50:26,523][04017] Updated weights for policy 0, policy_version 26077 (0.0027) [2024-03-21 03:50:30,521][03784] Fps is (10 sec: 65537.0, 60 sec: 45329.1, 300 sec: 47652.4). Total num frames: 854786048. Throughput: 0: 46691.2. Samples: 856006900. Policy #0 lag: (min: 0.0, avg: 33.5, max: 67.0) [2024-03-21 03:50:30,522][03784] Avg episode reward: [(0, '1.456')] [2024-03-21 03:50:30,595][04017] Updated weights for policy 0, policy_version 26087 (0.0020) [2024-03-21 03:50:35,521][03784] Fps is (10 sec: 62260.4, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 855015424. Throughput: 0: 46524.6. Samples: 856276300. Policy #0 lag: (min: 0.0, avg: 54.7, max: 106.0) [2024-03-21 03:50:35,522][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 03:50:39,883][04017] Updated weights for policy 0, policy_version 26097 (0.0009) [2024-03-21 03:50:40,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 855179264. Throughput: 0: 46246.8. Samples: 856552300. Policy #0 lag: (min: 0.0, avg: 54.7, max: 106.0) [2024-03-21 03:50:40,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 03:50:41,564][03995] Signal inference workers to stop experience collection... (17200 times) [2024-03-21 03:50:41,641][03995] Signal inference workers to resume experience collection... 
(17200 times) [2024-03-21 03:50:41,646][04017] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-03-21 03:50:41,780][04017] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-03-21 03:50:43,795][04017] Updated weights for policy 0, policy_version 26107 (0.0014) [2024-03-21 03:50:45,521][03784] Fps is (10 sec: 55705.7, 60 sec: 49698.2, 300 sec: 46986.0). Total num frames: 855572480. Throughput: 0: 45886.8. Samples: 856674400. Policy #0 lag: (min: 0.0, avg: 54.7, max: 106.0) [2024-03-21 03:50:45,522][03784] Avg episode reward: [(0, '0.704')] [2024-03-21 03:50:50,011][04017] Updated weights for policy 0, policy_version 26117 (0.0015) [2024-03-21 03:50:50,521][03784] Fps is (10 sec: 62259.8, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 855801856. Throughput: 0: 46317.8. Samples: 856964600. Policy #0 lag: (min: 0.0, avg: 54.7, max: 106.0) [2024-03-21 03:50:50,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 03:50:55,521][03784] Fps is (10 sec: 36044.3, 60 sec: 46967.6, 300 sec: 46430.6). Total num frames: 855932928. Throughput: 0: 46468.8. Samples: 857258400. Policy #0 lag: (min: 0.0, avg: 40.7, max: 75.0) [2024-03-21 03:50:55,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 03:50:58,060][04017] Updated weights for policy 0, policy_version 26127 (0.0014) [2024-03-21 03:51:00,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 856195072. Throughput: 0: 46151.1. Samples: 857399100. Policy #0 lag: (min: 0.0, avg: 40.7, max: 75.0) [2024-03-21 03:51:00,522][03784] Avg episode reward: [(0, '0.925')] [2024-03-21 03:51:05,369][04017] Updated weights for policy 0, policy_version 26137 (0.0015) [2024-03-21 03:51:05,521][03784] Fps is (10 sec: 52429.1, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 856457216. Throughput: 0: 46633.5. Samples: 857706200. 
Policy #0 lag: (min: 0.0, avg: 40.7, max: 75.0) [2024-03-21 03:51:05,522][03784] Avg episode reward: [(0, '0.941')] [2024-03-21 03:51:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 856555520. Throughput: 0: 47197.9. Samples: 858006200. Policy #0 lag: (min: 0.0, avg: 27.6, max: 74.0) [2024-03-21 03:51:10,522][03784] Avg episode reward: [(0, '0.941')] [2024-03-21 03:51:15,093][04017] Updated weights for policy 0, policy_version 26147 (0.0010) [2024-03-21 03:51:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 856817664. Throughput: 0: 47788.9. Samples: 858157400. Policy #0 lag: (min: 0.0, avg: 27.6, max: 74.0) [2024-03-21 03:51:15,522][03784] Avg episode reward: [(0, '0.808')] [2024-03-21 03:51:20,521][03784] Fps is (10 sec: 52428.1, 60 sec: 49152.0, 300 sec: 46763.8). Total num frames: 857079808. Throughput: 0: 47875.3. Samples: 858430700. Policy #0 lag: (min: 0.0, avg: 27.6, max: 74.0) [2024-03-21 03:51:20,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 03:51:20,948][04017] Updated weights for policy 0, policy_version 26157 (0.0021) [2024-03-21 03:51:25,521][03784] Fps is (10 sec: 49151.4, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 857309184. Throughput: 0: 47935.5. Samples: 858709400. Policy #0 lag: (min: 0.0, avg: 27.6, max: 74.0) [2024-03-21 03:51:25,523][03784] Avg episode reward: [(0, '0.864')] [2024-03-21 03:51:27,093][04017] Updated weights for policy 0, policy_version 26167 (0.0012) [2024-03-21 03:51:30,521][03784] Fps is (10 sec: 49152.5, 60 sec: 46421.3, 300 sec: 46986.0). Total num frames: 857571328. Throughput: 0: 48028.8. Samples: 858835700. Policy #0 lag: (min: 1.0, avg: 33.1, max: 66.0) [2024-03-21 03:51:30,522][03784] Avg episode reward: [(0, '1.657')] [2024-03-21 03:51:35,521][03784] Fps is (10 sec: 42599.0, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 857735168. Throughput: 0: 47937.8. Samples: 859121800. 
Policy #0 lag: (min: 1.0, avg: 33.1, max: 66.0) [2024-03-21 03:51:35,522][03784] Avg episode reward: [(0, '0.906')] [2024-03-21 03:51:36,064][04017] Updated weights for policy 0, policy_version 26177 (0.0010) [2024-03-21 03:51:39,358][03995] Signal inference workers to stop experience collection... (17250 times) [2024-03-21 03:51:39,439][03995] Signal inference workers to resume experience collection... (17250 times) [2024-03-21 03:51:39,444][04017] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-03-21 03:51:39,500][04017] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-03-21 03:51:40,491][04017] Updated weights for policy 0, policy_version 26187 (0.0018) [2024-03-21 03:51:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 858095616. Throughput: 0: 47995.6. Samples: 859418200. Policy #0 lag: (min: 1.0, avg: 33.1, max: 66.0) [2024-03-21 03:51:40,522][03784] Avg episode reward: [(0, '0.554')] [2024-03-21 03:51:45,521][03784] Fps is (10 sec: 65536.0, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 858390528. Throughput: 0: 47837.8. Samples: 859551800. Policy #0 lag: (min: 2.0, avg: 57.0, max: 106.0) [2024-03-21 03:51:45,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 03:51:45,675][04017] Updated weights for policy 0, policy_version 26197 (0.0014) [2024-03-21 03:51:50,521][03784] Fps is (10 sec: 62259.5, 60 sec: 48605.9, 300 sec: 47763.5). Total num frames: 858718208. Throughput: 0: 47271.1. Samples: 859833400. Policy #0 lag: (min: 2.0, avg: 57.0, max: 106.0) [2024-03-21 03:51:50,522][03784] Avg episode reward: [(0, '1.327')] [2024-03-21 03:51:52,608][04017] Updated weights for policy 0, policy_version 26207 (0.0014) [2024-03-21 03:51:55,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 858750976. Throughput: 0: 46753.3. Samples: 860110100. 
Policy #0 lag: (min: 2.0, avg: 57.0, max: 106.0) [2024-03-21 03:51:55,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 03:52:00,218][04017] Updated weights for policy 0, policy_version 26217 (0.0027) [2024-03-21 03:52:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 859111424. Throughput: 0: 46364.4. Samples: 860243800. Policy #0 lag: (min: 2.0, avg: 57.0, max: 106.0) [2024-03-21 03:52:00,522][03784] Avg episode reward: [(0, '0.924')] [2024-03-21 03:52:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026218_859111424.pth... [2024-03-21 03:52:00,654][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000025870_847708160.pth [2024-03-21 03:52:05,521][03784] Fps is (10 sec: 45875.7, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 859209728. Throughput: 0: 46671.3. Samples: 860530900. Policy #0 lag: (min: 0.0, avg: 37.5, max: 79.0) [2024-03-21 03:52:05,522][03784] Avg episode reward: [(0, '1.333')] [2024-03-21 03:52:08,777][04017] Updated weights for policy 0, policy_version 26227 (0.0010) [2024-03-21 03:52:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.8, 300 sec: 46763.8). Total num frames: 859471872. Throughput: 0: 47333.4. Samples: 860839400. Policy #0 lag: (min: 0.0, avg: 37.5, max: 79.0) [2024-03-21 03:52:10,522][03784] Avg episode reward: [(0, '1.046')] [2024-03-21 03:52:15,521][03784] Fps is (10 sec: 45874.9, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 859668480. Throughput: 0: 47680.1. Samples: 860981300. Policy #0 lag: (min: 0.0, avg: 37.5, max: 79.0) [2024-03-21 03:52:15,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 03:52:17,691][04017] Updated weights for policy 0, policy_version 26237 (0.0019) [2024-03-21 03:52:20,521][03784] Fps is (10 sec: 49151.6, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 859963392. Throughput: 0: 47693.2. Samples: 861268000. 
Policy #0 lag: (min: 0.0, avg: 31.1, max: 69.0) [2024-03-21 03:52:20,522][03784] Avg episode reward: [(0, '1.524')] [2024-03-21 03:52:21,871][04017] Updated weights for policy 0, policy_version 26247 (0.0015) [2024-03-21 03:52:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.5, 300 sec: 46541.7). Total num frames: 860094464. Throughput: 0: 47562.3. Samples: 861558500. Policy #0 lag: (min: 0.0, avg: 31.1, max: 69.0) [2024-03-21 03:52:25,522][03784] Avg episode reward: [(0, '0.645')] [2024-03-21 03:52:29,486][04017] Updated weights for policy 0, policy_version 26257 (0.0018) [2024-03-21 03:52:30,521][03784] Fps is (10 sec: 42598.9, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 860389376. Throughput: 0: 47635.5. Samples: 861695400. Policy #0 lag: (min: 0.0, avg: 31.1, max: 69.0) [2024-03-21 03:52:30,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 03:52:34,842][03995] Signal inference workers to stop experience collection... (17300 times) [2024-03-21 03:52:34,842][03995] Signal inference workers to resume experience collection... (17300 times) [2024-03-21 03:52:34,911][04017] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-03-21 03:52:34,911][04017] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-03-21 03:52:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 860553216. Throughput: 0: 48146.6. Samples: 862000000. Policy #0 lag: (min: 0.0, avg: 31.1, max: 69.0) [2024-03-21 03:52:35,522][03784] Avg episode reward: [(0, '0.691')] [2024-03-21 03:52:38,469][04017] Updated weights for policy 0, policy_version 26267 (0.0019) [2024-03-21 03:52:40,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 860848128. Throughput: 0: 48142.2. Samples: 862276500. 
Policy #0 lag: (min: 0.0, avg: 39.5, max: 87.0) [2024-03-21 03:52:40,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 03:52:42,332][04017] Updated weights for policy 0, policy_version 26277 (0.0011) [2024-03-21 03:52:45,521][03784] Fps is (10 sec: 65535.9, 60 sec: 46967.4, 300 sec: 47874.6). Total num frames: 861208576. Throughput: 0: 47537.8. Samples: 862383000. Policy #0 lag: (min: 0.0, avg: 39.5, max: 87.0) [2024-03-21 03:52:45,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 03:52:48,428][04017] Updated weights for policy 0, policy_version 26287 (0.0010) [2024-03-21 03:52:50,521][03784] Fps is (10 sec: 58982.7, 60 sec: 45329.1, 300 sec: 47652.4). Total num frames: 861437952. Throughput: 0: 47506.6. Samples: 862668700. Policy #0 lag: (min: 0.0, avg: 39.5, max: 87.0) [2024-03-21 03:52:50,522][03784] Avg episode reward: [(0, '0.715')] [2024-03-21 03:52:54,861][04017] Updated weights for policy 0, policy_version 26297 (0.0020) [2024-03-21 03:52:55,521][03784] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 47541.4). Total num frames: 861732864. Throughput: 0: 47184.6. Samples: 862962700. Policy #0 lag: (min: 0.0, avg: 39.5, max: 87.0) [2024-03-21 03:52:55,521][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 03:53:00,521][03784] Fps is (10 sec: 49151.4, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 861929472. Throughput: 0: 47224.3. Samples: 863106400. Policy #0 lag: (min: 2.0, avg: 38.9, max: 68.0) [2024-03-21 03:53:00,522][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 03:53:01,730][04017] Updated weights for policy 0, policy_version 26307 (0.0015) [2024-03-21 03:53:05,521][03784] Fps is (10 sec: 39321.3, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 862126080. Throughput: 0: 46842.3. Samples: 863375900. 
Policy #0 lag: (min: 2.0, avg: 38.9, max: 68.0) [2024-03-21 03:53:05,522][03784] Avg episode reward: [(0, '1.346')] [2024-03-21 03:53:10,443][04017] Updated weights for policy 0, policy_version 26317 (0.0026) [2024-03-21 03:53:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 862355456. Throughput: 0: 47175.4. Samples: 863681400. Policy #0 lag: (min: 2.0, avg: 38.9, max: 68.0) [2024-03-21 03:53:10,522][03784] Avg episode reward: [(0, '0.763')] [2024-03-21 03:53:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 862486528. Throughput: 0: 47453.4. Samples: 863830800. Policy #0 lag: (min: 0.0, avg: 43.5, max: 90.0) [2024-03-21 03:53:15,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 03:53:19,258][04017] Updated weights for policy 0, policy_version 26327 (0.0010) [2024-03-21 03:53:20,521][03784] Fps is (10 sec: 42598.8, 60 sec: 46967.6, 300 sec: 46652.8). Total num frames: 862781440. Throughput: 0: 47248.9. Samples: 864126200. Policy #0 lag: (min: 0.0, avg: 43.5, max: 90.0) [2024-03-21 03:53:20,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 03:53:24,892][04017] Updated weights for policy 0, policy_version 26337 (0.0012) [2024-03-21 03:53:24,941][03995] Signal inference workers to stop experience collection... (17350 times) [2024-03-21 03:53:24,942][03995] Signal inference workers to resume experience collection... (17350 times) [2024-03-21 03:53:25,004][04017] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-03-21 03:53:25,005][04017] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-03-21 03:53:25,521][03784] Fps is (10 sec: 55704.9, 60 sec: 49151.9, 300 sec: 46652.7). Total num frames: 863043584. Throughput: 0: 47222.2. Samples: 864401500. 
Policy #0 lag: (min: 0.0, avg: 43.5, max: 90.0) [2024-03-21 03:53:25,522][03784] Avg episode reward: [(0, '0.990')] [2024-03-21 03:53:30,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 863272960. Throughput: 0: 47924.4. Samples: 864539600. Policy #0 lag: (min: 0.0, avg: 43.5, max: 90.0) [2024-03-21 03:53:30,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 03:53:32,198][04017] Updated weights for policy 0, policy_version 26347 (0.0014) [2024-03-21 03:53:35,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 863404032. Throughput: 0: 48186.6. Samples: 864837100. Policy #0 lag: (min: 0.0, avg: 50.2, max: 119.0) [2024-03-21 03:53:35,530][03784] Avg episode reward: [(0, '1.048')] [2024-03-21 03:53:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46421.3, 300 sec: 47097.1). Total num frames: 863633408. Throughput: 0: 47666.6. Samples: 865107700. Policy #0 lag: (min: 0.0, avg: 50.2, max: 119.0) [2024-03-21 03:53:40,530][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 03:53:41,023][04017] Updated weights for policy 0, policy_version 26357 (0.0017) [2024-03-21 03:53:45,074][04017] Updated weights for policy 0, policy_version 26367 (0.0012) [2024-03-21 03:53:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 864026624. Throughput: 0: 47529.0. Samples: 865245200. Policy #0 lag: (min: 0.0, avg: 50.2, max: 119.0) [2024-03-21 03:53:45,522][03784] Avg episode reward: [(0, '1.427')] [2024-03-21 03:53:50,521][03784] Fps is (10 sec: 62259.0, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 864256000. Throughput: 0: 47315.5. Samples: 865505100. 
Policy #0 lag: (min: 0.0, avg: 50.2, max: 119.0) [2024-03-21 03:53:50,522][03784] Avg episode reward: [(0, '0.899')] [2024-03-21 03:53:51,326][04017] Updated weights for policy 0, policy_version 26377 (0.0019) [2024-03-21 03:53:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.1, 300 sec: 47541.4). Total num frames: 864485376. Throughput: 0: 46562.2. Samples: 865776700. Policy #0 lag: (min: 0.0, avg: 55.0, max: 107.0) [2024-03-21 03:53:55,522][03784] Avg episode reward: [(0, '1.655')] [2024-03-21 03:53:58,011][04017] Updated weights for policy 0, policy_version 26387 (0.0015) [2024-03-21 03:54:00,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48059.8, 300 sec: 47763.5). Total num frames: 864813056. Throughput: 0: 46246.6. Samples: 865911900. Policy #0 lag: (min: 0.0, avg: 55.0, max: 107.0) [2024-03-21 03:54:00,522][03784] Avg episode reward: [(0, '0.663')] [2024-03-21 03:54:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026392_864813056.pth... [2024-03-21 03:54:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026046_853475328.pth [2024-03-21 03:54:05,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 47208.1). Total num frames: 864878592. Throughput: 0: 46162.1. Samples: 866203500. Policy #0 lag: (min: 0.0, avg: 55.0, max: 107.0) [2024-03-21 03:54:05,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-21 03:54:06,428][04017] Updated weights for policy 0, policy_version 26397 (0.0028) [2024-03-21 03:54:10,521][03784] Fps is (10 sec: 39321.7, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 865206272. Throughput: 0: 46131.2. Samples: 866477400. Policy #0 lag: (min: 0.0, avg: 55.0, max: 107.0) [2024-03-21 03:54:10,522][03784] Avg episode reward: [(0, '1.429')] [2024-03-21 03:54:14,048][04017] Updated weights for policy 0, policy_version 26407 (0.0011) [2024-03-21 03:54:14,158][03995] Signal inference workers to stop experience collection... 
(17400 times) [2024-03-21 03:54:14,343][04017] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-03-21 03:54:14,368][03995] Signal inference workers to resume experience collection... (17400 times) [2024-03-21 03:54:14,389][04017] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-03-21 03:54:15,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 865370112. Throughput: 0: 46608.9. Samples: 866637000. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:54:15,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 03:54:18,483][04017] Updated weights for policy 0, policy_version 26417 (0.0015) [2024-03-21 03:54:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 865665024. Throughput: 0: 45980.1. Samples: 866906200. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:54:20,522][03784] Avg episode reward: [(0, '1.513')] [2024-03-21 03:54:25,521][03784] Fps is (10 sec: 45874.6, 60 sec: 46421.3, 300 sec: 46652.7). Total num frames: 865828864. Throughput: 0: 46184.3. Samples: 867186000. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:54:25,522][03784] Avg episode reward: [(0, '1.364')] [2024-03-21 03:54:26,904][04017] Updated weights for policy 0, policy_version 26427 (0.0010) [2024-03-21 03:54:30,521][03784] Fps is (10 sec: 36044.4, 60 sec: 45875.1, 300 sec: 47097.0). Total num frames: 866025472. Throughput: 0: 46139.9. Samples: 867321500. Policy #0 lag: (min: 0.0, avg: 38.2, max: 83.0) [2024-03-21 03:54:30,522][03784] Avg episode reward: [(0, '1.291')] [2024-03-21 03:54:35,521][03784] Fps is (10 sec: 45875.5, 60 sec: 48059.7, 300 sec: 47208.2). Total num frames: 866287616. Throughput: 0: 47066.7. Samples: 867623100. 
Policy #0 lag: (min: 1.0, avg: 37.6, max: 80.0) [2024-03-21 03:54:35,522][03784] Avg episode reward: [(0, '0.910')] [2024-03-21 03:54:35,559][04017] Updated weights for policy 0, policy_version 26437 (0.0011) [2024-03-21 03:54:40,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 866549760. Throughput: 0: 47060.1. Samples: 867894400. Policy #0 lag: (min: 1.0, avg: 37.6, max: 80.0) [2024-03-21 03:54:40,522][03784] Avg episode reward: [(0, '1.671')] [2024-03-21 03:54:42,170][04017] Updated weights for policy 0, policy_version 26447 (0.0009) [2024-03-21 03:54:45,521][03784] Fps is (10 sec: 52428.5, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 866811904. Throughput: 0: 47099.9. Samples: 868031400. Policy #0 lag: (min: 1.0, avg: 37.6, max: 80.0) [2024-03-21 03:54:45,522][03784] Avg episode reward: [(0, '1.279')] [2024-03-21 03:54:50,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44236.8, 300 sec: 46763.9). Total num frames: 866910208. Throughput: 0: 46902.3. Samples: 868314100. Policy #0 lag: (min: 1.0, avg: 37.6, max: 80.0) [2024-03-21 03:54:50,522][03784] Avg episode reward: [(0, '0.481')] [2024-03-21 03:54:53,380][04017] Updated weights for policy 0, policy_version 26457 (0.0010) [2024-03-21 03:54:55,521][03784] Fps is (10 sec: 19660.9, 60 sec: 42052.3, 300 sec: 45986.3). Total num frames: 867008512. Throughput: 0: 47711.1. Samples: 868624400. Policy #0 lag: (min: 0.0, avg: 23.9, max: 63.0) [2024-03-21 03:54:55,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 03:54:58,838][04017] Updated weights for policy 0, policy_version 26467 (0.0037) [2024-03-21 03:55:00,521][03784] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 46763.8). Total num frames: 867434496. Throughput: 0: 46757.6. Samples: 868741100. 
Policy #0 lag: (min: 0.0, avg: 23.9, max: 63.0) [2024-03-21 03:55:00,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 03:55:02,672][04017] Updated weights for policy 0, policy_version 26477 (0.0019) [2024-03-21 03:55:05,435][03995] Signal inference workers to stop experience collection... (17450 times) [2024-03-21 03:55:05,436][03995] Signal inference workers to resume experience collection... (17450 times) [2024-03-21 03:55:05,496][04017] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-03-21 03:55:05,496][04017] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-03-21 03:55:05,521][03784] Fps is (10 sec: 81918.3, 60 sec: 49151.9, 300 sec: 47541.3). Total num frames: 867827712. Throughput: 0: 46355.3. Samples: 868992200. Policy #0 lag: (min: 0.0, avg: 23.9, max: 63.0) [2024-03-21 03:55:05,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 03:55:06,516][04017] Updated weights for policy 0, policy_version 26487 (0.0014) [2024-03-21 03:55:10,521][03784] Fps is (10 sec: 68813.9, 60 sec: 48605.9, 300 sec: 48207.8). Total num frames: 868122624. Throughput: 0: 46513.5. Samples: 869279100. Policy #0 lag: (min: 0.0, avg: 23.9, max: 63.0) [2024-03-21 03:55:10,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 03:55:13,981][04017] Updated weights for policy 0, policy_version 26497 (0.0012) [2024-03-21 03:55:15,521][03784] Fps is (10 sec: 45876.3, 60 sec: 48605.8, 300 sec: 47985.7). Total num frames: 868286464. Throughput: 0: 46806.7. Samples: 869427800. Policy #0 lag: (min: 0.0, avg: 48.7, max: 101.0) [2024-03-21 03:55:15,522][03784] Avg episode reward: [(0, '0.917')] [2024-03-21 03:55:20,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46421.3, 300 sec: 47652.5). Total num frames: 868450304. Throughput: 0: 46417.8. Samples: 869711900. 
Policy #0 lag: (min: 0.0, avg: 48.7, max: 101.0) [2024-03-21 03:55:20,522][03784] Avg episode reward: [(0, '1.397')] [2024-03-21 03:55:24,503][04017] Updated weights for policy 0, policy_version 26507 (0.0015) [2024-03-21 03:55:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45875.3, 300 sec: 46763.8). Total num frames: 868581376. Throughput: 0: 47091.1. Samples: 870013500. Policy #0 lag: (min: 0.0, avg: 48.7, max: 101.0) [2024-03-21 03:55:25,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 03:55:29,999][04017] Updated weights for policy 0, policy_version 26517 (0.0015) [2024-03-21 03:55:30,521][03784] Fps is (10 sec: 49151.7, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 868941824. Throughput: 0: 47153.3. Samples: 870153300. Policy #0 lag: (min: 0.0, avg: 34.2, max: 81.0) [2024-03-21 03:55:30,522][03784] Avg episode reward: [(0, '1.148')] [2024-03-21 03:55:35,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 869171200. Throughput: 0: 47291.1. Samples: 870442200. Policy #0 lag: (min: 0.0, avg: 34.2, max: 81.0) [2024-03-21 03:55:35,522][03784] Avg episode reward: [(0, '0.896')] [2024-03-21 03:55:38,928][04017] Updated weights for policy 0, policy_version 26527 (0.0017) [2024-03-21 03:55:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46421.3, 300 sec: 46652.7). Total num frames: 869335040. Throughput: 0: 46800.0. Samples: 870730400. Policy #0 lag: (min: 0.0, avg: 34.2, max: 81.0) [2024-03-21 03:55:40,522][03784] Avg episode reward: [(0, '0.896')] [2024-03-21 03:55:45,035][04017] Updated weights for policy 0, policy_version 26537 (0.0015) [2024-03-21 03:55:45,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 869597184. Throughput: 0: 47031.3. Samples: 870857500. Policy #0 lag: (min: 0.0, avg: 34.2, max: 81.0) [2024-03-21 03:55:45,522][03784] Avg episode reward: [(0, '1.177')] [2024-03-21 03:55:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 47208.1). 
Total num frames: 869859328. Throughput: 0: 47713.6. Samples: 871139300. Policy #0 lag: (min: 0.0, avg: 30.2, max: 66.0) [2024-03-21 03:55:50,522][03784] Avg episode reward: [(0, '1.354')] [2024-03-21 03:55:50,948][04017] Updated weights for policy 0, policy_version 26547 (0.0010) [2024-03-21 03:55:55,521][03784] Fps is (10 sec: 45874.8, 60 sec: 50790.4, 300 sec: 46986.0). Total num frames: 870055936. Throughput: 0: 47464.3. Samples: 871415000. Policy #0 lag: (min: 0.0, avg: 30.2, max: 66.0) [2024-03-21 03:55:55,522][03784] Avg episode reward: [(0, '0.577')] [2024-03-21 03:55:58,613][04017] Updated weights for policy 0, policy_version 26557 (0.0012) [2024-03-21 03:56:00,521][03784] Fps is (10 sec: 42598.5, 60 sec: 47513.7, 300 sec: 46874.9). Total num frames: 870285312. Throughput: 0: 47064.5. Samples: 871545700. Policy #0 lag: (min: 0.0, avg: 30.2, max: 66.0) [2024-03-21 03:56:00,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 03:56:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026559_870285312.pth... [2024-03-21 03:56:00,647][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026218_859111424.pth [2024-03-21 03:56:05,521][03784] Fps is (10 sec: 42598.6, 60 sec: 44237.0, 300 sec: 47208.1). Total num frames: 870481920. Throughput: 0: 47493.3. Samples: 871849100. Policy #0 lag: (min: 0.0, avg: 30.2, max: 66.0) [2024-03-21 03:56:05,522][03784] Avg episode reward: [(0, '1.382')] [2024-03-21 03:56:06,266][04017] Updated weights for policy 0, policy_version 26567 (0.0011) [2024-03-21 03:56:08,087][03995] Signal inference workers to stop experience collection... (17500 times) [2024-03-21 03:56:08,155][03995] Signal inference workers to resume experience collection... 
(17500 times) [2024-03-21 03:56:08,170][04017] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-03-21 03:56:08,219][04017] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-03-21 03:56:10,521][03784] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 47097.0). Total num frames: 870711296. Throughput: 0: 47168.8. Samples: 872136100. Policy #0 lag: (min: 0.0, avg: 33.4, max: 70.0) [2024-03-21 03:56:10,522][03784] Avg episode reward: [(0, '1.348')] [2024-03-21 03:56:12,857][04017] Updated weights for policy 0, policy_version 26577 (0.0019) [2024-03-21 03:56:15,521][03784] Fps is (10 sec: 62258.5, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 871104512. Throughput: 0: 46988.9. Samples: 872267800. Policy #0 lag: (min: 0.0, avg: 33.4, max: 70.0) [2024-03-21 03:56:15,522][03784] Avg episode reward: [(0, '1.302')] [2024-03-21 03:56:16,326][04017] Updated weights for policy 0, policy_version 26587 (0.0023) [2024-03-21 03:56:20,521][03784] Fps is (10 sec: 65536.1, 60 sec: 48605.8, 300 sec: 47652.5). Total num frames: 871366656. Throughput: 0: 46784.4. Samples: 872547500. Policy #0 lag: (min: 0.0, avg: 33.4, max: 70.0) [2024-03-21 03:56:20,522][03784] Avg episode reward: [(0, '1.604')] [2024-03-21 03:56:23,587][04017] Updated weights for policy 0, policy_version 26597 (0.0025) [2024-03-21 03:56:25,521][03784] Fps is (10 sec: 55706.3, 60 sec: 51336.5, 300 sec: 47763.5). Total num frames: 871661568. Throughput: 0: 46828.9. Samples: 872837700. Policy #0 lag: (min: 1.0, avg: 38.6, max: 74.0) [2024-03-21 03:56:25,522][03784] Avg episode reward: [(0, '1.340')] [2024-03-21 03:56:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48059.8, 300 sec: 47763.5). Total num frames: 871825408. Throughput: 0: 47293.3. Samples: 872985700. 
Policy #0 lag: (min: 1.0, avg: 38.6, max: 74.0) [2024-03-21 03:56:30,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 03:56:32,141][04017] Updated weights for policy 0, policy_version 26607 (0.0013) [2024-03-21 03:56:35,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 871923712. Throughput: 0: 47333.4. Samples: 873269300. Policy #0 lag: (min: 1.0, avg: 38.6, max: 74.0) [2024-03-21 03:56:35,522][03784] Avg episode reward: [(0, '0.518')] [2024-03-21 03:56:39,426][04017] Updated weights for policy 0, policy_version 26617 (0.0013) [2024-03-21 03:56:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 872218624. Throughput: 0: 47235.6. Samples: 873540600. Policy #0 lag: (min: 1.0, avg: 38.6, max: 74.0) [2024-03-21 03:56:40,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 03:56:45,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48059.7, 300 sec: 46652.7). Total num frames: 872480768. Throughput: 0: 47646.7. Samples: 873689800. Policy #0 lag: (min: 1.0, avg: 36.7, max: 79.0) [2024-03-21 03:56:45,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 03:56:47,193][04017] Updated weights for policy 0, policy_version 26627 (0.0009) [2024-03-21 03:56:50,521][03784] Fps is (10 sec: 36044.4, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 872579072. Throughput: 0: 47533.2. Samples: 873988100. Policy #0 lag: (min: 1.0, avg: 36.7, max: 79.0) [2024-03-21 03:56:50,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 03:56:54,573][04017] Updated weights for policy 0, policy_version 26637 (0.0019) [2024-03-21 03:56:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 47513.7, 300 sec: 46763.8). Total num frames: 872906752. Throughput: 0: 47511.2. Samples: 874274100. Policy #0 lag: (min: 1.0, avg: 36.7, max: 79.0) [2024-03-21 03:56:55,531][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 03:56:59,291][03995] Signal inference workers to stop experience collection... 
(17550 times) [2024-03-21 03:56:59,292][03995] Signal inference workers to resume experience collection... (17550 times) [2024-03-21 03:56:59,408][04017] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-03-21 03:56:59,408][04017] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-03-21 03:57:00,521][03784] Fps is (10 sec: 55706.1, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 873136128. Throughput: 0: 47740.1. Samples: 874416100. Policy #0 lag: (min: 1.0, avg: 30.3, max: 65.0) [2024-03-21 03:57:00,530][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 03:57:01,820][04017] Updated weights for policy 0, policy_version 26647 (0.0016) [2024-03-21 03:57:05,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 873398272. Throughput: 0: 46724.6. Samples: 874650100. Policy #0 lag: (min: 1.0, avg: 30.3, max: 65.0) [2024-03-21 03:57:05,522][03784] Avg episode reward: [(0, '1.642')] [2024-03-21 03:57:10,414][04017] Updated weights for policy 0, policy_version 26657 (0.0018) [2024-03-21 03:57:10,521][03784] Fps is (10 sec: 36044.4, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 873496576. Throughput: 0: 47231.0. Samples: 874963100. Policy #0 lag: (min: 1.0, avg: 30.3, max: 65.0) [2024-03-21 03:57:10,522][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 03:57:14,855][04017] Updated weights for policy 0, policy_version 26667 (0.0011) [2024-03-21 03:57:15,521][03784] Fps is (10 sec: 49151.5, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 873889792. Throughput: 0: 46980.0. Samples: 875099800. Policy #0 lag: (min: 1.0, avg: 30.3, max: 65.0) [2024-03-21 03:57:15,522][03784] Avg episode reward: [(0, '1.101')] [2024-03-21 03:57:20,521][03784] Fps is (10 sec: 62260.3, 60 sec: 45875.3, 300 sec: 47541.4). Total num frames: 874119168. Throughput: 0: 46391.1. Samples: 875356900. 
Policy #0 lag: (min: 4.0, avg: 32.8, max: 62.0) [2024-03-21 03:57:20,522][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 03:57:20,568][04017] Updated weights for policy 0, policy_version 26677 (0.0011) [2024-03-21 03:57:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 47208.1). Total num frames: 874315776. Throughput: 0: 46942.1. Samples: 875653000. Policy #0 lag: (min: 4.0, avg: 32.8, max: 62.0) [2024-03-21 03:57:25,522][03784] Avg episode reward: [(0, '1.140')] [2024-03-21 03:57:29,563][04017] Updated weights for policy 0, policy_version 26687 (0.0011) [2024-03-21 03:57:30,521][03784] Fps is (10 sec: 39321.1, 60 sec: 44782.9, 300 sec: 47319.2). Total num frames: 874512384. Throughput: 0: 46871.0. Samples: 875799000. Policy #0 lag: (min: 4.0, avg: 32.8, max: 62.0) [2024-03-21 03:57:30,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 03:57:35,521][03784] Fps is (10 sec: 45875.6, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 874774528. Throughput: 0: 46800.1. Samples: 876094100. Policy #0 lag: (min: 4.0, avg: 56.8, max: 115.0) [2024-03-21 03:57:35,522][03784] Avg episode reward: [(0, '1.554')] [2024-03-21 03:57:35,693][04017] Updated weights for policy 0, policy_version 26697 (0.0016) [2024-03-21 03:57:40,006][04017] Updated weights for policy 0, policy_version 26707 (0.0015) [2024-03-21 03:57:40,521][03784] Fps is (10 sec: 62259.9, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 875134976. Throughput: 0: 46553.3. Samples: 876369000. Policy #0 lag: (min: 4.0, avg: 56.8, max: 115.0) [2024-03-21 03:57:40,522][03784] Avg episode reward: [(0, '0.732')] [2024-03-21 03:57:43,741][03995] Signal inference workers to stop experience collection... (17600 times) [2024-03-21 03:57:43,819][03995] Signal inference workers to resume experience collection... 
(17600 times) [2024-03-21 03:57:43,822][04017] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-03-21 03:57:43,866][04017] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-03-21 03:57:45,521][03784] Fps is (10 sec: 62259.2, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 875397120. Throughput: 0: 46802.2. Samples: 876522200. Policy #0 lag: (min: 4.0, avg: 56.8, max: 115.0) [2024-03-21 03:57:45,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 03:57:47,856][04017] Updated weights for policy 0, policy_version 26717 (0.0017) [2024-03-21 03:57:50,521][03784] Fps is (10 sec: 36044.5, 60 sec: 48605.9, 300 sec: 46652.7). Total num frames: 875495424. Throughput: 0: 47762.1. Samples: 876799400. Policy #0 lag: (min: 4.0, avg: 56.8, max: 115.0) [2024-03-21 03:57:50,522][03784] Avg episode reward: [(0, '0.584')] [2024-03-21 03:57:55,521][03784] Fps is (10 sec: 19661.0, 60 sec: 44783.0, 300 sec: 46319.5). Total num frames: 875593728. Throughput: 0: 47366.9. Samples: 877094600. Policy #0 lag: (min: 0.0, avg: 29.6, max: 66.0) [2024-03-21 03:57:55,522][03784] Avg episode reward: [(0, '0.848')] [2024-03-21 03:57:57,672][04017] Updated weights for policy 0, policy_version 26727 (0.0012) [2024-03-21 03:58:00,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 875954176. Throughput: 0: 47064.5. Samples: 877217700. Policy #0 lag: (min: 0.0, avg: 29.6, max: 66.0) [2024-03-21 03:58:00,521][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 03:58:00,528][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026732_875954176.pth... [2024-03-21 03:58:00,633][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026392_864813056.pth [2024-03-21 03:58:03,500][04017] Updated weights for policy 0, policy_version 26737 (0.0012) [2024-03-21 03:58:05,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46421.3, 300 sec: 46874.9). 
Total num frames: 876183552. Throughput: 0: 47722.2. Samples: 877504400. Policy #0 lag: (min: 0.0, avg: 29.6, max: 66.0) [2024-03-21 03:58:05,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 03:58:10,521][03784] Fps is (10 sec: 42598.0, 60 sec: 48059.8, 300 sec: 47097.0). Total num frames: 876380160. Throughput: 0: 47337.8. Samples: 877783200. Policy #0 lag: (min: 0.0, avg: 29.6, max: 66.0) [2024-03-21 03:58:10,522][03784] Avg episode reward: [(0, '0.737')] [2024-03-21 03:58:11,125][04017] Updated weights for policy 0, policy_version 26747 (0.0011) [2024-03-21 03:58:15,521][03784] Fps is (10 sec: 52427.7, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 876707840. Throughput: 0: 47128.8. Samples: 877919800. Policy #0 lag: (min: 1.0, avg: 54.0, max: 101.0) [2024-03-21 03:58:15,522][03784] Avg episode reward: [(0, '1.264')] [2024-03-21 03:58:16,757][04017] Updated weights for policy 0, policy_version 26757 (0.0012) [2024-03-21 03:58:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 876838912. Throughput: 0: 46657.8. Samples: 878193700. Policy #0 lag: (min: 1.0, avg: 54.0, max: 101.0) [2024-03-21 03:58:20,522][03784] Avg episode reward: [(0, '0.629')] [2024-03-21 03:58:25,020][04017] Updated weights for policy 0, policy_version 26767 (0.0019) [2024-03-21 03:58:25,521][03784] Fps is (10 sec: 42598.8, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 877133824. Throughput: 0: 46084.3. Samples: 878442800. Policy #0 lag: (min: 1.0, avg: 54.0, max: 101.0) [2024-03-21 03:58:25,522][03784] Avg episode reward: [(0, '0.536')] [2024-03-21 03:58:30,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 877297664. Throughput: 0: 45906.7. Samples: 878588000. 
Policy #0 lag: (min: 1.0, avg: 54.0, max: 101.0) [2024-03-21 03:58:30,522][03784] Avg episode reward: [(0, '1.187')] [2024-03-21 03:58:34,936][04017] Updated weights for policy 0, policy_version 26777 (0.0011) [2024-03-21 03:58:35,521][03784] Fps is (10 sec: 32768.4, 60 sec: 44782.9, 300 sec: 46874.9). Total num frames: 877461504. Throughput: 0: 46237.9. Samples: 878880100. Policy #0 lag: (min: 0.0, avg: 50.0, max: 105.0) [2024-03-21 03:58:35,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 03:58:36,517][03995] Signal inference workers to stop experience collection... (17650 times) [2024-03-21 03:58:36,582][03995] Signal inference workers to resume experience collection... (17650 times) [2024-03-21 03:58:36,630][04017] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-03-21 03:58:36,696][04017] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-03-21 03:58:38,765][04017] Updated weights for policy 0, policy_version 26787 (0.0013) [2024-03-21 03:58:40,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 877854720. Throughput: 0: 45646.6. Samples: 879148700. Policy #0 lag: (min: 0.0, avg: 50.0, max: 105.0) [2024-03-21 03:58:40,522][03784] Avg episode reward: [(0, '0.950')] [2024-03-21 03:58:43,717][04017] Updated weights for policy 0, policy_version 26797 (0.0012) [2024-03-21 03:58:45,521][03784] Fps is (10 sec: 75366.5, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 878215168. Throughput: 0: 46015.5. Samples: 879288400. Policy #0 lag: (min: 0.0, avg: 50.0, max: 105.0) [2024-03-21 03:58:45,522][03784] Avg episode reward: [(0, '0.636')] [2024-03-21 03:58:50,052][04017] Updated weights for policy 0, policy_version 26807 (0.0012) [2024-03-21 03:58:50,521][03784] Fps is (10 sec: 55705.2, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 878411776. Throughput: 0: 46066.6. Samples: 879577400. 
Policy #0 lag: (min: 0.0, avg: 50.0, max: 105.0) [2024-03-21 03:58:50,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 03:58:55,521][03784] Fps is (10 sec: 29490.7, 60 sec: 48605.7, 300 sec: 46430.6). Total num frames: 878510080. Throughput: 0: 46526.6. Samples: 879876900. Policy #0 lag: (min: 0.0, avg: 53.3, max: 105.0) [2024-03-21 03:58:55,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 03:58:58,689][04017] Updated weights for policy 0, policy_version 26817 (0.0015) [2024-03-21 03:59:00,521][03784] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 878870528. Throughput: 0: 46402.5. Samples: 880007900. Policy #0 lag: (min: 0.0, avg: 53.3, max: 105.0) [2024-03-21 03:59:00,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 03:59:05,521][03784] Fps is (10 sec: 52429.3, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 879034368. Throughput: 0: 46373.3. Samples: 880280500. Policy #0 lag: (min: 0.0, avg: 53.3, max: 105.0) [2024-03-21 03:59:05,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 03:59:05,826][04017] Updated weights for policy 0, policy_version 26827 (0.0011) [2024-03-21 03:59:10,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 879198208. Throughput: 0: 47469.0. Samples: 880578900. Policy #0 lag: (min: 0.0, avg: 53.3, max: 105.0) [2024-03-21 03:59:10,522][03784] Avg episode reward: [(0, '0.556')] [2024-03-21 03:59:15,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43144.7, 300 sec: 46208.4). Total num frames: 879296512. Throughput: 0: 47684.4. Samples: 880733800. Policy #0 lag: (min: 0.0, avg: 26.4, max: 87.0) [2024-03-21 03:59:15,522][03784] Avg episode reward: [(0, '1.239')] [2024-03-21 03:59:17,118][04017] Updated weights for policy 0, policy_version 26837 (0.0010) [2024-03-21 03:59:20,521][03784] Fps is (10 sec: 45874.3, 60 sec: 46967.3, 300 sec: 46874.9). Total num frames: 879656960. Throughput: 0: 46826.4. Samples: 880987300. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 87.0) [2024-03-21 03:59:20,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 03:59:23,713][04017] Updated weights for policy 0, policy_version 26847 (0.0016) [2024-03-21 03:59:25,521][03784] Fps is (10 sec: 52428.9, 60 sec: 44783.0, 300 sec: 46763.8). Total num frames: 879820800. Throughput: 0: 46937.8. Samples: 881260900. Policy #0 lag: (min: 0.0, avg: 26.4, max: 87.0) [2024-03-21 03:59:25,522][03784] Avg episode reward: [(0, '1.195')] [2024-03-21 03:59:30,319][04017] Updated weights for policy 0, policy_version 26857 (0.0018) [2024-03-21 03:59:30,521][03784] Fps is (10 sec: 39322.1, 60 sec: 45875.1, 300 sec: 46652.7). Total num frames: 880050176. Throughput: 0: 47099.9. Samples: 881407900. Policy #0 lag: (min: 0.0, avg: 26.4, max: 87.0) [2024-03-21 03:59:30,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 03:59:33,768][03995] Signal inference workers to stop experience collection... (17700 times) [2024-03-21 03:59:33,769][03995] Signal inference workers to resume experience collection... (17700 times) [2024-03-21 03:59:33,829][04017] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-03-21 03:59:33,829][04017] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-03-21 03:59:35,521][03784] Fps is (10 sec: 52428.0, 60 sec: 48059.6, 300 sec: 46763.8). Total num frames: 880345088. Throughput: 0: 46957.7. Samples: 881690500. Policy #0 lag: (min: 0.0, avg: 31.3, max: 67.0) [2024-03-21 03:59:35,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 03:59:35,869][04017] Updated weights for policy 0, policy_version 26867 (0.0011) [2024-03-21 03:59:40,521][03784] Fps is (10 sec: 52429.2, 60 sec: 45329.0, 300 sec: 46652.8). Total num frames: 880574464. Throughput: 0: 46897.9. Samples: 881987300. 
Policy #0 lag: (min: 0.0, avg: 31.3, max: 67.0) [2024-03-21 03:59:40,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 03:59:41,553][04017] Updated weights for policy 0, policy_version 26877 (0.0031) [2024-03-21 03:59:44,622][04017] Updated weights for policy 0, policy_version 26887 (0.0015) [2024-03-21 03:59:45,521][03784] Fps is (10 sec: 78642.6, 60 sec: 48605.7, 300 sec: 48207.8). Total num frames: 881131520. Throughput: 0: 46499.8. Samples: 882100400. Policy #0 lag: (min: 0.0, avg: 31.3, max: 67.0) [2024-03-21 03:59:45,522][03784] Avg episode reward: [(0, '0.628')] [2024-03-21 03:59:50,521][03784] Fps is (10 sec: 72089.2, 60 sec: 48059.7, 300 sec: 48430.0). Total num frames: 881295360. Throughput: 0: 46573.3. Samples: 882376300. Policy #0 lag: (min: 0.0, avg: 31.3, max: 67.0) [2024-03-21 03:59:50,522][03784] Avg episode reward: [(0, '0.450')] [2024-03-21 03:59:53,599][04017] Updated weights for policy 0, policy_version 26897 (0.0016) [2024-03-21 03:59:55,521][03784] Fps is (10 sec: 32768.8, 60 sec: 49152.1, 300 sec: 47541.4). Total num frames: 881459200. Throughput: 0: 46697.8. Samples: 882680300. Policy #0 lag: (min: 0.0, avg: 41.9, max: 78.0) [2024-03-21 03:59:55,522][03784] Avg episode reward: [(0, '0.450')] [2024-03-21 04:00:00,472][04017] Updated weights for policy 0, policy_version 26907 (0.0015) [2024-03-21 04:00:00,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.3, 300 sec: 46986.0). Total num frames: 881688576. Throughput: 0: 46515.4. Samples: 882827000. Policy #0 lag: (min: 0.0, avg: 41.9, max: 78.0) [2024-03-21 04:00:00,522][03784] Avg episode reward: [(0, '1.363')] [2024-03-21 04:00:00,892][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026908_881721344.pth... [2024-03-21 04:00:00,949][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026559_870285312.pth [2024-03-21 04:00:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48059.8, 300 sec: 46763.8). 
Total num frames: 881917952. Throughput: 0: 46962.4. Samples: 883100600. Policy #0 lag: (min: 0.0, avg: 41.9, max: 78.0) [2024-03-21 04:00:05,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 04:00:09,404][04017] Updated weights for policy 0, policy_version 26917 (0.0014) [2024-03-21 04:00:10,521][03784] Fps is (10 sec: 36045.3, 60 sec: 47513.6, 300 sec: 46652.8). Total num frames: 882049024. Throughput: 0: 47673.3. Samples: 883406200. Policy #0 lag: (min: 0.0, avg: 41.9, max: 78.0) [2024-03-21 04:00:10,522][03784] Avg episode reward: [(0, '0.946')] [2024-03-21 04:00:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 49698.1, 300 sec: 46874.9). Total num frames: 882278400. Throughput: 0: 47691.1. Samples: 883554000. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 04:00:15,522][03784] Avg episode reward: [(0, '1.313')] [2024-03-21 04:00:18,045][04017] Updated weights for policy 0, policy_version 26927 (0.0024) [2024-03-21 04:00:20,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.6, 300 sec: 47097.1). Total num frames: 882475008. Throughput: 0: 47597.9. Samples: 883832400. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 04:00:20,522][03784] Avg episode reward: [(0, '1.201')] [2024-03-21 04:00:22,096][03995] Signal inference workers to stop experience collection... (17750 times) [2024-03-21 04:00:22,097][03995] Signal inference workers to resume experience collection... (17750 times) [2024-03-21 04:00:22,154][04017] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-03-21 04:00:22,154][04017] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-03-21 04:00:24,546][04017] Updated weights for policy 0, policy_version 26937 (0.0011) [2024-03-21 04:00:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48605.8, 300 sec: 46763.8). Total num frames: 882737152. Throughput: 0: 47622.2. Samples: 884130300. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 04:00:25,522][03784] Avg episode reward: [(0, '1.201')] [2024-03-21 04:00:28,075][04017] Updated weights for policy 0, policy_version 26947 (0.0011) [2024-03-21 04:00:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 49698.2, 300 sec: 46986.0). Total num frames: 883032064. Throughput: 0: 47978.0. Samples: 884259400. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 04:00:30,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 04:00:35,521][03784] Fps is (10 sec: 55705.8, 60 sec: 49152.1, 300 sec: 47319.2). Total num frames: 883294208. Throughput: 0: 48420.1. Samples: 884555200. Policy #0 lag: (min: 0.0, avg: 32.6, max: 76.0) [2024-03-21 04:00:35,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 04:00:37,157][04017] Updated weights for policy 0, policy_version 26957 (0.0011) [2024-03-21 04:00:40,521][03784] Fps is (10 sec: 29490.7, 60 sec: 45875.1, 300 sec: 46541.6). Total num frames: 883326976. Throughput: 0: 48130.9. Samples: 884846200. Policy #0 lag: (min: 0.0, avg: 32.6, max: 76.0) [2024-03-21 04:00:40,522][03784] Avg episode reward: [(0, '0.739')] [2024-03-21 04:00:43,835][04017] Updated weights for policy 0, policy_version 26967 (0.0011) [2024-03-21 04:00:45,521][03784] Fps is (10 sec: 49151.5, 60 sec: 44236.9, 300 sec: 47208.1). Total num frames: 883785728. Throughput: 0: 47588.9. Samples: 884968500. Policy #0 lag: (min: 0.0, avg: 32.6, max: 76.0) [2024-03-21 04:00:45,522][03784] Avg episode reward: [(0, '0.770')] [2024-03-21 04:00:48,189][04017] Updated weights for policy 0, policy_version 26977 (0.0021) [2024-03-21 04:00:50,521][03784] Fps is (10 sec: 68813.7, 60 sec: 45329.1, 300 sec: 47319.2). Total num frames: 884015104. Throughput: 0: 47635.6. Samples: 885244200. Policy #0 lag: (min: 0.0, avg: 32.6, max: 76.0) [2024-03-21 04:00:50,522][03784] Avg episode reward: [(0, '0.669')] [2024-03-21 04:00:55,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44236.7, 300 sec: 46874.9). 
Total num frames: 884113408. Throughput: 0: 47675.5. Samples: 885551600. Policy #0 lag: (min: 0.0, avg: 49.0, max: 99.0) [2024-03-21 04:00:55,522][03784] Avg episode reward: [(0, '1.529')] [2024-03-21 04:00:59,659][04017] Updated weights for policy 0, policy_version 26987 (0.0014) [2024-03-21 04:01:00,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 47097.1). Total num frames: 884375552. Throughput: 0: 47728.9. Samples: 885701800. Policy #0 lag: (min: 0.0, avg: 49.0, max: 99.0) [2024-03-21 04:01:00,522][03784] Avg episode reward: [(0, '1.529')] [2024-03-21 04:01:03,005][04017] Updated weights for policy 0, policy_version 26997 (0.0024) [2024-03-21 04:01:03,543][03995] Signal inference workers to stop experience collection... (17800 times) [2024-03-21 04:01:03,594][04017] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-03-21 04:01:03,611][03995] Signal inference workers to resume experience collection... (17800 times) [2024-03-21 04:01:03,644][04017] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-03-21 04:01:05,521][03784] Fps is (10 sec: 75366.2, 60 sec: 49151.9, 300 sec: 47985.7). Total num frames: 884867072. Throughput: 0: 47628.8. Samples: 885975700. Policy #0 lag: (min: 0.0, avg: 49.0, max: 99.0) [2024-03-21 04:01:05,522][03784] Avg episode reward: [(0, '1.158')] [2024-03-21 04:01:08,734][04017] Updated weights for policy 0, policy_version 27007 (0.0014) [2024-03-21 04:01:10,521][03784] Fps is (10 sec: 58982.9, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 884965376. Throughput: 0: 47277.9. Samples: 886257800. Policy #0 lag: (min: 0.0, avg: 42.8, max: 110.0) [2024-03-21 04:01:10,522][03784] Avg episode reward: [(0, '1.135')] [2024-03-21 04:01:15,218][04017] Updated weights for policy 0, policy_version 27017 (0.0015) [2024-03-21 04:01:15,521][03784] Fps is (10 sec: 42598.6, 60 sec: 50244.3, 300 sec: 47208.1). Total num frames: 885293056. Throughput: 0: 46991.0. Samples: 886374000. 
Policy #0 lag: (min: 0.0, avg: 42.8, max: 110.0) [2024-03-21 04:01:15,522][03784] Avg episode reward: [(0, '0.941')] [2024-03-21 04:01:20,521][03784] Fps is (10 sec: 45874.8, 60 sec: 49152.0, 300 sec: 46652.7). Total num frames: 885424128. Throughput: 0: 47137.7. Samples: 886676400. Policy #0 lag: (min: 0.0, avg: 42.8, max: 110.0) [2024-03-21 04:01:20,522][03784] Avg episode reward: [(0, '1.307')] [2024-03-21 04:01:25,394][04017] Updated weights for policy 0, policy_version 27027 (0.0014) [2024-03-21 04:01:25,521][03784] Fps is (10 sec: 32768.2, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 885620736. Throughput: 0: 47064.6. Samples: 886964100. Policy #0 lag: (min: 0.0, avg: 42.8, max: 110.0) [2024-03-21 04:01:25,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 04:01:30,521][03784] Fps is (10 sec: 49151.1, 60 sec: 48059.6, 300 sec: 47430.3). Total num frames: 885915648. Throughput: 0: 46971.0. Samples: 887082200. Policy #0 lag: (min: 2.0, avg: 33.5, max: 73.0) [2024-03-21 04:01:30,523][03784] Avg episode reward: [(0, '1.446')] [2024-03-21 04:01:30,773][04017] Updated weights for policy 0, policy_version 27037 (0.0017) [2024-03-21 04:01:35,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 886046720. Throughput: 0: 47597.8. Samples: 887386100. Policy #0 lag: (min: 2.0, avg: 33.5, max: 73.0) [2024-03-21 04:01:35,522][03784] Avg episode reward: [(0, '1.446')] [2024-03-21 04:01:40,521][03784] Fps is (10 sec: 26214.7, 60 sec: 47513.6, 300 sec: 46430.6). Total num frames: 886177792. Throughput: 0: 47168.8. Samples: 887674200. Policy #0 lag: (min: 2.0, avg: 33.5, max: 73.0) [2024-03-21 04:01:40,522][03784] Avg episode reward: [(0, '1.134')] [2024-03-21 04:01:44,952][04017] Updated weights for policy 0, policy_version 27047 (0.0015) [2024-03-21 04:01:45,521][03784] Fps is (10 sec: 26214.3, 60 sec: 42052.3, 300 sec: 46541.7). Total num frames: 886308864. Throughput: 0: 46920.0. Samples: 887813200. 
Policy #0 lag: (min: 0.0, avg: 22.9, max: 58.0) [2024-03-21 04:01:45,522][03784] Avg episode reward: [(0, '1.016')] [2024-03-21 04:01:48,679][04017] Updated weights for policy 0, policy_version 27057 (0.0019) [2024-03-21 04:01:50,521][03784] Fps is (10 sec: 52429.3, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 886702080. Throughput: 0: 46277.9. Samples: 888058200. Policy #0 lag: (min: 0.0, avg: 22.9, max: 58.0) [2024-03-21 04:01:50,522][03784] Avg episode reward: [(0, '0.936')] [2024-03-21 04:01:54,652][04017] Updated weights for policy 0, policy_version 27067 (0.0029) [2024-03-21 04:01:55,521][03784] Fps is (10 sec: 68813.2, 60 sec: 48059.8, 300 sec: 46986.0). Total num frames: 886996992. Throughput: 0: 46377.7. Samples: 888344800. Policy #0 lag: (min: 0.0, avg: 22.9, max: 58.0) [2024-03-21 04:01:55,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 04:01:58,307][03995] Signal inference workers to stop experience collection... (17850 times) [2024-03-21 04:01:58,308][03995] Signal inference workers to resume experience collection... (17850 times) [2024-03-21 04:01:58,320][04017] Updated weights for policy 0, policy_version 27077 (0.0015) [2024-03-21 04:01:58,374][04017] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-03-21 04:01:58,375][04017] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-03-21 04:02:00,521][03784] Fps is (10 sec: 72088.9, 60 sec: 50790.3, 300 sec: 47541.3). Total num frames: 887422976. Throughput: 0: 46604.4. Samples: 888471200. Policy #0 lag: (min: 0.0, avg: 22.9, max: 58.0) [2024-03-21 04:02:00,522][03784] Avg episode reward: [(0, '0.464')] [2024-03-21 04:02:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027082_887422976.pth... 
[2024-03-21 04:02:00,658][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026732_875954176.pth [2024-03-21 04:02:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44236.9, 300 sec: 47541.4). Total num frames: 887521280. Throughput: 0: 46297.8. Samples: 888759800. Policy #0 lag: (min: 0.0, avg: 57.9, max: 114.0) [2024-03-21 04:02:05,524][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 04:02:06,058][04017] Updated weights for policy 0, policy_version 27087 (0.0016) [2024-03-21 04:02:10,521][03784] Fps is (10 sec: 39321.2, 60 sec: 47513.4, 300 sec: 47208.1). Total num frames: 887816192. Throughput: 0: 46390.9. Samples: 889051700. Policy #0 lag: (min: 0.0, avg: 57.9, max: 114.0) [2024-03-21 04:02:10,522][03784] Avg episode reward: [(0, '0.718')] [2024-03-21 04:02:11,434][04017] Updated weights for policy 0, policy_version 27097 (0.0016) [2024-03-21 04:02:15,521][03784] Fps is (10 sec: 58982.4, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 888111104. Throughput: 0: 47082.5. Samples: 889200900. Policy #0 lag: (min: 0.0, avg: 57.9, max: 114.0) [2024-03-21 04:02:15,522][03784] Avg episode reward: [(0, '0.718')] [2024-03-21 04:02:17,789][04017] Updated weights for policy 0, policy_version 27107 (0.0015) [2024-03-21 04:02:20,521][03784] Fps is (10 sec: 52429.7, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 888340480. Throughput: 0: 46613.3. Samples: 889483700. Policy #0 lag: (min: 0.0, avg: 57.9, max: 114.0) [2024-03-21 04:02:20,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 04:02:24,142][04017] Updated weights for policy 0, policy_version 27117 (0.0023) [2024-03-21 04:02:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 47652.5). Total num frames: 888569856. Throughput: 0: 46215.7. Samples: 889753900. 
Policy #0 lag: (min: 1.0, avg: 43.1, max: 80.0) [2024-03-21 04:02:25,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 04:02:30,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.4, 300 sec: 47097.1). Total num frames: 888668160. Throughput: 0: 46497.8. Samples: 889905600. Policy #0 lag: (min: 1.0, avg: 43.1, max: 80.0) [2024-03-21 04:02:30,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 04:02:35,521][03784] Fps is (10 sec: 29491.0, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 888864768. Throughput: 0: 47268.9. Samples: 890185300. Policy #0 lag: (min: 1.0, avg: 43.1, max: 80.0) [2024-03-21 04:02:35,521][03784] Avg episode reward: [(0, '1.222')] [2024-03-21 04:02:40,112][04017] Updated weights for policy 0, policy_version 27127 (0.0017) [2024-03-21 04:02:40,521][03784] Fps is (10 sec: 26214.0, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 888930304. Throughput: 0: 47219.8. Samples: 890469700. Policy #0 lag: (min: 0.0, avg: 31.8, max: 82.0) [2024-03-21 04:02:40,522][03784] Avg episode reward: [(0, '0.523')] [2024-03-21 04:02:44,116][04017] Updated weights for policy 0, policy_version 27137 (0.0034) [2024-03-21 04:02:45,521][03784] Fps is (10 sec: 42598.5, 60 sec: 49698.2, 300 sec: 46763.8). Total num frames: 889290752. Throughput: 0: 47064.6. Samples: 890589100. Policy #0 lag: (min: 0.0, avg: 31.8, max: 82.0) [2024-03-21 04:02:45,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 04:02:50,521][03784] Fps is (10 sec: 55706.0, 60 sec: 46421.3, 300 sec: 47097.0). Total num frames: 889487360. Throughput: 0: 46704.3. Samples: 890861500. Policy #0 lag: (min: 0.0, avg: 31.8, max: 82.0) [2024-03-21 04:02:50,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-21 04:02:50,862][04017] Updated weights for policy 0, policy_version 27147 (0.0010) [2024-03-21 04:02:54,000][03995] Signal inference workers to stop experience collection... 
(17900 times) [2024-03-21 04:02:54,001][03995] Signal inference workers to resume experience collection... (17900 times) [2024-03-21 04:02:54,063][04017] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-03-21 04:02:54,063][04017] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-03-21 04:02:55,521][03784] Fps is (10 sec: 55705.2, 60 sec: 47513.5, 300 sec: 47097.0). Total num frames: 889847808. Throughput: 0: 45866.8. Samples: 891115700. Policy #0 lag: (min: 0.0, avg: 31.8, max: 82.0) [2024-03-21 04:02:55,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 04:02:55,661][04017] Updated weights for policy 0, policy_version 27157 (0.0012) [2024-03-21 04:03:00,521][03784] Fps is (10 sec: 49152.3, 60 sec: 42598.5, 300 sec: 46763.8). Total num frames: 889978880. Throughput: 0: 45782.2. Samples: 891261100. Policy #0 lag: (min: 0.0, avg: 31.9, max: 70.0) [2024-03-21 04:03:00,522][03784] Avg episode reward: [(0, '1.461')] [2024-03-21 04:03:03,728][04017] Updated weights for policy 0, policy_version 27167 (0.0011) [2024-03-21 04:03:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45875.1, 300 sec: 47097.1). Total num frames: 890273792. Throughput: 0: 46157.7. Samples: 891560800. Policy #0 lag: (min: 0.0, avg: 31.9, max: 70.0) [2024-03-21 04:03:05,522][03784] Avg episode reward: [(0, '0.820')] [2024-03-21 04:03:10,152][04017] Updated weights for policy 0, policy_version 27177 (0.0016) [2024-03-21 04:03:10,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 890568704. Throughput: 0: 46704.3. Samples: 891855600. Policy #0 lag: (min: 0.0, avg: 31.9, max: 70.0) [2024-03-21 04:03:10,522][03784] Avg episode reward: [(0, '0.823')] [2024-03-21 04:03:14,136][04017] Updated weights for policy 0, policy_version 27187 (0.0010) [2024-03-21 04:03:15,521][03784] Fps is (10 sec: 65536.1, 60 sec: 46967.4, 300 sec: 47763.5). Total num frames: 890929152. Throughput: 0: 46524.4. Samples: 891999200. 
Policy #0 lag: (min: 0.0, avg: 31.9, max: 70.0) [2024-03-21 04:03:15,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 04:03:20,521][03784] Fps is (10 sec: 58982.8, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 891158528. Throughput: 0: 46240.0. Samples: 892266100. Policy #0 lag: (min: 0.0, avg: 38.8, max: 81.0) [2024-03-21 04:03:20,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 04:03:20,834][04017] Updated weights for policy 0, policy_version 27197 (0.0015) [2024-03-21 04:03:25,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 891387904. Throughput: 0: 46431.3. Samples: 892559100. Policy #0 lag: (min: 0.0, avg: 38.8, max: 81.0) [2024-03-21 04:03:25,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 04:03:30,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 891486208. Throughput: 0: 47164.3. Samples: 892711500. Policy #0 lag: (min: 0.0, avg: 38.8, max: 81.0) [2024-03-21 04:03:30,522][03784] Avg episode reward: [(0, '1.274')] [2024-03-21 04:03:31,376][04017] Updated weights for policy 0, policy_version 27207 (0.0010) [2024-03-21 04:03:34,597][04017] Updated weights for policy 0, policy_version 27217 (0.0011) [2024-03-21 04:03:35,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 891846656. Throughput: 0: 47568.9. Samples: 893002100. Policy #0 lag: (min: 0.0, avg: 38.8, max: 81.0) [2024-03-21 04:03:35,522][03784] Avg episode reward: [(0, '1.274')] [2024-03-21 04:03:40,521][03784] Fps is (10 sec: 45875.5, 60 sec: 50244.4, 300 sec: 46541.7). Total num frames: 891944960. Throughput: 0: 48677.8. Samples: 893306200. Policy #0 lag: (min: 0.0, avg: 47.5, max: 90.0) [2024-03-21 04:03:40,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 04:03:44,924][04017] Updated weights for policy 0, policy_version 27227 (0.0012) [2024-03-21 04:03:45,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48605.8, 300 sec: 46763.8). 
Total num frames: 892207104. Throughput: 0: 48535.5. Samples: 893445200. Policy #0 lag: (min: 0.0, avg: 47.5, max: 90.0) [2024-03-21 04:03:45,522][03784] Avg episode reward: [(0, '0.616')] [2024-03-21 04:03:49,648][03995] Signal inference workers to stop experience collection... (17950 times) [2024-03-21 04:03:49,707][04017] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-03-21 04:03:49,882][03995] Signal inference workers to resume experience collection... (17950 times) [2024-03-21 04:03:49,883][04017] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-03-21 04:03:50,521][03784] Fps is (10 sec: 49151.5, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 892436480. Throughput: 0: 48413.3. Samples: 893739400. Policy #0 lag: (min: 0.0, avg: 47.5, max: 90.0) [2024-03-21 04:03:50,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 04:03:51,358][04017] Updated weights for policy 0, policy_version 27237 (0.0011) [2024-03-21 04:03:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 892567552. Throughput: 0: 47971.2. Samples: 894014300. Policy #0 lag: (min: 0.0, avg: 47.5, max: 90.0) [2024-03-21 04:03:55,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 04:03:59,367][04017] Updated weights for policy 0, policy_version 27247 (0.0018) [2024-03-21 04:04:00,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 892862464. Throughput: 0: 47537.8. Samples: 894138400. Policy #0 lag: (min: 0.0, avg: 28.9, max: 99.0) [2024-03-21 04:04:00,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 04:04:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027248_892862464.pth... [2024-03-21 04:04:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000026908_881721344.pth [2024-03-21 04:04:05,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46421.3, 300 sec: 46986.0). 
Total num frames: 893059072. Throughput: 0: 47682.2. Samples: 894411800. Policy #0 lag: (min: 0.0, avg: 28.9, max: 99.0) [2024-03-21 04:04:05,522][03784] Avg episode reward: [(0, '1.081')] [2024-03-21 04:04:07,640][04017] Updated weights for policy 0, policy_version 27257 (0.0011) [2024-03-21 04:04:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 44236.9, 300 sec: 47208.1). Total num frames: 893222912. Throughput: 0: 47926.6. Samples: 894715800. Policy #0 lag: (min: 0.0, avg: 28.9, max: 99.0) [2024-03-21 04:04:10,522][03784] Avg episode reward: [(0, '0.963')] [2024-03-21 04:04:15,521][03784] Fps is (10 sec: 36045.2, 60 sec: 41506.2, 300 sec: 46652.8). Total num frames: 893419520. Throughput: 0: 47684.6. Samples: 894857300. Policy #0 lag: (min: 0.0, avg: 41.5, max: 93.0) [2024-03-21 04:04:15,521][03784] Avg episode reward: [(0, '0.426')] [2024-03-21 04:04:15,907][04017] Updated weights for policy 0, policy_version 27267 (0.0016) [2024-03-21 04:04:20,462][04017] Updated weights for policy 0, policy_version 27277 (0.0012) [2024-03-21 04:04:20,521][03784] Fps is (10 sec: 58982.0, 60 sec: 44236.8, 300 sec: 47430.3). Total num frames: 893812736. Throughput: 0: 47480.0. Samples: 895138700. Policy #0 lag: (min: 0.0, avg: 41.5, max: 93.0) [2024-03-21 04:04:20,522][03784] Avg episode reward: [(0, '1.365')] [2024-03-21 04:04:24,390][04017] Updated weights for policy 0, policy_version 27287 (0.0015) [2024-03-21 04:04:25,521][03784] Fps is (10 sec: 81919.0, 60 sec: 47513.5, 300 sec: 48096.8). Total num frames: 894238720. Throughput: 0: 46695.5. Samples: 895407500. Policy #0 lag: (min: 0.0, avg: 41.5, max: 93.0) [2024-03-21 04:04:25,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 04:04:28,197][04017] Updated weights for policy 0, policy_version 27297 (0.0009) [2024-03-21 04:04:30,521][03784] Fps is (10 sec: 75366.0, 60 sec: 51336.5, 300 sec: 48207.8). Total num frames: 894566400. Throughput: 0: 46506.6. Samples: 895538000. 
Policy #0 lag: (min: 0.0, avg: 41.5, max: 93.0) [2024-03-21 04:04:30,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 04:04:35,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 894664704. Throughput: 0: 46724.5. Samples: 895842000. Policy #0 lag: (min: 0.0, avg: 39.7, max: 73.0) [2024-03-21 04:04:35,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 04:04:38,834][04017] Updated weights for policy 0, policy_version 27307 (0.0010) [2024-03-21 04:04:40,521][03784] Fps is (10 sec: 36045.2, 60 sec: 49698.1, 300 sec: 46763.9). Total num frames: 894926848. Throughput: 0: 46937.8. Samples: 896126500. Policy #0 lag: (min: 0.0, avg: 39.7, max: 73.0) [2024-03-21 04:04:40,522][03784] Avg episode reward: [(0, '1.598')] [2024-03-21 04:04:40,601][03995] Signal inference workers to stop experience collection... (18000 times) [2024-03-21 04:04:40,710][04017] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-03-21 04:04:40,815][03995] Signal inference workers to resume experience collection... (18000 times) [2024-03-21 04:04:40,816][04017] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-03-21 04:04:44,012][04017] Updated weights for policy 0, policy_version 27317 (0.0011) [2024-03-21 04:04:45,521][03784] Fps is (10 sec: 55706.3, 60 sec: 50244.4, 300 sec: 47208.2). Total num frames: 895221760. Throughput: 0: 47206.8. Samples: 896262700. Policy #0 lag: (min: 0.0, avg: 39.7, max: 73.0) [2024-03-21 04:04:45,522][03784] Avg episode reward: [(0, '1.352')] [2024-03-21 04:04:50,521][03784] Fps is (10 sec: 36044.6, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 895287296. Throughput: 0: 47686.6. Samples: 896557700. Policy #0 lag: (min: 0.0, avg: 39.7, max: 73.0) [2024-03-21 04:04:50,522][03784] Avg episode reward: [(0, '1.132')] [2024-03-21 04:04:55,521][03784] Fps is (10 sec: 19660.4, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 895418368. Throughput: 0: 47293.2. 
Samples: 896844000. Policy #0 lag: (min: 0.0, avg: 25.4, max: 70.0) [2024-03-21 04:04:55,522][03784] Avg episode reward: [(0, '1.518')] [2024-03-21 04:04:55,744][04017] Updated weights for policy 0, policy_version 27327 (0.0011) [2024-03-21 04:05:00,521][03784] Fps is (10 sec: 42598.4, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 895713280. Throughput: 0: 47339.8. Samples: 896987600. Policy #0 lag: (min: 0.0, avg: 25.4, max: 70.0) [2024-03-21 04:05:00,522][03784] Avg episode reward: [(0, '1.518')] [2024-03-21 04:05:00,908][04017] Updated weights for policy 0, policy_version 27337 (0.0011) [2024-03-21 04:05:05,521][03784] Fps is (10 sec: 62259.8, 60 sec: 49698.2, 300 sec: 47430.3). Total num frames: 896040960. Throughput: 0: 47237.8. Samples: 897264400. Policy #0 lag: (min: 0.0, avg: 25.4, max: 70.0) [2024-03-21 04:05:05,522][03784] Avg episode reward: [(0, '1.098')] [2024-03-21 04:05:10,108][04017] Updated weights for policy 0, policy_version 27347 (0.0011) [2024-03-21 04:05:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 896139264. Throughput: 0: 47342.2. Samples: 897537900. Policy #0 lag: (min: 0.0, avg: 25.4, max: 70.0) [2024-03-21 04:05:10,522][03784] Avg episode reward: [(0, '0.710')] [2024-03-21 04:05:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 896335872. Throughput: 0: 47515.7. Samples: 897676200. Policy #0 lag: (min: 1.0, avg: 41.4, max: 117.0) [2024-03-21 04:05:15,522][03784] Avg episode reward: [(0, '0.916')] [2024-03-21 04:05:17,010][04017] Updated weights for policy 0, policy_version 27357 (0.0024) [2024-03-21 04:05:20,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 896532480. Throughput: 0: 47248.9. Samples: 897968200. 
Policy #0 lag: (min: 1.0, avg: 41.4, max: 117.0) [2024-03-21 04:05:20,522][03784] Avg episode reward: [(0, '1.550')] [2024-03-21 04:05:24,742][04017] Updated weights for policy 0, policy_version 27367 (0.0013) [2024-03-21 04:05:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 46541.7). Total num frames: 896761856. Throughput: 0: 47822.2. Samples: 898278500. Policy #0 lag: (min: 1.0, avg: 41.4, max: 117.0) [2024-03-21 04:05:25,522][03784] Avg episode reward: [(0, '1.227')] [2024-03-21 04:05:28,833][04017] Updated weights for policy 0, policy_version 27377 (0.0022) [2024-03-21 04:05:30,521][03784] Fps is (10 sec: 62259.6, 60 sec: 43144.6, 300 sec: 46986.0). Total num frames: 897155072. Throughput: 0: 47737.7. Samples: 898410900. Policy #0 lag: (min: 3.0, avg: 47.6, max: 95.0) [2024-03-21 04:05:30,522][03784] Avg episode reward: [(0, '0.759')] [2024-03-21 04:05:34,166][04017] Updated weights for policy 0, policy_version 27387 (0.0013) [2024-03-21 04:05:35,521][03784] Fps is (10 sec: 75366.2, 60 sec: 47513.6, 300 sec: 48096.8). Total num frames: 897515520. Throughput: 0: 46904.5. Samples: 898668400. Policy #0 lag: (min: 3.0, avg: 47.6, max: 95.0) [2024-03-21 04:05:35,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 04:05:35,651][03995] Signal inference workers to stop experience collection... (18050 times) [2024-03-21 04:05:35,652][03995] Signal inference workers to resume experience collection... (18050 times) [2024-03-21 04:05:35,720][04017] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-03-21 04:05:35,721][04017] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-03-21 04:05:39,226][04017] Updated weights for policy 0, policy_version 27397 (0.0012) [2024-03-21 04:05:40,521][03784] Fps is (10 sec: 62259.1, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 897777664. Throughput: 0: 46482.3. Samples: 898935700. 
Policy #0 lag: (min: 3.0, avg: 47.6, max: 95.0) [2024-03-21 04:05:40,522][03784] Avg episode reward: [(0, '0.961')] [2024-03-21 04:05:45,236][04017] Updated weights for policy 0, policy_version 27407 (0.0020) [2024-03-21 04:05:45,521][03784] Fps is (10 sec: 55705.4, 60 sec: 47513.5, 300 sec: 47652.4). Total num frames: 898072576. Throughput: 0: 46117.8. Samples: 899062900. Policy #0 lag: (min: 3.0, avg: 47.6, max: 95.0) [2024-03-21 04:05:45,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 04:05:50,521][03784] Fps is (10 sec: 39321.4, 60 sec: 48059.8, 300 sec: 47652.5). Total num frames: 898170880. Throughput: 0: 46293.3. Samples: 899347600. Policy #0 lag: (min: 3.0, avg: 47.6, max: 95.0) [2024-03-21 04:05:50,522][03784] Avg episode reward: [(0, '0.894')] [2024-03-21 04:05:55,011][04017] Updated weights for policy 0, policy_version 27417 (0.0010) [2024-03-21 04:05:55,521][03784] Fps is (10 sec: 32768.3, 60 sec: 49698.2, 300 sec: 47541.4). Total num frames: 898400256. Throughput: 0: 46515.6. Samples: 899631100. Policy #0 lag: (min: 0.0, avg: 53.2, max: 85.0) [2024-03-21 04:05:55,522][03784] Avg episode reward: [(0, '1.615')] [2024-03-21 04:06:00,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48059.8, 300 sec: 46541.7). Total num frames: 898596864. Throughput: 0: 46340.0. Samples: 899761500. Policy #0 lag: (min: 0.0, avg: 53.2, max: 85.0) [2024-03-21 04:06:00,522][03784] Avg episode reward: [(0, '0.657')] [2024-03-21 04:06:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027423_898596864.pth... [2024-03-21 04:06:00,661][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027082_887422976.pth [2024-03-21 04:06:05,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 898695168. Throughput: 0: 46655.6. Samples: 900067700. 
Policy #0 lag: (min: 0.0, avg: 53.2, max: 85.0) [2024-03-21 04:06:05,522][03784] Avg episode reward: [(0, '0.478')] [2024-03-21 04:06:06,056][04017] Updated weights for policy 0, policy_version 27427 (0.0024) [2024-03-21 04:06:10,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46967.6, 300 sec: 46319.5). Total num frames: 898957312. Throughput: 0: 46011.2. Samples: 900349000. Policy #0 lag: (min: 0.0, avg: 25.1, max: 75.0) [2024-03-21 04:06:10,522][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 04:06:14,354][04017] Updated weights for policy 0, policy_version 27437 (0.0015) [2024-03-21 04:06:15,521][03784] Fps is (10 sec: 42598.1, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 899121152. Throughput: 0: 46519.9. Samples: 900504300. Policy #0 lag: (min: 0.0, avg: 25.1, max: 75.0) [2024-03-21 04:06:15,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 04:06:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 899317760. Throughput: 0: 46969.0. Samples: 900782000. Policy #0 lag: (min: 0.0, avg: 25.1, max: 75.0) [2024-03-21 04:06:20,522][03784] Avg episode reward: [(0, '1.295')] [2024-03-21 04:06:21,780][04017] Updated weights for policy 0, policy_version 27447 (0.0016) [2024-03-21 04:06:25,521][03784] Fps is (10 sec: 52429.4, 60 sec: 48059.8, 300 sec: 46541.7). Total num frames: 899645440. Throughput: 0: 46962.3. Samples: 901049000. Policy #0 lag: (min: 0.0, avg: 25.1, max: 75.0) [2024-03-21 04:06:25,522][03784] Avg episode reward: [(0, '0.854')] [2024-03-21 04:06:26,029][04017] Updated weights for policy 0, policy_version 27457 (0.0013) [2024-03-21 04:06:30,521][03784] Fps is (10 sec: 65535.2, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 899973120. Throughput: 0: 47064.4. Samples: 901180800. 
Policy #0 lag: (min: 0.0, avg: 39.6, max: 102.0) [2024-03-21 04:06:30,522][03784] Avg episode reward: [(0, '0.846')] [2024-03-21 04:06:31,342][04017] Updated weights for policy 0, policy_version 27467 (0.0015) [2024-03-21 04:06:33,315][03995] Signal inference workers to stop experience collection... (18100 times) [2024-03-21 04:06:33,401][04017] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-03-21 04:06:33,544][03995] Signal inference workers to resume experience collection... (18100 times) [2024-03-21 04:06:33,544][04017] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-03-21 04:06:35,521][03784] Fps is (10 sec: 65535.3, 60 sec: 46421.4, 300 sec: 47874.6). Total num frames: 900300800. Throughput: 0: 47013.3. Samples: 901463200. Policy #0 lag: (min: 0.0, avg: 39.6, max: 102.0) [2024-03-21 04:06:35,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 04:06:36,589][04017] Updated weights for policy 0, policy_version 27477 (0.0025) [2024-03-21 04:06:40,521][03784] Fps is (10 sec: 55705.9, 60 sec: 45875.1, 300 sec: 48207.8). Total num frames: 900530176. Throughput: 0: 47228.8. Samples: 901756400. Policy #0 lag: (min: 0.0, avg: 39.6, max: 102.0) [2024-03-21 04:06:40,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 04:06:43,365][04017] Updated weights for policy 0, policy_version 27487 (0.0010) [2024-03-21 04:06:45,521][03784] Fps is (10 sec: 52428.0, 60 sec: 45875.1, 300 sec: 47874.6). Total num frames: 900825088. Throughput: 0: 47684.3. Samples: 901907300. Policy #0 lag: (min: 0.0, avg: 39.6, max: 102.0) [2024-03-21 04:06:45,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 04:06:50,049][04017] Updated weights for policy 0, policy_version 27497 (0.0011) [2024-03-21 04:06:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48059.7, 300 sec: 47652.4). Total num frames: 901054464. Throughput: 0: 47586.6. Samples: 902209100. 
Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 04:06:50,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 04:06:55,521][03784] Fps is (10 sec: 42599.1, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 901251072. Throughput: 0: 47975.5. Samples: 902507900. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 04:06:55,522][03784] Avg episode reward: [(0, '0.831')] [2024-03-21 04:06:57,714][04017] Updated weights for policy 0, policy_version 27507 (0.0011) [2024-03-21 04:07:00,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 901447680. Throughput: 0: 47588.9. Samples: 902645800. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 04:07:00,522][03784] Avg episode reward: [(0, '1.080')] [2024-03-21 04:07:04,530][04017] Updated weights for policy 0, policy_version 27517 (0.0015) [2024-03-21 04:07:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 47319.2). Total num frames: 901775360. Throughput: 0: 48188.9. Samples: 902950500. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 04:07:05,522][03784] Avg episode reward: [(0, '1.080')] [2024-03-21 04:07:10,008][04017] Updated weights for policy 0, policy_version 27527 (0.0012) [2024-03-21 04:07:10,521][03784] Fps is (10 sec: 58982.1, 60 sec: 51336.4, 300 sec: 47208.1). Total num frames: 902037504. Throughput: 0: 48226.4. Samples: 903219200. Policy #0 lag: (min: 0.0, avg: 43.1, max: 86.0) [2024-03-21 04:07:10,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 04:07:15,521][03784] Fps is (10 sec: 39321.7, 60 sec: 50790.5, 300 sec: 46874.9). Total num frames: 902168576. Throughput: 0: 48355.7. Samples: 903356800. Policy #0 lag: (min: 0.0, avg: 43.1, max: 86.0) [2024-03-21 04:07:15,522][03784] Avg episode reward: [(0, '1.314')] [2024-03-21 04:07:20,521][03784] Fps is (10 sec: 22937.6, 60 sec: 49151.9, 300 sec: 46430.6). Total num frames: 902266880. Throughput: 0: 48606.6. Samples: 903650500. 
Policy #0 lag: (min: 0.0, avg: 43.1, max: 86.0) [2024-03-21 04:07:20,522][03784] Avg episode reward: [(0, '0.440')] [2024-03-21 04:07:20,919][04017] Updated weights for policy 0, policy_version 27537 (0.0010) [2024-03-21 04:07:25,521][03784] Fps is (10 sec: 32768.0, 60 sec: 47513.5, 300 sec: 46874.9). Total num frames: 902496256. Throughput: 0: 48489.0. Samples: 903938400. Policy #0 lag: (min: 0.0, avg: 43.1, max: 86.0) [2024-03-21 04:07:25,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 04:07:30,521][03784] Fps is (10 sec: 32768.4, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 902594560. Throughput: 0: 48433.5. Samples: 904086800. Policy #0 lag: (min: 0.0, avg: 26.4, max: 60.0) [2024-03-21 04:07:30,522][03784] Avg episode reward: [(0, '0.961')] [2024-03-21 04:07:31,784][04017] Updated weights for policy 0, policy_version 27547 (0.0015) [2024-03-21 04:07:35,521][03784] Fps is (10 sec: 29491.0, 60 sec: 41506.1, 300 sec: 46986.0). Total num frames: 902791168. Throughput: 0: 48011.1. Samples: 904369600. Policy #0 lag: (min: 0.0, avg: 26.4, max: 60.0) [2024-03-21 04:07:35,522][03784] Avg episode reward: [(0, '1.032')] [2024-03-21 04:07:36,021][03995] Signal inference workers to stop experience collection... (18150 times) [2024-03-21 04:07:36,022][03995] Signal inference workers to resume experience collection... (18150 times) [2024-03-21 04:07:36,115][04017] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-03-21 04:07:36,115][04017] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-03-21 04:07:37,236][04017] Updated weights for policy 0, policy_version 27557 (0.0011) [2024-03-21 04:07:40,521][03784] Fps is (10 sec: 55705.8, 60 sec: 43690.7, 300 sec: 46986.0). Total num frames: 903151616. Throughput: 0: 47962.2. Samples: 904666200. 
Policy #0 lag: (min: 0.0, avg: 26.4, max: 60.0) [2024-03-21 04:07:40,522][03784] Avg episode reward: [(0, '1.032')] [2024-03-21 04:07:42,760][04017] Updated weights for policy 0, policy_version 27567 (0.0010) [2024-03-21 04:07:45,521][03784] Fps is (10 sec: 65536.3, 60 sec: 43690.8, 300 sec: 47319.2). Total num frames: 903446528. Throughput: 0: 47664.5. Samples: 904790700. Policy #0 lag: (min: 0.0, avg: 33.3, max: 70.0) [2024-03-21 04:07:45,522][03784] Avg episode reward: [(0, '1.032')] [2024-03-21 04:07:48,064][04017] Updated weights for policy 0, policy_version 27577 (0.0018) [2024-03-21 04:07:50,521][03784] Fps is (10 sec: 62258.4, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 903774208. Throughput: 0: 47153.2. Samples: 905072400. Policy #0 lag: (min: 0.0, avg: 33.3, max: 70.0) [2024-03-21 04:07:50,522][03784] Avg episode reward: [(0, '1.603')] [2024-03-21 04:07:53,041][04017] Updated weights for policy 0, policy_version 27587 (0.0011) [2024-03-21 04:07:55,521][03784] Fps is (10 sec: 68812.6, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 904134656. Throughput: 0: 46777.9. Samples: 905324200. Policy #0 lag: (min: 0.0, avg: 33.3, max: 70.0) [2024-03-21 04:07:55,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 04:07:56,878][04017] Updated weights for policy 0, policy_version 27597 (0.0014) [2024-03-21 04:08:00,521][03784] Fps is (10 sec: 81920.6, 60 sec: 52428.8, 300 sec: 48541.1). Total num frames: 904593408. Throughput: 0: 46168.8. Samples: 905434400. Policy #0 lag: (min: 0.0, avg: 33.3, max: 70.0) [2024-03-21 04:08:00,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 04:08:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027606_904593408.pth... 
[2024-03-21 04:08:00,683][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027248_892862464.pth [2024-03-21 04:08:01,140][04017] Updated weights for policy 0, policy_version 27607 (0.0017) [2024-03-21 04:08:05,521][03784] Fps is (10 sec: 58982.7, 60 sec: 49152.0, 300 sec: 47985.7). Total num frames: 904724480. Throughput: 0: 46357.9. Samples: 905736600. Policy #0 lag: (min: 1.0, avg: 53.0, max: 79.0) [2024-03-21 04:08:05,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 04:08:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 47513.7, 300 sec: 47319.2). Total num frames: 904888320. Throughput: 0: 46966.6. Samples: 906051900. Policy #0 lag: (min: 1.0, avg: 53.0, max: 79.0) [2024-03-21 04:08:10,522][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 04:08:11,670][04017] Updated weights for policy 0, policy_version 27617 (0.0015) [2024-03-21 04:08:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 905052160. Throughput: 0: 46764.4. Samples: 906191200. Policy #0 lag: (min: 1.0, avg: 53.0, max: 79.0) [2024-03-21 04:08:15,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 04:08:20,521][03784] Fps is (10 sec: 26214.4, 60 sec: 48059.8, 300 sec: 46652.7). Total num frames: 905150464. Throughput: 0: 46520.0. Samples: 906463000. Policy #0 lag: (min: 1.0, avg: 53.0, max: 79.0) [2024-03-21 04:08:20,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 04:08:24,202][04017] Updated weights for policy 0, policy_version 27627 (0.0011) [2024-03-21 04:08:24,965][03995] Signal inference workers to stop experience collection... (18200 times) [2024-03-21 04:08:24,966][03995] Signal inference workers to resume experience collection... 
(18200 times) [2024-03-21 04:08:25,051][04017] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-03-21 04:08:25,051][04017] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-03-21 04:08:25,521][03784] Fps is (10 sec: 32768.2, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 905379840. Throughput: 0: 46333.3. Samples: 906751200. Policy #0 lag: (min: 0.0, avg: 26.8, max: 68.0) [2024-03-21 04:08:25,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 04:08:30,414][04017] Updated weights for policy 0, policy_version 27637 (0.0012) [2024-03-21 04:08:30,521][03784] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 46652.8). Total num frames: 905609216. Throughput: 0: 46806.7. Samples: 906897000. Policy #0 lag: (min: 0.0, avg: 26.8, max: 68.0) [2024-03-21 04:08:30,522][03784] Avg episode reward: [(0, '0.988')] [2024-03-21 04:08:35,521][03784] Fps is (10 sec: 42598.4, 60 sec: 50244.3, 300 sec: 46986.0). Total num frames: 905805824. Throughput: 0: 46524.6. Samples: 907166000. Policy #0 lag: (min: 0.0, avg: 26.8, max: 68.0) [2024-03-21 04:08:35,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 04:08:38,128][04017] Updated weights for policy 0, policy_version 27647 (0.0011) [2024-03-21 04:08:40,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 906067968. Throughput: 0: 46431.1. Samples: 907413600. Policy #0 lag: (min: 0.0, avg: 26.8, max: 68.0) [2024-03-21 04:08:40,522][03784] Avg episode reward: [(0, '1.561')] [2024-03-21 04:08:44,407][04017] Updated weights for policy 0, policy_version 27657 (0.0011) [2024-03-21 04:08:45,521][03784] Fps is (10 sec: 52429.0, 60 sec: 48059.8, 300 sec: 47097.1). Total num frames: 906330112. Throughput: 0: 46815.7. Samples: 907541100. 
Policy #0 lag: (min: 1.0, avg: 26.6, max: 49.0) [2024-03-21 04:08:45,522][03784] Avg episode reward: [(0, '1.144')] [2024-03-21 04:08:50,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.3, 300 sec: 47319.2). Total num frames: 906526720. Throughput: 0: 46264.4. Samples: 907818500. Policy #0 lag: (min: 1.0, avg: 26.6, max: 49.0) [2024-03-21 04:08:50,522][03784] Avg episode reward: [(0, '0.996')] [2024-03-21 04:08:52,774][04017] Updated weights for policy 0, policy_version 27667 (0.0010) [2024-03-21 04:08:55,521][03784] Fps is (10 sec: 36044.7, 60 sec: 42598.5, 300 sec: 46874.9). Total num frames: 906690560. Throughput: 0: 45795.6. Samples: 908112700. Policy #0 lag: (min: 1.0, avg: 26.6, max: 49.0) [2024-03-21 04:08:55,522][03784] Avg episode reward: [(0, '1.547')] [2024-03-21 04:09:00,521][03784] Fps is (10 sec: 36044.7, 60 sec: 38229.3, 300 sec: 46874.9). Total num frames: 906887168. Throughput: 0: 46251.1. Samples: 908272500. Policy #0 lag: (min: 1.0, avg: 26.6, max: 49.0) [2024-03-21 04:09:00,522][03784] Avg episode reward: [(0, '1.547')] [2024-03-21 04:09:00,612][04017] Updated weights for policy 0, policy_version 27677 (0.0023) [2024-03-21 04:09:04,907][04017] Updated weights for policy 0, policy_version 27687 (0.0012) [2024-03-21 04:09:05,521][03784] Fps is (10 sec: 62258.7, 60 sec: 43144.5, 300 sec: 47763.5). Total num frames: 907313152. Throughput: 0: 46304.4. Samples: 908546700. Policy #0 lag: (min: 0.0, avg: 35.3, max: 112.0) [2024-03-21 04:09:05,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 04:09:10,521][03784] Fps is (10 sec: 62258.8, 60 sec: 43690.6, 300 sec: 47763.5). Total num frames: 907509760. Throughput: 0: 46146.6. Samples: 908827800. 
Policy #0 lag: (min: 0.0, avg: 35.3, max: 112.0) [2024-03-21 04:09:10,522][03784] Avg episode reward: [(0, '1.242')] [2024-03-21 04:09:11,084][04017] Updated weights for policy 0, policy_version 27697 (0.0012) [2024-03-21 04:09:15,224][04017] Updated weights for policy 0, policy_version 27707 (0.0010) [2024-03-21 04:09:15,480][03995] Signal inference workers to stop experience collection... (18250 times) [2024-03-21 04:09:15,521][03784] Fps is (10 sec: 58982.7, 60 sec: 47513.6, 300 sec: 47763.5). Total num frames: 907902976. Throughput: 0: 46011.1. Samples: 908967500. Policy #0 lag: (min: 0.0, avg: 35.3, max: 112.0) [2024-03-21 04:09:15,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 04:09:15,546][03995] Signal inference workers to resume experience collection... (18250 times) [2024-03-21 04:09:15,558][04017] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-03-21 04:09:15,601][04017] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-03-21 04:09:18,496][04017] Updated weights for policy 0, policy_version 27717 (0.0020) [2024-03-21 04:09:20,521][03784] Fps is (10 sec: 72089.8, 60 sec: 51336.5, 300 sec: 47430.3). Total num frames: 908230656. Throughput: 0: 46099.9. Samples: 909240500. Policy #0 lag: (min: 0.0, avg: 35.3, max: 112.0) [2024-03-21 04:09:20,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 04:09:25,521][03784] Fps is (10 sec: 42597.6, 60 sec: 49151.9, 300 sec: 46652.7). Total num frames: 908328960. Throughput: 0: 47339.9. Samples: 909543900. Policy #0 lag: (min: 0.0, avg: 35.9, max: 92.0) [2024-03-21 04:09:25,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 04:09:29,142][04017] Updated weights for policy 0, policy_version 27727 (0.0011) [2024-03-21 04:09:30,521][03784] Fps is (10 sec: 39321.6, 60 sec: 50244.2, 300 sec: 47319.2). Total num frames: 908623872. Throughput: 0: 47446.5. Samples: 909676200. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 92.0) [2024-03-21 04:09:30,522][03784] Avg episode reward: [(0, '1.100')] [2024-03-21 04:09:35,521][03784] Fps is (10 sec: 45875.6, 60 sec: 49698.0, 300 sec: 46986.0). Total num frames: 908787712. Throughput: 0: 47899.9. Samples: 909974000. Policy #0 lag: (min: 0.0, avg: 35.9, max: 92.0) [2024-03-21 04:09:35,522][03784] Avg episode reward: [(0, '1.384')] [2024-03-21 04:09:40,521][03784] Fps is (10 sec: 19661.0, 60 sec: 45875.3, 300 sec: 46097.3). Total num frames: 908820480. Throughput: 0: 48295.5. Samples: 910286000. Policy #0 lag: (min: 0.0, avg: 35.9, max: 92.0) [2024-03-21 04:09:40,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 04:09:41,142][04017] Updated weights for policy 0, policy_version 27737 (0.0014) [2024-03-21 04:09:45,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 909049856. Throughput: 0: 47646.6. Samples: 910416600. Policy #0 lag: (min: 0.0, avg: 41.2, max: 82.0) [2024-03-21 04:09:45,522][03784] Avg episode reward: [(0, '0.478')] [2024-03-21 04:09:50,168][04017] Updated weights for policy 0, policy_version 27747 (0.0016) [2024-03-21 04:09:50,521][03784] Fps is (10 sec: 42597.8, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 909246464. Throughput: 0: 48313.3. Samples: 910720800. Policy #0 lag: (min: 0.0, avg: 41.2, max: 82.0) [2024-03-21 04:09:50,522][03784] Avg episode reward: [(0, '1.031')] [2024-03-21 04:09:55,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 909410304. Throughput: 0: 48671.2. Samples: 911018000. Policy #0 lag: (min: 0.0, avg: 41.2, max: 82.0) [2024-03-21 04:09:55,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 04:09:57,081][04017] Updated weights for policy 0, policy_version 27757 (0.0011) [2024-03-21 04:10:00,521][03784] Fps is (10 sec: 58982.2, 60 sec: 49151.9, 300 sec: 46763.8). Total num frames: 909836288. Throughput: 0: 48077.6. Samples: 911131000. 
Policy #0 lag: (min: 6.0, avg: 31.4, max: 63.0) [2024-03-21 04:10:00,522][03784] Avg episode reward: [(0, '0.917')] [2024-03-21 04:10:00,788][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027767_909869056.pth... [2024-03-21 04:10:00,799][04017] Updated weights for policy 0, policy_version 27767 (0.0018) [2024-03-21 04:10:00,920][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027423_898596864.pth [2024-03-21 04:10:05,521][03784] Fps is (10 sec: 75366.3, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 910163968. Throughput: 0: 48566.7. Samples: 911426000. Policy #0 lag: (min: 6.0, avg: 31.4, max: 63.0) [2024-03-21 04:10:05,522][03784] Avg episode reward: [(0, '0.917')] [2024-03-21 04:10:05,623][03995] Signal inference workers to stop experience collection... (18300 times) [2024-03-21 04:10:05,683][03995] Signal inference workers to resume experience collection... (18300 times) [2024-03-21 04:10:05,717][04017] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-03-21 04:10:05,719][04017] Updated weights for policy 0, policy_version 27777 (0.0023) [2024-03-21 04:10:05,759][04017] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-03-21 04:10:09,567][04017] Updated weights for policy 0, policy_version 27787 (0.0024) [2024-03-21 04:10:10,521][03784] Fps is (10 sec: 68813.8, 60 sec: 50244.3, 300 sec: 48096.8). Total num frames: 910524416. Throughput: 0: 47882.4. Samples: 911698600. Policy #0 lag: (min: 6.0, avg: 31.4, max: 63.0) [2024-03-21 04:10:10,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 04:10:15,521][03784] Fps is (10 sec: 52429.3, 60 sec: 46421.4, 300 sec: 47985.7). Total num frames: 910688256. Throughput: 0: 47957.9. Samples: 911834300. 
Policy #0 lag: (min: 6.0, avg: 31.4, max: 63.0) [2024-03-21 04:10:15,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 04:10:18,031][04017] Updated weights for policy 0, policy_version 27797 (0.0011) [2024-03-21 04:10:20,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 48207.8). Total num frames: 910983168. Throughput: 0: 47484.4. Samples: 912110800. Policy #0 lag: (min: 3.0, avg: 39.4, max: 67.0) [2024-03-21 04:10:20,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 04:10:22,966][04017] Updated weights for policy 0, policy_version 27807 (0.0014) [2024-03-21 04:10:25,521][03784] Fps is (10 sec: 58982.3, 60 sec: 49152.2, 300 sec: 47874.6). Total num frames: 911278080. Throughput: 0: 46906.7. Samples: 912396800. Policy #0 lag: (min: 3.0, avg: 39.4, max: 67.0) [2024-03-21 04:10:25,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 04:10:30,481][04017] Updated weights for policy 0, policy_version 27817 (0.0016) [2024-03-21 04:10:30,521][03784] Fps is (10 sec: 52429.2, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 911507456. Throughput: 0: 47635.6. Samples: 912560200. Policy #0 lag: (min: 3.0, avg: 39.4, max: 67.0) [2024-03-21 04:10:30,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 04:10:35,521][03784] Fps is (10 sec: 36044.5, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 911638528. Throughput: 0: 47053.4. Samples: 912838200. Policy #0 lag: (min: 3.0, avg: 39.4, max: 67.0) [2024-03-21 04:10:35,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 04:10:40,521][03784] Fps is (10 sec: 26214.4, 60 sec: 49152.0, 300 sec: 46430.6). Total num frames: 911769600. Throughput: 0: 46904.5. Samples: 913128700. Policy #0 lag: (min: 1.0, avg: 30.7, max: 70.0) [2024-03-21 04:10:40,522][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 04:10:44,466][04017] Updated weights for policy 0, policy_version 27827 (0.0016) [2024-03-21 04:10:45,521][03784] Fps is (10 sec: 29491.1, 60 sec: 48059.7, 300 sec: 46652.7). 
Total num frames: 911933440. Throughput: 0: 47797.8. Samples: 913281900. Policy #0 lag: (min: 1.0, avg: 30.7, max: 70.0) [2024-03-21 04:10:45,522][03784] Avg episode reward: [(0, '1.160')] [2024-03-21 04:10:47,664][04017] Updated weights for policy 0, policy_version 27837 (0.0016) [2024-03-21 04:10:50,521][03784] Fps is (10 sec: 55705.6, 60 sec: 51336.6, 300 sec: 47208.1). Total num frames: 912326656. Throughput: 0: 47217.8. Samples: 913550800. Policy #0 lag: (min: 1.0, avg: 30.7, max: 70.0) [2024-03-21 04:10:50,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 04:10:55,521][03784] Fps is (10 sec: 52429.7, 60 sec: 50790.5, 300 sec: 46986.0). Total num frames: 912457728. Throughput: 0: 47728.9. Samples: 913846400. Policy #0 lag: (min: 0.0, avg: 37.4, max: 89.0) [2024-03-21 04:10:55,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 04:10:55,623][04017] Updated weights for policy 0, policy_version 27847 (0.0011) [2024-03-21 04:11:00,097][03995] Signal inference workers to stop experience collection... (18350 times) [2024-03-21 04:11:00,175][03995] Signal inference workers to resume experience collection... (18350 times) [2024-03-21 04:11:00,181][04017] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-03-21 04:11:00,214][04017] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-03-21 04:11:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 47513.7, 300 sec: 47430.3). Total num frames: 912687104. Throughput: 0: 48146.5. Samples: 914000900. Policy #0 lag: (min: 0.0, avg: 37.4, max: 89.0) [2024-03-21 04:11:00,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 04:11:02,686][04017] Updated weights for policy 0, policy_version 27857 (0.0011) [2024-03-21 04:11:05,521][03784] Fps is (10 sec: 39321.0, 60 sec: 44782.9, 300 sec: 47097.0). Total num frames: 912850944. Throughput: 0: 47902.2. Samples: 914266400. 
Policy #0 lag: (min: 0.0, avg: 37.4, max: 89.0) [2024-03-21 04:11:05,522][03784] Avg episode reward: [(0, '1.313')] [2024-03-21 04:11:08,984][04017] Updated weights for policy 0, policy_version 27867 (0.0026) [2024-03-21 04:11:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 47541.4). Total num frames: 913145856. Throughput: 0: 47464.4. Samples: 914532700. Policy #0 lag: (min: 0.0, avg: 37.4, max: 89.0) [2024-03-21 04:11:10,522][03784] Avg episode reward: [(0, '1.059')] [2024-03-21 04:11:15,316][04017] Updated weights for policy 0, policy_version 27877 (0.0012) [2024-03-21 04:11:15,521][03784] Fps is (10 sec: 62259.2, 60 sec: 46421.2, 300 sec: 47985.7). Total num frames: 913473536. Throughput: 0: 46871.0. Samples: 914669400. Policy #0 lag: (min: 2.0, avg: 45.1, max: 91.0) [2024-03-21 04:11:15,522][03784] Avg episode reward: [(0, '0.766')] [2024-03-21 04:11:20,521][03784] Fps is (10 sec: 58982.6, 60 sec: 45875.3, 300 sec: 47763.5). Total num frames: 913735680. Throughput: 0: 46813.4. Samples: 914944800. Policy #0 lag: (min: 2.0, avg: 45.1, max: 91.0) [2024-03-21 04:11:20,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-21 04:11:21,457][04017] Updated weights for policy 0, policy_version 27887 (0.0014) [2024-03-21 04:11:25,521][03784] Fps is (10 sec: 49152.6, 60 sec: 44782.9, 300 sec: 47430.3). Total num frames: 913965056. Throughput: 0: 46646.7. Samples: 915227800. Policy #0 lag: (min: 2.0, avg: 45.1, max: 91.0) [2024-03-21 04:11:25,522][03784] Avg episode reward: [(0, '1.242')] [2024-03-21 04:11:30,015][04017] Updated weights for policy 0, policy_version 27897 (0.0009) [2024-03-21 04:11:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 914161664. Throughput: 0: 46515.7. Samples: 915375100. 
Policy #0 lag: (min: 2.0, avg: 45.1, max: 91.0) [2024-03-21 04:11:30,522][03784] Avg episode reward: [(0, '1.242')] [2024-03-21 04:11:34,811][04017] Updated weights for policy 0, policy_version 27907 (0.0012) [2024-03-21 04:11:35,521][03784] Fps is (10 sec: 55704.4, 60 sec: 48059.6, 300 sec: 47430.3). Total num frames: 914522112. Throughput: 0: 46530.9. Samples: 915644700. Policy #0 lag: (min: 2.0, avg: 31.6, max: 65.0) [2024-03-21 04:11:35,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 04:11:40,521][03784] Fps is (10 sec: 49152.3, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 914653184. Throughput: 0: 46357.8. Samples: 915932500. Policy #0 lag: (min: 2.0, avg: 31.6, max: 65.0) [2024-03-21 04:11:40,522][03784] Avg episode reward: [(0, '1.471')] [2024-03-21 04:11:42,329][04017] Updated weights for policy 0, policy_version 27917 (0.0012) [2024-03-21 04:11:45,521][03784] Fps is (10 sec: 45876.4, 60 sec: 50790.5, 300 sec: 47208.2). Total num frames: 914980864. Throughput: 0: 45762.4. Samples: 916060200. Policy #0 lag: (min: 2.0, avg: 31.6, max: 65.0) [2024-03-21 04:11:45,522][03784] Avg episode reward: [(0, '1.021')] [2024-03-21 04:11:50,521][03784] Fps is (10 sec: 42597.8, 60 sec: 45875.1, 300 sec: 46874.9). Total num frames: 915079168. Throughput: 0: 46144.5. Samples: 916342900. Policy #0 lag: (min: 2.0, avg: 31.6, max: 65.0) [2024-03-21 04:11:50,522][03784] Avg episode reward: [(0, '1.449')] [2024-03-21 04:11:50,946][04017] Updated weights for policy 0, policy_version 27927 (0.0013) [2024-03-21 04:11:51,720][03995] Signal inference workers to stop experience collection... (18400 times) [2024-03-21 04:11:51,721][03995] Signal inference workers to resume experience collection... 
(18400 times) [2024-03-21 04:11:51,812][04017] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-03-21 04:11:51,812][04017] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-03-21 04:11:55,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48605.9, 300 sec: 47208.2). Total num frames: 915374080. Throughput: 0: 45835.6. Samples: 916595300. Policy #0 lag: (min: 0.0, avg: 40.3, max: 95.0) [2024-03-21 04:11:55,522][03784] Avg episode reward: [(0, '0.582')] [2024-03-21 04:11:57,844][04017] Updated weights for policy 0, policy_version 27937 (0.0012) [2024-03-21 04:12:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 915537920. Throughput: 0: 45840.0. Samples: 916732200. Policy #0 lag: (min: 0.0, avg: 40.3, max: 95.0) [2024-03-21 04:12:00,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 04:12:00,536][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027940_915537920.pth... [2024-03-21 04:12:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027606_904593408.pth [2024-03-21 04:12:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48059.9, 300 sec: 46430.6). Total num frames: 915734528. Throughput: 0: 45426.7. Samples: 916989000. Policy #0 lag: (min: 0.0, avg: 40.3, max: 95.0) [2024-03-21 04:12:05,522][03784] Avg episode reward: [(0, '0.764')] [2024-03-21 04:12:06,769][04017] Updated weights for policy 0, policy_version 27947 (0.0012) [2024-03-21 04:12:10,521][03784] Fps is (10 sec: 32767.8, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 915865600. Throughput: 0: 45177.6. Samples: 917260800. Policy #0 lag: (min: 0.0, avg: 40.3, max: 95.0) [2024-03-21 04:12:10,522][03784] Avg episode reward: [(0, '1.445')] [2024-03-21 04:12:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 43144.6, 300 sec: 46763.8). Total num frames: 916062208. Throughput: 0: 44942.2. Samples: 917397500. 
Policy #0 lag: (min: 0.0, avg: 23.1, max: 52.0) [2024-03-21 04:12:15,522][03784] Avg episode reward: [(0, '0.740')] [2024-03-21 04:12:19,257][04017] Updated weights for policy 0, policy_version 27957 (0.0016) [2024-03-21 04:12:20,521][03784] Fps is (10 sec: 32768.1, 60 sec: 40959.9, 300 sec: 46430.6). Total num frames: 916193280. Throughput: 0: 45909.0. Samples: 917710600. Policy #0 lag: (min: 0.0, avg: 23.1, max: 52.0) [2024-03-21 04:12:20,522][03784] Avg episode reward: [(0, '1.476')] [2024-03-21 04:12:22,769][04017] Updated weights for policy 0, policy_version 27967 (0.0016) [2024-03-21 04:12:25,521][03784] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 47430.3). Total num frames: 916586496. Throughput: 0: 45546.5. Samples: 917982100. Policy #0 lag: (min: 0.0, avg: 23.1, max: 52.0) [2024-03-21 04:12:25,522][03784] Avg episode reward: [(0, '0.778')] [2024-03-21 04:12:29,597][04017] Updated weights for policy 0, policy_version 27977 (0.0031) [2024-03-21 04:12:30,521][03784] Fps is (10 sec: 62259.9, 60 sec: 44236.8, 300 sec: 47541.4). Total num frames: 916815872. Throughput: 0: 45808.8. Samples: 918121600. Policy #0 lag: (min: 0.0, avg: 23.1, max: 52.0) [2024-03-21 04:12:30,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 04:12:34,730][04017] Updated weights for policy 0, policy_version 27987 (0.0011) [2024-03-21 04:12:35,521][03784] Fps is (10 sec: 55706.6, 60 sec: 43690.9, 300 sec: 47430.3). Total num frames: 917143552. Throughput: 0: 45866.9. Samples: 918406900. Policy #0 lag: (min: 2.0, avg: 35.9, max: 78.0) [2024-03-21 04:12:35,521][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 04:12:38,043][04017] Updated weights for policy 0, policy_version 27997 (0.0017) [2024-03-21 04:12:38,473][03995] Signal inference workers to stop experience collection... (18450 times) [2024-03-21 04:12:38,473][03995] Signal inference workers to resume experience collection... 
(18450 times) [2024-03-21 04:12:38,533][04017] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-03-21 04:12:38,533][04017] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-03-21 04:12:40,521][03784] Fps is (10 sec: 72088.8, 60 sec: 48059.6, 300 sec: 47763.5). Total num frames: 917536768. Throughput: 0: 45659.8. Samples: 918650000. Policy #0 lag: (min: 2.0, avg: 35.9, max: 78.0) [2024-03-21 04:12:40,522][03784] Avg episode reward: [(0, '0.861')] [2024-03-21 04:12:44,595][04017] Updated weights for policy 0, policy_version 28007 (0.0016) [2024-03-21 04:12:45,521][03784] Fps is (10 sec: 65534.7, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 917798912. Throughput: 0: 45935.5. Samples: 918799300. Policy #0 lag: (min: 2.0, avg: 35.9, max: 78.0) [2024-03-21 04:12:45,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 04:12:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 917995520. Throughput: 0: 46542.2. Samples: 919083400. Policy #0 lag: (min: 2.0, avg: 35.9, max: 78.0) [2024-03-21 04:12:50,522][03784] Avg episode reward: [(0, '0.929')] [2024-03-21 04:12:51,792][04017] Updated weights for policy 0, policy_version 28017 (0.0010) [2024-03-21 04:12:55,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 918159360. Throughput: 0: 47033.5. Samples: 919377300. Policy #0 lag: (min: 0.0, avg: 45.5, max: 80.0) [2024-03-21 04:12:55,522][03784] Avg episode reward: [(0, '1.491')] [2024-03-21 04:13:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46967.5, 300 sec: 46208.4). Total num frames: 918355968. Throughput: 0: 47159.9. Samples: 919519700. 
Policy #0 lag: (min: 0.0, avg: 45.5, max: 80.0) [2024-03-21 04:13:00,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 04:13:00,887][04017] Updated weights for policy 0, policy_version 28027 (0.0010) [2024-03-21 04:13:05,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48059.7, 300 sec: 46541.7). Total num frames: 918618112. Throughput: 0: 46404.5. Samples: 919798800. Policy #0 lag: (min: 0.0, avg: 45.5, max: 80.0) [2024-03-21 04:13:05,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 04:13:09,490][04017] Updated weights for policy 0, policy_version 28037 (0.0016) [2024-03-21 04:13:10,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 918781952. Throughput: 0: 46673.4. Samples: 920082400. Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 04:13:10,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 04:13:15,521][03784] Fps is (10 sec: 26214.5, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 918880256. Throughput: 0: 46573.3. Samples: 920217400. Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 04:13:15,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 04:13:19,404][04017] Updated weights for policy 0, policy_version 28047 (0.0010) [2024-03-21 04:13:20,521][03784] Fps is (10 sec: 29490.9, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 919076864. Throughput: 0: 47199.7. Samples: 920530900. Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 04:13:20,522][03784] Avg episode reward: [(0, '1.045')] [2024-03-21 04:13:23,656][04017] Updated weights for policy 0, policy_version 28057 (0.0013) [2024-03-21 04:13:25,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48059.8, 300 sec: 46986.0). Total num frames: 919470080. Throughput: 0: 47691.3. Samples: 920796100. 
Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 04:13:25,522][03784] Avg episode reward: [(0, '0.669')] [2024-03-21 04:13:27,833][04017] Updated weights for policy 0, policy_version 28067 (0.0011) [2024-03-21 04:13:30,521][03784] Fps is (10 sec: 75367.0, 60 sec: 50244.2, 300 sec: 47541.4). Total num frames: 919830528. Throughput: 0: 47204.5. Samples: 920923500. Policy #0 lag: (min: 0.0, avg: 39.7, max: 109.0) [2024-03-21 04:13:30,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 04:13:35,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.1, 300 sec: 46874.9). Total num frames: 919896064. Throughput: 0: 47724.4. Samples: 921231000. Policy #0 lag: (min: 0.0, avg: 39.7, max: 109.0) [2024-03-21 04:13:35,522][03784] Avg episode reward: [(0, '0.853')] [2024-03-21 04:13:40,497][04017] Updated weights for policy 0, policy_version 28077 (0.0010) [2024-03-21 04:13:40,521][03784] Fps is (10 sec: 19660.7, 60 sec: 41506.1, 300 sec: 46430.6). Total num frames: 920027136. Throughput: 0: 48599.9. Samples: 921564300. Policy #0 lag: (min: 0.0, avg: 39.7, max: 109.0) [2024-03-21 04:13:40,522][03784] Avg episode reward: [(0, '0.853')] [2024-03-21 04:13:44,074][03995] Signal inference workers to stop experience collection... (18500 times) [2024-03-21 04:13:44,075][03995] Signal inference workers to resume experience collection... (18500 times) [2024-03-21 04:13:44,134][04017] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-03-21 04:13:44,134][04017] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-03-21 04:13:45,521][03784] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 46763.8). Total num frames: 920322048. Throughput: 0: 48564.6. Samples: 921705100. 
Policy #0 lag: (min: 0.0, avg: 39.7, max: 109.0) [2024-03-21 04:13:45,522][03784] Avg episode reward: [(0, '1.176')] [2024-03-21 04:13:45,837][04017] Updated weights for policy 0, policy_version 28087 (0.0015) [2024-03-21 04:13:49,364][04017] Updated weights for policy 0, policy_version 28098 (0.0011) [2024-03-21 04:13:50,521][03784] Fps is (10 sec: 75367.0, 60 sec: 46421.3, 300 sec: 47763.5). Total num frames: 920780800. Throughput: 0: 48095.5. Samples: 921963100. Policy #0 lag: (min: 8.0, avg: 42.9, max: 89.0) [2024-03-21 04:13:50,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 04:13:54,258][04017] Updated weights for policy 0, policy_version 28108 (0.0024) [2024-03-21 04:13:55,521][03784] Fps is (10 sec: 81919.0, 60 sec: 49698.1, 300 sec: 48318.9). Total num frames: 921141248. Throughput: 0: 47771.1. Samples: 922232100. Policy #0 lag: (min: 8.0, avg: 42.9, max: 89.0) [2024-03-21 04:13:55,522][03784] Avg episode reward: [(0, '1.331')] [2024-03-21 04:14:00,521][03784] Fps is (10 sec: 55705.4, 60 sec: 49698.1, 300 sec: 47541.4). Total num frames: 921337856. Throughput: 0: 47906.5. Samples: 922373200. Policy #0 lag: (min: 8.0, avg: 42.9, max: 89.0) [2024-03-21 04:14:00,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 04:14:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028117_921337856.pth... [2024-03-21 04:14:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027767_909869056.pth [2024-03-21 04:14:01,406][04017] Updated weights for policy 0, policy_version 28118 (0.0013) [2024-03-21 04:14:05,521][03784] Fps is (10 sec: 32767.9, 60 sec: 47513.5, 300 sec: 47319.2). Total num frames: 921468928. Throughput: 0: 48146.7. Samples: 922697500. 
Policy #0 lag: (min: 8.0, avg: 42.9, max: 89.0) [2024-03-21 04:14:05,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 04:14:09,026][04017] Updated weights for policy 0, policy_version 28128 (0.0022) [2024-03-21 04:14:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 47097.1). Total num frames: 921796608. Throughput: 0: 48468.8. Samples: 922977200. Policy #0 lag: (min: 3.0, avg: 35.5, max: 80.0) [2024-03-21 04:14:10,522][03784] Avg episode reward: [(0, '0.895')] [2024-03-21 04:14:15,521][03784] Fps is (10 sec: 49152.7, 60 sec: 51336.5, 300 sec: 46541.7). Total num frames: 921960448. Throughput: 0: 49080.1. Samples: 923132100. Policy #0 lag: (min: 3.0, avg: 35.5, max: 80.0) [2024-03-21 04:14:15,522][03784] Avg episode reward: [(0, '0.895')] [2024-03-21 04:14:15,918][04017] Updated weights for policy 0, policy_version 28138 (0.0011) [2024-03-21 04:14:20,521][03784] Fps is (10 sec: 42598.2, 60 sec: 52428.9, 300 sec: 47097.1). Total num frames: 922222592. Throughput: 0: 48340.0. Samples: 923406300. Policy #0 lag: (min: 3.0, avg: 35.5, max: 80.0) [2024-03-21 04:14:20,522][03784] Avg episode reward: [(0, '1.194')] [2024-03-21 04:14:25,521][03784] Fps is (10 sec: 32767.6, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 922288128. Throughput: 0: 47624.5. Samples: 923707400. Policy #0 lag: (min: 3.0, avg: 35.5, max: 80.0) [2024-03-21 04:14:25,522][03784] Avg episode reward: [(0, '0.622')] [2024-03-21 04:14:26,134][04017] Updated weights for policy 0, policy_version 28148 (0.0011) [2024-03-21 04:14:30,521][03784] Fps is (10 sec: 19660.9, 60 sec: 43144.6, 300 sec: 46208.4). Total num frames: 922419200. Throughput: 0: 47402.2. Samples: 923838200. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 04:14:30,522][03784] Avg episode reward: [(0, '1.113')] [2024-03-21 04:14:34,283][03995] Signal inference workers to stop experience collection... 
(18550 times) [2024-03-21 04:14:34,345][03995] Signal inference workers to resume experience collection... (18550 times) [2024-03-21 04:14:34,366][04017] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-03-21 04:14:34,399][04017] Updated weights for policy 0, policy_version 28158 (0.0017) [2024-03-21 04:14:34,417][04017] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-03-21 04:14:35,521][03784] Fps is (10 sec: 45876.0, 60 sec: 47513.7, 300 sec: 47208.1). Total num frames: 922746880. Throughput: 0: 47915.7. Samples: 924119300. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 04:14:35,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 04:14:40,521][03784] Fps is (10 sec: 55705.9, 60 sec: 49152.1, 300 sec: 47208.2). Total num frames: 922976256. Throughput: 0: 47693.5. Samples: 924378300. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 04:14:40,522][03784] Avg episode reward: [(0, '1.081')] [2024-03-21 04:14:40,826][04017] Updated weights for policy 0, policy_version 28168 (0.0014) [2024-03-21 04:14:45,521][03784] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 47541.4). Total num frames: 923271168. Throughput: 0: 47640.1. Samples: 924517000. Policy #0 lag: (min: 0.0, avg: 39.1, max: 94.0) [2024-03-21 04:14:45,522][03784] Avg episode reward: [(0, '0.740')] [2024-03-21 04:14:46,232][04017] Updated weights for policy 0, policy_version 28178 (0.0015) [2024-03-21 04:14:50,521][03784] Fps is (10 sec: 58982.1, 60 sec: 46421.3, 300 sec: 47985.7). Total num frames: 923566080. Throughput: 0: 46891.2. Samples: 924807600. Policy #0 lag: (min: 0.0, avg: 39.1, max: 94.0) [2024-03-21 04:14:50,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 04:14:51,605][04017] Updated weights for policy 0, policy_version 28188 (0.0018) [2024-03-21 04:14:55,521][03784] Fps is (10 sec: 58982.2, 60 sec: 45329.1, 300 sec: 47541.4). Total num frames: 923860992. Throughput: 0: 46888.9. Samples: 925087200. 
Policy #0 lag: (min: 0.0, avg: 39.1, max: 94.0) [2024-03-21 04:14:55,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 04:14:59,768][04017] Updated weights for policy 0, policy_version 28198 (0.0010) [2024-03-21 04:15:00,521][03784] Fps is (10 sec: 49151.7, 60 sec: 45329.1, 300 sec: 47097.1). Total num frames: 924057600. Throughput: 0: 46846.5. Samples: 925240200. Policy #0 lag: (min: 0.0, avg: 39.1, max: 94.0) [2024-03-21 04:15:00,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 04:15:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 924254208. Throughput: 0: 46800.0. Samples: 925512300. Policy #0 lag: (min: 0.0, avg: 39.7, max: 79.0) [2024-03-21 04:15:05,522][03784] Avg episode reward: [(0, '1.512')] [2024-03-21 04:15:06,490][04017] Updated weights for policy 0, policy_version 28208 (0.0010) [2024-03-21 04:15:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 924549120. Throughput: 0: 46517.8. Samples: 925800700. Policy #0 lag: (min: 0.0, avg: 39.7, max: 79.0) [2024-03-21 04:15:10,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 04:15:11,338][04017] Updated weights for policy 0, policy_version 28218 (0.0011) [2024-03-21 04:15:15,521][03784] Fps is (10 sec: 62258.7, 60 sec: 48605.8, 300 sec: 47097.1). Total num frames: 924876800. Throughput: 0: 46377.7. Samples: 925925200. Policy #0 lag: (min: 0.0, avg: 39.7, max: 79.0) [2024-03-21 04:15:15,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 04:15:19,067][04017] Updated weights for policy 0, policy_version 28228 (0.0015) [2024-03-21 04:15:20,521][03784] Fps is (10 sec: 52429.0, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 925073408. Throughput: 0: 45788.8. Samples: 926179800. Policy #0 lag: (min: 0.0, avg: 39.7, max: 79.0) [2024-03-21 04:15:20,522][03784] Avg episode reward: [(0, '1.202')] [2024-03-21 04:15:25,521][03784] Fps is (10 sec: 29491.5, 60 sec: 48059.8, 300 sec: 46319.5). 
Total num frames: 925171712. Throughput: 0: 46088.9. Samples: 926452300. Policy #0 lag: (min: 0.0, avg: 35.4, max: 80.0) [2024-03-21 04:15:25,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 04:15:29,839][03995] Signal inference workers to stop experience collection... (18600 times) [2024-03-21 04:15:29,840][03995] Signal inference workers to resume experience collection... (18600 times) [2024-03-21 04:15:29,920][04017] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-03-21 04:15:29,920][04017] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-03-21 04:15:30,183][04017] Updated weights for policy 0, policy_version 28238 (0.0009) [2024-03-21 04:15:30,521][03784] Fps is (10 sec: 26213.8, 60 sec: 48605.7, 300 sec: 46430.6). Total num frames: 925335552. Throughput: 0: 46217.5. Samples: 926596800. Policy #0 lag: (min: 0.0, avg: 35.4, max: 80.0) [2024-03-21 04:15:30,522][03784] Avg episode reward: [(0, '1.072')] [2024-03-21 04:15:35,041][04017] Updated weights for policy 0, policy_version 28248 (0.0021) [2024-03-21 04:15:35,521][03784] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 925663232. Throughput: 0: 45300.1. Samples: 926846100. Policy #0 lag: (min: 0.0, avg: 35.4, max: 80.0) [2024-03-21 04:15:35,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 04:15:40,521][03784] Fps is (10 sec: 55707.1, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 925892608. Throughput: 0: 45593.3. Samples: 927138900. Policy #0 lag: (min: 0.0, avg: 35.4, max: 80.0) [2024-03-21 04:15:40,522][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 04:15:45,521][03784] Fps is (10 sec: 26214.1, 60 sec: 44236.7, 300 sec: 46097.3). Total num frames: 925925376. Throughput: 0: 45664.5. Samples: 927295100. 
Policy #0 lag: (min: 0.0, avg: 35.4, max: 80.0) [2024-03-21 04:15:45,522][03784] Avg episode reward: [(0, '1.207')] [2024-03-21 04:15:45,796][04017] Updated weights for policy 0, policy_version 28258 (0.0020) [2024-03-21 04:15:50,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43144.6, 300 sec: 46430.6). Total num frames: 926154752. Throughput: 0: 46346.7. Samples: 927597900. Policy #0 lag: (min: 0.0, avg: 28.9, max: 68.0) [2024-03-21 04:15:50,522][03784] Avg episode reward: [(0, '1.207')] [2024-03-21 04:15:55,292][04017] Updated weights for policy 0, policy_version 28268 (0.0012) [2024-03-21 04:15:55,521][03784] Fps is (10 sec: 36045.0, 60 sec: 40413.9, 300 sec: 46097.4). Total num frames: 926285824. Throughput: 0: 45842.3. Samples: 927863600. Policy #0 lag: (min: 0.0, avg: 28.9, max: 68.0) [2024-03-21 04:15:55,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 04:15:59,817][04017] Updated weights for policy 0, policy_version 28278 (0.0012) [2024-03-21 04:16:00,521][03784] Fps is (10 sec: 52428.2, 60 sec: 43690.7, 300 sec: 46874.9). Total num frames: 926679040. Throughput: 0: 46111.1. Samples: 928000200. Policy #0 lag: (min: 0.0, avg: 28.9, max: 68.0) [2024-03-21 04:16:00,522][03784] Avg episode reward: [(0, '0.714')] [2024-03-21 04:16:00,807][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028281_926711808.pth... [2024-03-21 04:16:00,937][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000027940_915537920.pth [2024-03-21 04:16:03,034][04017] Updated weights for policy 0, policy_version 28288 (0.0019) [2024-03-21 04:16:05,521][03784] Fps is (10 sec: 81919.7, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 927105024. Throughput: 0: 45904.4. Samples: 928245500. 
Policy #0 lag: (min: 4.0, avg: 49.3, max: 92.0) [2024-03-21 04:16:05,522][03784] Avg episode reward: [(0, '0.873')] [2024-03-21 04:16:08,493][04017] Updated weights for policy 0, policy_version 28298 (0.0014) [2024-03-21 04:16:10,521][03784] Fps is (10 sec: 68813.8, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 927367168. Throughput: 0: 45980.0. Samples: 928521400. Policy #0 lag: (min: 4.0, avg: 49.3, max: 92.0) [2024-03-21 04:16:10,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 04:16:15,521][03784] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 46874.9). Total num frames: 927563776. Throughput: 0: 46142.4. Samples: 928673200. Policy #0 lag: (min: 4.0, avg: 49.3, max: 92.0) [2024-03-21 04:16:15,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 04:16:15,705][03995] Signal inference workers to stop experience collection... (18650 times) [2024-03-21 04:16:15,773][03995] Signal inference workers to resume experience collection... (18650 times) [2024-03-21 04:16:15,783][04017] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-03-21 04:16:15,785][04017] Updated weights for policy 0, policy_version 28308 (0.0020) [2024-03-21 04:16:15,838][04017] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-03-21 04:16:20,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 927891456. Throughput: 0: 47051.0. Samples: 928963400. Policy #0 lag: (min: 4.0, avg: 49.3, max: 92.0) [2024-03-21 04:16:20,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 04:16:21,053][04017] Updated weights for policy 0, policy_version 28318 (0.0010) [2024-03-21 04:16:25,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 927924224. Throughput: 0: 47562.2. Samples: 929279200. 
Policy #0 lag: (min: 4.0, avg: 49.3, max: 92.0) [2024-03-21 04:16:25,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 04:16:30,442][04017] Updated weights for policy 0, policy_version 28328 (0.0017) [2024-03-21 04:16:30,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48606.0, 300 sec: 46541.7). Total num frames: 928251904. Throughput: 0: 47342.2. Samples: 929425500. Policy #0 lag: (min: 0.0, avg: 42.0, max: 80.0) [2024-03-21 04:16:30,522][03784] Avg episode reward: [(0, '1.419')] [2024-03-21 04:16:35,521][03784] Fps is (10 sec: 62259.4, 60 sec: 48059.7, 300 sec: 47097.0). Total num frames: 928546816. Throughput: 0: 46448.9. Samples: 929688100. Policy #0 lag: (min: 0.0, avg: 42.0, max: 80.0) [2024-03-21 04:16:35,522][03784] Avg episode reward: [(0, '1.419')] [2024-03-21 04:16:36,275][04017] Updated weights for policy 0, policy_version 28338 (0.0015) [2024-03-21 04:16:40,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45875.1, 300 sec: 46319.5). Total num frames: 928645120. Throughput: 0: 46535.4. Samples: 929957700. Policy #0 lag: (min: 0.0, avg: 42.0, max: 80.0) [2024-03-21 04:16:40,522][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 04:16:45,521][03784] Fps is (10 sec: 29491.1, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 928841728. Throughput: 0: 46224.5. Samples: 930080300. Policy #0 lag: (min: 0.0, avg: 42.0, max: 80.0) [2024-03-21 04:16:45,522][03784] Avg episode reward: [(0, '0.514')] [2024-03-21 04:16:47,464][04017] Updated weights for policy 0, policy_version 28348 (0.0011) [2024-03-21 04:16:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 47513.5, 300 sec: 46208.4). Total num frames: 929005568. Throughput: 0: 47995.5. Samples: 930405300. Policy #0 lag: (min: 0.0, avg: 34.3, max: 81.0) [2024-03-21 04:16:50,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 04:16:53,302][04017] Updated weights for policy 0, policy_version 28358 (0.0011) [2024-03-21 04:16:55,521][03784] Fps is (10 sec: 45875.4, 60 sec: 50244.3, 300 sec: 46652.8). 
Total num frames: 929300480. Throughput: 0: 48848.8. Samples: 930719600. Policy #0 lag: (min: 0.0, avg: 34.3, max: 81.0) [2024-03-21 04:16:55,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 04:16:58,738][04017] Updated weights for policy 0, policy_version 28368 (0.0020) [2024-03-21 04:17:00,521][03784] Fps is (10 sec: 65536.0, 60 sec: 49698.1, 300 sec: 47208.1). Total num frames: 929660928. Throughput: 0: 48615.5. Samples: 930860900. Policy #0 lag: (min: 0.0, avg: 34.3, max: 81.0) [2024-03-21 04:17:00,523][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 04:17:05,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 47319.2). Total num frames: 929824768. Throughput: 0: 48511.2. Samples: 931146400. Policy #0 lag: (min: 0.0, avg: 34.3, max: 81.0) [2024-03-21 04:17:05,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 04:17:08,574][04017] Updated weights for policy 0, policy_version 28378 (0.0014) [2024-03-21 04:17:10,521][03784] Fps is (10 sec: 32768.2, 60 sec: 43690.6, 300 sec: 47208.1). Total num frames: 929988608. Throughput: 0: 47800.0. Samples: 931430200. Policy #0 lag: (min: 0.0, avg: 35.6, max: 87.0) [2024-03-21 04:17:10,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 04:17:13,595][03995] Signal inference workers to stop experience collection... (18700 times) [2024-03-21 04:17:13,644][04017] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-03-21 04:17:13,670][03995] Signal inference workers to resume experience collection... (18700 times) [2024-03-21 04:17:13,688][04017] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-03-21 04:17:14,374][04017] Updated weights for policy 0, policy_version 28388 (0.0020) [2024-03-21 04:17:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 47652.5). Total num frames: 930250752. Throughput: 0: 47762.3. Samples: 931574800. 
Policy #0 lag: (min: 0.0, avg: 35.6, max: 87.0) [2024-03-21 04:17:15,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 04:17:18,350][04017] Updated weights for policy 0, policy_version 28398 (0.0009) [2024-03-21 04:17:20,521][03784] Fps is (10 sec: 75366.7, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 930742272. Throughput: 0: 47611.1. Samples: 931830600. Policy #0 lag: (min: 0.0, avg: 35.6, max: 87.0) [2024-03-21 04:17:20,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 04:17:23,970][04017] Updated weights for policy 0, policy_version 28408 (0.0014) [2024-03-21 04:17:25,521][03784] Fps is (10 sec: 75366.0, 60 sec: 51336.5, 300 sec: 48096.7). Total num frames: 931004416. Throughput: 0: 48095.6. Samples: 932122000. Policy #0 lag: (min: 0.0, avg: 35.6, max: 87.0) [2024-03-21 04:17:25,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 04:17:29,870][04017] Updated weights for policy 0, policy_version 28418 (0.0015) [2024-03-21 04:17:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 47652.4). Total num frames: 931201024. Throughput: 0: 48602.3. Samples: 932267400. Policy #0 lag: (min: 0.0, avg: 43.9, max: 79.0) [2024-03-21 04:17:30,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 04:17:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 931364864. Throughput: 0: 48215.6. Samples: 932575000. Policy #0 lag: (min: 0.0, avg: 43.9, max: 79.0) [2024-03-21 04:17:35,522][03784] Avg episode reward: [(0, '0.610')] [2024-03-21 04:17:36,936][04017] Updated weights for policy 0, policy_version 28428 (0.0013) [2024-03-21 04:17:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 931561472. Throughput: 0: 47513.3. Samples: 932857700. Policy #0 lag: (min: 0.0, avg: 43.9, max: 79.0) [2024-03-21 04:17:40,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 04:17:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48605.8, 300 sec: 46652.7). 
Total num frames: 931758080. Throughput: 0: 47680.0. Samples: 933006500. Policy #0 lag: (min: 0.0, avg: 43.9, max: 79.0) [2024-03-21 04:17:45,522][03784] Avg episode reward: [(0, '1.113')] [2024-03-21 04:17:47,483][04017] Updated weights for policy 0, policy_version 28438 (0.0015) [2024-03-21 04:17:50,521][03784] Fps is (10 sec: 45875.0, 60 sec: 50244.3, 300 sec: 46986.0). Total num frames: 932020224. Throughput: 0: 47297.7. Samples: 933274800. Policy #0 lag: (min: 0.0, avg: 38.2, max: 89.0) [2024-03-21 04:17:50,522][03784] Avg episode reward: [(0, '1.113')] [2024-03-21 04:17:52,304][04017] Updated weights for policy 0, policy_version 28448 (0.0011) [2024-03-21 04:17:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 932184064. Throughput: 0: 46615.5. Samples: 933527900. Policy #0 lag: (min: 0.0, avg: 38.2, max: 89.0) [2024-03-21 04:17:55,522][03784] Avg episode reward: [(0, '0.543')] [2024-03-21 04:18:00,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 932347904. Throughput: 0: 46131.1. Samples: 933650700. Policy #0 lag: (min: 0.0, avg: 38.2, max: 89.0) [2024-03-21 04:18:00,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 04:18:00,573][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028454_932380672.pth... [2024-03-21 04:18:00,688][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028117_921337856.pth [2024-03-21 04:18:01,570][03995] Signal inference workers to stop experience collection... (18750 times) [2024-03-21 04:18:01,571][03995] Signal inference workers to resume experience collection... 
(18750 times) [2024-03-21 04:18:01,652][04017] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-03-21 04:18:01,653][04017] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-03-21 04:18:04,677][04017] Updated weights for policy 0, policy_version 28458 (0.0013) [2024-03-21 04:18:05,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 932610048. Throughput: 0: 47168.8. Samples: 933953200. Policy #0 lag: (min: 2.0, avg: 33.6, max: 76.0) [2024-03-21 04:18:05,522][03784] Avg episode reward: [(0, '0.760')] [2024-03-21 04:18:10,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 932806656. Throughput: 0: 46628.9. Samples: 934220300. Policy #0 lag: (min: 2.0, avg: 33.6, max: 76.0) [2024-03-21 04:18:10,522][03784] Avg episode reward: [(0, '1.420')] [2024-03-21 04:18:10,671][04017] Updated weights for policy 0, policy_version 28468 (0.0011) [2024-03-21 04:18:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 933068800. Throughput: 0: 46533.3. Samples: 934361400. Policy #0 lag: (min: 2.0, avg: 33.6, max: 76.0) [2024-03-21 04:18:15,522][03784] Avg episode reward: [(0, '0.827')] [2024-03-21 04:18:17,254][04017] Updated weights for policy 0, policy_version 28478 (0.0011) [2024-03-21 04:18:20,521][03784] Fps is (10 sec: 58982.1, 60 sec: 44236.7, 300 sec: 47208.1). Total num frames: 933396480. Throughput: 0: 46428.9. Samples: 934664300. Policy #0 lag: (min: 2.0, avg: 33.6, max: 76.0) [2024-03-21 04:18:20,522][03784] Avg episode reward: [(0, '0.827')] [2024-03-21 04:18:22,180][04017] Updated weights for policy 0, policy_version 28488 (0.0018) [2024-03-21 04:18:25,521][03784] Fps is (10 sec: 55706.0, 60 sec: 43690.7, 300 sec: 46763.8). Total num frames: 933625856. Throughput: 0: 46188.9. Samples: 934936200. 
Policy #0 lag: (min: 2.0, avg: 33.6, max: 76.0) [2024-03-21 04:18:25,522][03784] Avg episode reward: [(0, '1.421')] [2024-03-21 04:18:27,695][04017] Updated weights for policy 0, policy_version 28498 (0.0012) [2024-03-21 04:18:30,521][03784] Fps is (10 sec: 49152.4, 60 sec: 44782.9, 300 sec: 47430.3). Total num frames: 933888000. Throughput: 0: 45880.1. Samples: 935071100. Policy #0 lag: (min: 0.0, avg: 34.6, max: 68.0) [2024-03-21 04:18:30,522][03784] Avg episode reward: [(0, '1.283')] [2024-03-21 04:18:35,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45329.1, 300 sec: 47652.5). Total num frames: 934084608. Throughput: 0: 46273.3. Samples: 935357100. Policy #0 lag: (min: 0.0, avg: 34.6, max: 68.0) [2024-03-21 04:18:35,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 04:18:38,197][04017] Updated weights for policy 0, policy_version 28508 (0.0010) [2024-03-21 04:18:40,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.5, 300 sec: 47652.5). Total num frames: 934379520. Throughput: 0: 46700.2. Samples: 935629400. Policy #0 lag: (min: 0.0, avg: 34.6, max: 68.0) [2024-03-21 04:18:40,521][03784] Avg episode reward: [(0, '1.038')] [2024-03-21 04:18:41,601][04017] Updated weights for policy 0, policy_version 28518 (0.0023) [2024-03-21 04:18:45,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48059.8, 300 sec: 46986.0). Total num frames: 934641664. Throughput: 0: 46888.9. Samples: 935760700. Policy #0 lag: (min: 0.0, avg: 34.6, max: 68.0) [2024-03-21 04:18:45,522][03784] Avg episode reward: [(0, '1.096')] [2024-03-21 04:18:49,382][04017] Updated weights for policy 0, policy_version 28528 (0.0014) [2024-03-21 04:18:50,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 934805504. Throughput: 0: 46315.6. Samples: 936037400. Policy #0 lag: (min: 0.0, avg: 35.2, max: 73.0) [2024-03-21 04:18:50,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 04:18:55,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.7, 300 sec: 46430.6). 
Total num frames: 935034880. Throughput: 0: 46857.8. Samples: 936328900. Policy #0 lag: (min: 0.0, avg: 35.2, max: 73.0) [2024-03-21 04:18:55,522][03784] Avg episode reward: [(0, '1.027')] [2024-03-21 04:18:56,099][03995] Signal inference workers to stop experience collection... (18800 times) [2024-03-21 04:18:56,108][03995] Signal inference workers to resume experience collection... (18800 times) [2024-03-21 04:18:56,187][04017] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-03-21 04:18:56,187][04017] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-03-21 04:18:56,764][04017] Updated weights for policy 0, policy_version 28538 (0.0012) [2024-03-21 04:19:00,521][03784] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 46874.9). Total num frames: 935297024. Throughput: 0: 46924.6. Samples: 936473000. Policy #0 lag: (min: 0.0, avg: 35.2, max: 73.0) [2024-03-21 04:19:00,522][03784] Avg episode reward: [(0, '1.027')] [2024-03-21 04:19:03,009][04017] Updated weights for policy 0, policy_version 28548 (0.0015) [2024-03-21 04:19:05,521][03784] Fps is (10 sec: 49151.6, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 935526400. Throughput: 0: 46242.2. Samples: 936745200. Policy #0 lag: (min: 0.0, avg: 35.2, max: 73.0) [2024-03-21 04:19:05,522][03784] Avg episode reward: [(0, '0.970')] [2024-03-21 04:19:10,521][03784] Fps is (10 sec: 42597.6, 60 sec: 48605.8, 300 sec: 46652.7). Total num frames: 935723008. Throughput: 0: 46626.6. Samples: 937034400. Policy #0 lag: (min: 0.0, avg: 30.6, max: 72.0) [2024-03-21 04:19:10,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 04:19:11,816][04017] Updated weights for policy 0, policy_version 28558 (0.0013) [2024-03-21 04:19:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 46652.7). Total num frames: 935985152. Throughput: 0: 46935.5. Samples: 937183200. 
Policy #0 lag: (min: 0.0, avg: 30.6, max: 72.0) [2024-03-21 04:19:15,522][03784] Avg episode reward: [(0, '1.010')] [2024-03-21 04:19:17,713][04017] Updated weights for policy 0, policy_version 28568 (0.0010) [2024-03-21 04:19:20,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 936181760. Throughput: 0: 46726.8. Samples: 937459800. Policy #0 lag: (min: 0.0, avg: 30.6, max: 72.0) [2024-03-21 04:19:20,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 04:19:25,272][04017] Updated weights for policy 0, policy_version 28578 (0.0010) [2024-03-21 04:19:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 936443904. Throughput: 0: 46995.4. Samples: 937744200. Policy #0 lag: (min: 1.0, avg: 41.8, max: 76.0) [2024-03-21 04:19:25,522][03784] Avg episode reward: [(0, '1.459')] [2024-03-21 04:19:30,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 936673280. Throughput: 0: 47526.6. Samples: 937899400. Policy #0 lag: (min: 1.0, avg: 41.8, max: 76.0) [2024-03-21 04:19:30,522][03784] Avg episode reward: [(0, '1.153')] [2024-03-21 04:19:33,501][04017] Updated weights for policy 0, policy_version 28588 (0.0015) [2024-03-21 04:19:35,521][03784] Fps is (10 sec: 36045.3, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 936804352. Throughput: 0: 47748.9. Samples: 938186100. Policy #0 lag: (min: 1.0, avg: 41.8, max: 76.0) [2024-03-21 04:19:35,522][03784] Avg episode reward: [(0, '1.188')] [2024-03-21 04:19:40,521][03784] Fps is (10 sec: 39322.3, 60 sec: 44783.0, 300 sec: 46763.8). Total num frames: 937066496. Throughput: 0: 47746.8. Samples: 938477500. Policy #0 lag: (min: 1.0, avg: 41.8, max: 76.0) [2024-03-21 04:19:40,522][03784] Avg episode reward: [(0, '0.565')] [2024-03-21 04:19:40,877][04017] Updated weights for policy 0, policy_version 28598 (0.0011) [2024-03-21 04:19:45,521][03784] Fps is (10 sec: 58982.0, 60 sec: 45875.2, 300 sec: 46874.9). 
Total num frames: 937394176. Throughput: 0: 47566.5. Samples: 938613500. Policy #0 lag: (min: 0.0, avg: 39.2, max: 78.0) [2024-03-21 04:19:45,522][03784] Avg episode reward: [(0, '0.565')] [2024-03-21 04:19:46,011][04017] Updated weights for policy 0, policy_version 28609 (0.0011) [2024-03-21 04:19:50,521][03784] Fps is (10 sec: 65534.6, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 937721856. Throughput: 0: 47720.0. Samples: 938892600. Policy #0 lag: (min: 0.0, avg: 39.2, max: 78.0) [2024-03-21 04:19:50,522][03784] Avg episode reward: [(0, '0.467')] [2024-03-21 04:19:51,797][04017] Updated weights for policy 0, policy_version 28619 (0.0012) [2024-03-21 04:19:52,141][03995] Signal inference workers to stop experience collection... (18850 times) [2024-03-21 04:19:52,217][04017] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-03-21 04:19:52,425][03995] Signal inference workers to resume experience collection... (18850 times) [2024-03-21 04:19:52,425][04017] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-03-21 04:19:55,521][03784] Fps is (10 sec: 62259.1, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 938016768. Throughput: 0: 47515.6. Samples: 939172600. Policy #0 lag: (min: 0.0, avg: 39.2, max: 78.0) [2024-03-21 04:19:55,522][03784] Avg episode reward: [(0, '0.684')] [2024-03-21 04:19:58,596][04017] Updated weights for policy 0, policy_version 28629 (0.0020) [2024-03-21 04:20:00,521][03784] Fps is (10 sec: 55705.8, 60 sec: 49698.0, 300 sec: 47541.4). Total num frames: 938278912. Throughput: 0: 47553.4. Samples: 939323100. Policy #0 lag: (min: 0.0, avg: 39.2, max: 78.0) [2024-03-21 04:20:00,522][03784] Avg episode reward: [(0, '1.071')] [2024-03-21 04:20:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028634_938278912.pth... 
[2024-03-21 04:20:00,660][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028281_926711808.pth [2024-03-21 04:20:03,910][04017] Updated weights for policy 0, policy_version 28639 (0.0017) [2024-03-21 04:20:05,521][03784] Fps is (10 sec: 52429.0, 60 sec: 50244.3, 300 sec: 47430.3). Total num frames: 938541056. Throughput: 0: 47840.0. Samples: 939612600. Policy #0 lag: (min: 0.0, avg: 47.1, max: 117.0) [2024-03-21 04:20:05,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 04:20:10,521][03784] Fps is (10 sec: 39322.1, 60 sec: 49152.1, 300 sec: 46763.9). Total num frames: 938672128. Throughput: 0: 48051.3. Samples: 939906500. Policy #0 lag: (min: 0.0, avg: 47.1, max: 117.0) [2024-03-21 04:20:10,522][03784] Avg episode reward: [(0, '0.577')] [2024-03-21 04:20:13,830][04017] Updated weights for policy 0, policy_version 28649 (0.0009) [2024-03-21 04:20:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 47513.7, 300 sec: 46652.8). Total num frames: 938835968. Throughput: 0: 47746.7. Samples: 940048000. Policy #0 lag: (min: 0.0, avg: 47.1, max: 117.0) [2024-03-21 04:20:15,522][03784] Avg episode reward: [(0, '1.444')] [2024-03-21 04:20:20,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45875.2, 300 sec: 46652.8). Total num frames: 938934272. Throughput: 0: 48037.8. Samples: 940347800. Policy #0 lag: (min: 0.0, avg: 47.1, max: 117.0) [2024-03-21 04:20:20,522][03784] Avg episode reward: [(0, '0.676')] [2024-03-21 04:20:24,842][04017] Updated weights for policy 0, policy_version 28659 (0.0010) [2024-03-21 04:20:25,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44236.9, 300 sec: 46652.8). Total num frames: 939098112. Throughput: 0: 48244.3. Samples: 940648500. 
Policy #0 lag: (min: 0.0, avg: 28.8, max: 70.0) [2024-03-21 04:20:25,522][03784] Avg episode reward: [(0, '0.908')] [2024-03-21 04:20:29,837][04017] Updated weights for policy 0, policy_version 28669 (0.0012) [2024-03-21 04:20:30,521][03784] Fps is (10 sec: 55704.1, 60 sec: 46967.3, 300 sec: 46874.9). Total num frames: 939491328. Throughput: 0: 48410.9. Samples: 940792000. Policy #0 lag: (min: 0.0, avg: 28.8, max: 70.0) [2024-03-21 04:20:30,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 04:20:35,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48059.7, 300 sec: 46763.8). Total num frames: 939687936. Throughput: 0: 48302.3. Samples: 941066200. Policy #0 lag: (min: 0.0, avg: 28.8, max: 70.0) [2024-03-21 04:20:35,522][03784] Avg episode reward: [(0, '1.258')] [2024-03-21 04:20:35,987][04017] Updated weights for policy 0, policy_version 28679 (0.0019) [2024-03-21 04:20:39,568][04017] Updated weights for policy 0, policy_version 28689 (0.0013) [2024-03-21 04:20:40,521][03784] Fps is (10 sec: 65536.7, 60 sec: 51336.3, 300 sec: 48207.8). Total num frames: 940146688. Throughput: 0: 47948.8. Samples: 941330300. Policy #0 lag: (min: 0.0, avg: 28.8, max: 70.0) [2024-03-21 04:20:40,522][03784] Avg episode reward: [(0, '1.151')] [2024-03-21 04:20:40,985][03995] Signal inference workers to stop experience collection... (18900 times) [2024-03-21 04:20:41,059][03995] Signal inference workers to resume experience collection... (18900 times) [2024-03-21 04:20:41,079][04017] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-03-21 04:20:41,120][04017] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-03-21 04:20:43,441][04017] Updated weights for policy 0, policy_version 28699 (0.0009) [2024-03-21 04:20:45,521][03784] Fps is (10 sec: 75366.3, 60 sec: 50790.4, 300 sec: 48430.0). Total num frames: 940441600. Throughput: 0: 47213.3. Samples: 941447700. 
Policy #0 lag: (min: 1.0, avg: 65.0, max: 128.0) [2024-03-21 04:20:45,522][03784] Avg episode reward: [(0, '0.774')] [2024-03-21 04:20:50,521][03784] Fps is (10 sec: 32768.2, 60 sec: 45875.2, 300 sec: 48096.7). Total num frames: 940474368. Throughput: 0: 47871.1. Samples: 941766800. Policy #0 lag: (min: 1.0, avg: 65.0, max: 128.0) [2024-03-21 04:20:50,522][03784] Avg episode reward: [(0, '1.255')] [2024-03-21 04:20:53,578][04017] Updated weights for policy 0, policy_version 28709 (0.0015) [2024-03-21 04:20:55,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45875.2, 300 sec: 47763.5). Total num frames: 940769280. Throughput: 0: 47579.9. Samples: 942047600. Policy #0 lag: (min: 1.0, avg: 65.0, max: 128.0) [2024-03-21 04:20:55,522][03784] Avg episode reward: [(0, '0.920')] [2024-03-21 04:21:00,521][03784] Fps is (10 sec: 49151.5, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 940965888. Throughput: 0: 47584.3. Samples: 942189300. Policy #0 lag: (min: 1.0, avg: 65.0, max: 128.0) [2024-03-21 04:21:00,522][03784] Avg episode reward: [(0, '0.900')] [2024-03-21 04:21:01,533][04017] Updated weights for policy 0, policy_version 28719 (0.0010) [2024-03-21 04:21:05,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43144.5, 300 sec: 46652.7). Total num frames: 941129728. Throughput: 0: 47299.9. Samples: 942476300. Policy #0 lag: (min: 0.0, avg: 44.4, max: 112.0) [2024-03-21 04:21:05,522][03784] Avg episode reward: [(0, '0.670')] [2024-03-21 04:21:10,521][03784] Fps is (10 sec: 39322.2, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 941359104. Throughput: 0: 46357.8. Samples: 942734600. Policy #0 lag: (min: 0.0, avg: 44.4, max: 112.0) [2024-03-21 04:21:10,522][03784] Avg episode reward: [(0, '1.067')] [2024-03-21 04:21:11,269][04017] Updated weights for policy 0, policy_version 28729 (0.0012) [2024-03-21 04:21:15,521][03784] Fps is (10 sec: 45875.7, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 941588480. Throughput: 0: 45693.6. Samples: 942848200. 
Policy #0 lag: (min: 0.0, avg: 44.4, max: 112.0) [2024-03-21 04:21:15,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 04:21:17,632][04017] Updated weights for policy 0, policy_version 28739 (0.0018) [2024-03-21 04:21:20,521][03784] Fps is (10 sec: 58982.3, 60 sec: 50244.2, 300 sec: 47541.4). Total num frames: 941948928. Throughput: 0: 45488.9. Samples: 943113200. Policy #0 lag: (min: 0.0, avg: 44.4, max: 112.0) [2024-03-21 04:21:20,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 04:21:22,132][04017] Updated weights for policy 0, policy_version 28749 (0.0014) [2024-03-21 04:21:25,521][03784] Fps is (10 sec: 68812.5, 60 sec: 52974.9, 300 sec: 47541.4). Total num frames: 942276608. Throughput: 0: 45784.6. Samples: 943390600. Policy #0 lag: (min: 2.0, avg: 32.8, max: 72.0) [2024-03-21 04:21:25,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 04:21:26,885][04017] Updated weights for policy 0, policy_version 28759 (0.0011) [2024-03-21 04:21:30,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.3, 300 sec: 47208.1). Total num frames: 942473216. Throughput: 0: 46344.5. Samples: 943533200. Policy #0 lag: (min: 2.0, avg: 32.8, max: 72.0) [2024-03-21 04:21:30,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 04:21:35,521][03784] Fps is (10 sec: 39321.6, 60 sec: 49698.2, 300 sec: 47541.4). Total num frames: 942669824. Throughput: 0: 45451.2. Samples: 943812100. Policy #0 lag: (min: 2.0, avg: 32.8, max: 72.0) [2024-03-21 04:21:35,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-21 04:21:37,165][04017] Updated weights for policy 0, policy_version 28769 (0.0015) [2024-03-21 04:21:40,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.7, 300 sec: 47208.1). Total num frames: 942768128. Throughput: 0: 45800.0. Samples: 944108600. Policy #0 lag: (min: 2.0, avg: 32.8, max: 72.0) [2024-03-21 04:21:40,522][03784] Avg episode reward: [(0, '1.144')] [2024-03-21 04:21:43,183][03995] Signal inference workers to stop experience collection... 
(18950 times) [2024-03-21 04:21:43,195][03995] Signal inference workers to resume experience collection... (18950 times) [2024-03-21 04:21:43,239][04017] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-03-21 04:21:43,286][04017] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-03-21 04:21:45,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42598.5, 300 sec: 47430.3). Total num frames: 942997504. Throughput: 0: 45800.2. Samples: 944250300. Policy #0 lag: (min: 0.0, avg: 34.8, max: 75.0) [2024-03-21 04:21:45,522][03784] Avg episode reward: [(0, '0.851')] [2024-03-21 04:21:45,564][04017] Updated weights for policy 0, policy_version 28779 (0.0012) [2024-03-21 04:21:50,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 46763.8). Total num frames: 943095808. Throughput: 0: 45760.0. Samples: 944535500. Policy #0 lag: (min: 0.0, avg: 34.8, max: 75.0) [2024-03-21 04:21:50,522][03784] Avg episode reward: [(0, '1.541')] [2024-03-21 04:21:55,020][04017] Updated weights for policy 0, policy_version 28789 (0.0017) [2024-03-21 04:21:55,521][03784] Fps is (10 sec: 36044.5, 60 sec: 43144.5, 300 sec: 46430.6). Total num frames: 943357952. Throughput: 0: 46435.5. Samples: 944824200. Policy #0 lag: (min: 0.0, avg: 34.8, max: 75.0) [2024-03-21 04:21:55,522][03784] Avg episode reward: [(0, '1.089')] [2024-03-21 04:22:00,521][03784] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 46541.6). Total num frames: 943554560. Throughput: 0: 47042.0. Samples: 944965100. Policy #0 lag: (min: 0.0, avg: 34.8, max: 75.0) [2024-03-21 04:22:00,522][03784] Avg episode reward: [(0, '1.161')] [2024-03-21 04:22:00,653][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028796_943587328.pth... 
[2024-03-21 04:22:00,789][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028454_932380672.pth [2024-03-21 04:22:03,261][04017] Updated weights for policy 0, policy_version 28799 (0.0019) [2024-03-21 04:22:05,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 943849472. Throughput: 0: 46633.3. Samples: 945211700. Policy #0 lag: (min: 0.0, avg: 30.0, max: 77.0) [2024-03-21 04:22:05,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 04:22:07,491][04017] Updated weights for policy 0, policy_version 28809 (0.0018) [2024-03-21 04:22:10,521][03784] Fps is (10 sec: 75367.4, 60 sec: 49152.0, 300 sec: 47652.4). Total num frames: 944308224. Throughput: 0: 46246.6. Samples: 945471700. Policy #0 lag: (min: 0.0, avg: 30.0, max: 77.0) [2024-03-21 04:22:10,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 04:22:10,702][04017] Updated weights for policy 0, policy_version 28819 (0.0024) [2024-03-21 04:22:15,348][04017] Updated weights for policy 0, policy_version 28829 (0.0019) [2024-03-21 04:22:15,521][03784] Fps is (10 sec: 81920.5, 60 sec: 51336.5, 300 sec: 47208.1). Total num frames: 944668672. Throughput: 0: 46275.6. Samples: 945615600. Policy #0 lag: (min: 0.0, avg: 30.0, max: 77.0) [2024-03-21 04:22:15,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 04:22:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 944766976. Throughput: 0: 46513.3. Samples: 945905200. Policy #0 lag: (min: 0.0, avg: 30.0, max: 77.0) [2024-03-21 04:22:20,522][03784] Avg episode reward: [(0, '1.116')] [2024-03-21 04:22:24,159][04017] Updated weights for policy 0, policy_version 28839 (0.0018) [2024-03-21 04:22:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 945094656. Throughput: 0: 45764.4. Samples: 946168000. 
Policy #0 lag: (min: 0.0, avg: 42.1, max: 99.0) [2024-03-21 04:22:25,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 04:22:30,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 945225728. Throughput: 0: 45864.4. Samples: 946314200. Policy #0 lag: (min: 0.0, avg: 42.1, max: 99.0) [2024-03-21 04:22:30,522][03784] Avg episode reward: [(0, '0.715')] [2024-03-21 04:22:31,178][03995] Signal inference workers to stop experience collection... (19000 times) [2024-03-21 04:22:31,178][03995] Signal inference workers to resume experience collection... (19000 times) [2024-03-21 04:22:31,219][04017] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-03-21 04:22:31,219][04017] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-03-21 04:22:34,745][04017] Updated weights for policy 0, policy_version 28849 (0.0012) [2024-03-21 04:22:35,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 945356800. Throughput: 0: 46426.6. Samples: 946624700. Policy #0 lag: (min: 0.0, avg: 42.1, max: 99.0) [2024-03-21 04:22:35,522][03784] Avg episode reward: [(0, '1.444')] [2024-03-21 04:22:40,521][03784] Fps is (10 sec: 22937.6, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 945455104. Throughput: 0: 46722.2. Samples: 946926700. Policy #0 lag: (min: 0.0, avg: 42.1, max: 99.0) [2024-03-21 04:22:40,522][03784] Avg episode reward: [(0, '1.569')] [2024-03-21 04:22:43,837][04017] Updated weights for policy 0, policy_version 28859 (0.0012) [2024-03-21 04:22:45,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.7, 300 sec: 46208.4). Total num frames: 945651712. Throughput: 0: 46664.5. Samples: 947065000. Policy #0 lag: (min: 0.0, avg: 30.6, max: 81.0) [2024-03-21 04:22:45,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 04:22:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 47513.5, 300 sec: 46652.7). Total num frames: 945946624. Throughput: 0: 47580.0. 
Samples: 947352800. Policy #0 lag: (min: 0.0, avg: 30.6, max: 81.0) [2024-03-21 04:22:50,522][03784] Avg episode reward: [(0, '1.041')] [2024-03-21 04:22:50,884][04017] Updated weights for policy 0, policy_version 28869 (0.0011) [2024-03-21 04:22:55,466][04017] Updated weights for policy 0, policy_version 28879 (0.0010) [2024-03-21 04:22:55,521][03784] Fps is (10 sec: 65535.7, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 946307072. Throughput: 0: 47986.6. Samples: 947631100. Policy #0 lag: (min: 0.0, avg: 30.6, max: 81.0) [2024-03-21 04:22:55,522][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 04:23:00,001][04017] Updated weights for policy 0, policy_version 28889 (0.0011) [2024-03-21 04:23:00,521][03784] Fps is (10 sec: 72090.8, 60 sec: 51882.9, 300 sec: 47652.5). Total num frames: 946667520. Throughput: 0: 48026.8. Samples: 947776800. Policy #0 lag: (min: 0.0, avg: 30.6, max: 81.0) [2024-03-21 04:23:00,521][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 04:23:05,341][04017] Updated weights for policy 0, policy_version 28899 (0.0010) [2024-03-21 04:23:05,521][03784] Fps is (10 sec: 65536.4, 60 sec: 51882.7, 300 sec: 47985.7). Total num frames: 946962432. Throughput: 0: 48066.6. Samples: 948068200. Policy #0 lag: (min: 0.0, avg: 46.2, max: 100.0) [2024-03-21 04:23:05,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 04:23:10,521][03784] Fps is (10 sec: 52428.2, 60 sec: 48059.7, 300 sec: 47874.6). Total num frames: 947191808. Throughput: 0: 48902.2. Samples: 948368600. Policy #0 lag: (min: 0.0, avg: 46.2, max: 100.0) [2024-03-21 04:23:10,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 04:23:12,951][04017] Updated weights for policy 0, policy_version 28909 (0.0017) [2024-03-21 04:23:15,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46421.4, 300 sec: 47652.5). Total num frames: 947453952. Throughput: 0: 48857.9. Samples: 948512800. 
Policy #0 lag: (min: 0.0, avg: 46.2, max: 100.0) [2024-03-21 04:23:15,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 04:23:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 947585024. Throughput: 0: 48308.9. Samples: 948798600. Policy #0 lag: (min: 0.0, avg: 46.2, max: 100.0) [2024-03-21 04:23:20,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 04:23:21,706][03995] Signal inference workers to stop experience collection... (19050 times) [2024-03-21 04:23:21,754][04017] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-03-21 04:23:21,982][03995] Signal inference workers to resume experience collection... (19050 times) [2024-03-21 04:23:21,982][04017] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-03-21 04:23:21,984][04017] Updated weights for policy 0, policy_version 28919 (0.0016) [2024-03-21 04:23:25,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 947748864. Throughput: 0: 48028.9. Samples: 949088000. Policy #0 lag: (min: 0.0, avg: 48.8, max: 128.0) [2024-03-21 04:23:25,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 04:23:30,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 46652.8). Total num frames: 947847168. Throughput: 0: 48300.1. Samples: 949238500. Policy #0 lag: (min: 0.0, avg: 48.8, max: 128.0) [2024-03-21 04:23:30,522][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 04:23:32,745][04017] Updated weights for policy 0, policy_version 28929 (0.0011) [2024-03-21 04:23:35,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46421.4, 300 sec: 46652.7). Total num frames: 948142080. Throughput: 0: 48240.1. Samples: 949523600. 
Policy #0 lag: (min: 0.0, avg: 48.8, max: 128.0) [2024-03-21 04:23:35,522][03784] Avg episode reward: [(0, '1.188')] [2024-03-21 04:23:37,186][04017] Updated weights for policy 0, policy_version 28940 (0.0011) [2024-03-21 04:23:40,521][03784] Fps is (10 sec: 72089.4, 60 sec: 51882.7, 300 sec: 47208.1). Total num frames: 948568064. Throughput: 0: 48024.5. Samples: 949792200. Policy #0 lag: (min: 0.0, avg: 48.8, max: 128.0) [2024-03-21 04:23:40,522][03784] Avg episode reward: [(0, '1.550')] [2024-03-21 04:23:41,717][04017] Updated weights for policy 0, policy_version 28950 (0.0014) [2024-03-21 04:23:45,521][03784] Fps is (10 sec: 62259.1, 60 sec: 51882.7, 300 sec: 47319.2). Total num frames: 948764672. Throughput: 0: 48228.8. Samples: 949947100. Policy #0 lag: (min: 0.0, avg: 35.4, max: 74.0) [2024-03-21 04:23:45,522][03784] Avg episode reward: [(0, '1.550')] [2024-03-21 04:23:48,824][04017] Updated weights for policy 0, policy_version 28960 (0.0011) [2024-03-21 04:23:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 52428.8, 300 sec: 47652.4). Total num frames: 949092352. Throughput: 0: 47786.7. Samples: 950218600. Policy #0 lag: (min: 0.0, avg: 35.4, max: 74.0) [2024-03-21 04:23:50,522][03784] Avg episode reward: [(0, '0.899')] [2024-03-21 04:23:55,370][04017] Updated weights for policy 0, policy_version 28970 (0.0018) [2024-03-21 04:23:55,521][03784] Fps is (10 sec: 52427.8, 60 sec: 49698.1, 300 sec: 47430.2). Total num frames: 949288960. Throughput: 0: 47388.7. Samples: 950501100. Policy #0 lag: (min: 0.0, avg: 35.4, max: 74.0) [2024-03-21 04:23:55,522][03784] Avg episode reward: [(0, '0.615')] [2024-03-21 04:24:00,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 949485568. Throughput: 0: 47360.0. Samples: 950644000. 
Policy #0 lag: (min: 0.0, avg: 35.4, max: 74.0) [2024-03-21 04:24:00,522][03784] Avg episode reward: [(0, '1.669')] [2024-03-21 04:24:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028976_949485568.pth... [2024-03-21 04:24:00,647][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028634_938278912.pth [2024-03-21 04:24:03,754][04017] Updated weights for policy 0, policy_version 28980 (0.0010) [2024-03-21 04:24:05,521][03784] Fps is (10 sec: 49152.8, 60 sec: 46967.5, 300 sec: 47652.5). Total num frames: 949780480. Throughput: 0: 47822.2. Samples: 950950600. Policy #0 lag: (min: 1.0, avg: 34.8, max: 69.0) [2024-03-21 04:24:05,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 04:24:10,074][04017] Updated weights for policy 0, policy_version 28990 (0.0012) [2024-03-21 04:24:10,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 47319.2). Total num frames: 949944320. Throughput: 0: 47828.9. Samples: 951240300. Policy #0 lag: (min: 1.0, avg: 34.8, max: 69.0) [2024-03-21 04:24:10,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 04:24:15,521][03784] Fps is (10 sec: 22937.4, 60 sec: 42598.3, 300 sec: 46874.9). Total num frames: 950009856. Throughput: 0: 47991.0. Samples: 951398100. Policy #0 lag: (min: 1.0, avg: 34.8, max: 69.0) [2024-03-21 04:24:15,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 04:24:20,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 950239232. Throughput: 0: 47893.3. Samples: 951678800. Policy #0 lag: (min: 1.0, avg: 34.8, max: 69.0) [2024-03-21 04:24:20,522][03784] Avg episode reward: [(0, '1.495')] [2024-03-21 04:24:20,551][04017] Updated weights for policy 0, policy_version 29000 (0.0020) [2024-03-21 04:24:20,911][03995] Signal inference workers to stop experience collection... (19100 times) [2024-03-21 04:24:20,911][03995] Signal inference workers to resume experience collection... 
(19100 times) [2024-03-21 04:24:20,993][04017] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-03-21 04:24:20,993][04017] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-03-21 04:24:25,521][03784] Fps is (10 sec: 55705.5, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 950566912. Throughput: 0: 47711.0. Samples: 951939200. Policy #0 lag: (min: 1.0, avg: 38.4, max: 77.0) [2024-03-21 04:24:25,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 04:24:25,741][04017] Updated weights for policy 0, policy_version 29010 (0.0015) [2024-03-21 04:24:30,195][04017] Updated weights for policy 0, policy_version 29020 (0.0012) [2024-03-21 04:24:30,521][03784] Fps is (10 sec: 68812.8, 60 sec: 51336.5, 300 sec: 47874.6). Total num frames: 950927360. Throughput: 0: 46813.3. Samples: 952053700. Policy #0 lag: (min: 1.0, avg: 38.4, max: 77.0) [2024-03-21 04:24:30,522][03784] Avg episode reward: [(0, '1.462')] [2024-03-21 04:24:35,521][03784] Fps is (10 sec: 65537.0, 60 sec: 51336.5, 300 sec: 47985.7). Total num frames: 951222272. Throughput: 0: 46564.5. Samples: 952314000. Policy #0 lag: (min: 1.0, avg: 38.4, max: 77.0) [2024-03-21 04:24:35,522][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 04:24:37,912][04017] Updated weights for policy 0, policy_version 29030 (0.0019) [2024-03-21 04:24:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 951386112. Throughput: 0: 46662.4. Samples: 952600900. Policy #0 lag: (min: 1.0, avg: 38.4, max: 77.0) [2024-03-21 04:24:40,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 04:24:44,204][04017] Updated weights for policy 0, policy_version 29040 (0.0015) [2024-03-21 04:24:45,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48059.7, 300 sec: 47208.2). Total num frames: 951648256. Throughput: 0: 46731.1. Samples: 952746900. 
Policy #0 lag: (min: 0.0, avg: 55.2, max: 114.0) [2024-03-21 04:24:45,522][03784] Avg episode reward: [(0, '1.137')] [2024-03-21 04:24:50,478][04017] Updated weights for policy 0, policy_version 29050 (0.0022) [2024-03-21 04:24:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 951910400. Throughput: 0: 46106.7. Samples: 953025400. Policy #0 lag: (min: 0.0, avg: 55.2, max: 114.0) [2024-03-21 04:24:50,522][03784] Avg episode reward: [(0, '1.448')] [2024-03-21 04:24:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.6, 300 sec: 46874.9). Total num frames: 952107008. Throughput: 0: 45897.8. Samples: 953305700. Policy #0 lag: (min: 0.0, avg: 55.2, max: 114.0) [2024-03-21 04:24:55,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 04:25:00,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 952205312. Throughput: 0: 45575.6. Samples: 953449000. Policy #0 lag: (min: 0.0, avg: 55.2, max: 114.0) [2024-03-21 04:25:00,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 04:25:01,887][04017] Updated weights for policy 0, policy_version 29060 (0.0019) [2024-03-21 04:25:05,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43144.6, 300 sec: 46430.6). Total num frames: 952369152. Throughput: 0: 45473.4. Samples: 953725100. Policy #0 lag: (min: 0.0, avg: 33.0, max: 83.0) [2024-03-21 04:25:05,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 04:25:10,521][03784] Fps is (10 sec: 26214.9, 60 sec: 42052.3, 300 sec: 46208.4). Total num frames: 952467456. Throughput: 0: 45973.6. Samples: 954008000. Policy #0 lag: (min: 0.0, avg: 33.0, max: 83.0) [2024-03-21 04:25:10,522][03784] Avg episode reward: [(0, '1.459')] [2024-03-21 04:25:11,913][04017] Updated weights for policy 0, policy_version 29070 (0.0011) [2024-03-21 04:25:14,528][03995] Signal inference workers to stop experience collection... 
(19150 times) [2024-03-21 04:25:14,640][04017] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-03-21 04:25:14,723][03995] Signal inference workers to resume experience collection... (19150 times) [2024-03-21 04:25:14,724][04017] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-03-21 04:25:15,503][04017] Updated weights for policy 0, policy_version 29080 (0.0016) [2024-03-21 04:25:15,521][03784] Fps is (10 sec: 52427.8, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 952893440. Throughput: 0: 46604.3. Samples: 954150900. Policy #0 lag: (min: 0.0, avg: 33.0, max: 83.0) [2024-03-21 04:25:15,522][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 04:25:20,521][03784] Fps is (10 sec: 72088.7, 60 sec: 49152.0, 300 sec: 47763.5). Total num frames: 953188352. Throughput: 0: 46837.8. Samples: 954421700. Policy #0 lag: (min: 0.0, avg: 33.0, max: 83.0) [2024-03-21 04:25:20,522][03784] Avg episode reward: [(0, '0.938')] [2024-03-21 04:25:20,588][04017] Updated weights for policy 0, policy_version 29090 (0.0011) [2024-03-21 04:25:25,521][03784] Fps is (10 sec: 58983.9, 60 sec: 48606.1, 300 sec: 47430.3). Total num frames: 953483264. Throughput: 0: 46255.6. Samples: 954682400. Policy #0 lag: (min: 2.0, avg: 45.7, max: 95.0) [2024-03-21 04:25:25,522][03784] Avg episode reward: [(0, '1.223')] [2024-03-21 04:25:26,338][04017] Updated weights for policy 0, policy_version 29100 (0.0016) [2024-03-21 04:25:30,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 953679872. Throughput: 0: 46022.2. Samples: 954817900. Policy #0 lag: (min: 2.0, avg: 45.7, max: 95.0) [2024-03-21 04:25:30,522][03784] Avg episode reward: [(0, '0.685')] [2024-03-21 04:25:33,133][04017] Updated weights for policy 0, policy_version 29110 (0.0011) [2024-03-21 04:25:35,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 953942016. Throughput: 0: 46071.1. Samples: 955098600. 
Policy #0 lag: (min: 2.0, avg: 45.7, max: 95.0) [2024-03-21 04:25:35,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 04:25:39,405][04017] Updated weights for policy 0, policy_version 29120 (0.0011) [2024-03-21 04:25:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 954204160. Throughput: 0: 45751.1. Samples: 955364500. Policy #0 lag: (min: 2.0, avg: 45.7, max: 95.0) [2024-03-21 04:25:40,522][03784] Avg episode reward: [(0, '1.129')] [2024-03-21 04:25:45,521][03784] Fps is (10 sec: 55706.0, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 954499072. Throughput: 0: 45913.5. Samples: 955515100. Policy #0 lag: (min: 0.0, avg: 37.4, max: 70.0) [2024-03-21 04:25:45,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 04:25:48,040][04017] Updated weights for policy 0, policy_version 29130 (0.0010) [2024-03-21 04:25:50,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44236.7, 300 sec: 46763.8). Total num frames: 954564608. Throughput: 0: 45855.5. Samples: 955788600. Policy #0 lag: (min: 0.0, avg: 37.4, max: 70.0) [2024-03-21 04:25:50,522][03784] Avg episode reward: [(0, '1.092')] [2024-03-21 04:25:55,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.8, 300 sec: 46763.9). Total num frames: 954761216. Throughput: 0: 45755.5. Samples: 956067000. Policy #0 lag: (min: 0.0, avg: 37.4, max: 70.0) [2024-03-21 04:25:55,522][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 04:25:58,001][04017] Updated weights for policy 0, policy_version 29140 (0.0015) [2024-03-21 04:26:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 46874.9). Total num frames: 954957824. Throughput: 0: 45955.6. Samples: 956218900. Policy #0 lag: (min: 0.0, avg: 37.4, max: 70.0) [2024-03-21 04:26:00,522][03784] Avg episode reward: [(0, '0.949')] [2024-03-21 04:26:00,768][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029144_954990592.pth... 
[2024-03-21 04:26:00,869][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028796_943587328.pth [2024-03-21 04:26:03,780][04017] Updated weights for policy 0, policy_version 29150 (0.0011) [2024-03-21 04:26:05,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 955187200. Throughput: 0: 46426.6. Samples: 956510900. Policy #0 lag: (min: 0.0, avg: 44.6, max: 115.0) [2024-03-21 04:26:05,522][03784] Avg episode reward: [(0, '1.385')] [2024-03-21 04:26:10,521][03784] Fps is (10 sec: 45876.0, 60 sec: 49151.9, 300 sec: 46874.9). Total num frames: 955416576. Throughput: 0: 47606.6. Samples: 956824700. Policy #0 lag: (min: 0.0, avg: 44.6, max: 115.0) [2024-03-21 04:26:10,522][03784] Avg episode reward: [(0, '1.385')] [2024-03-21 04:26:12,391][04017] Updated weights for policy 0, policy_version 29160 (0.0014) [2024-03-21 04:26:15,208][03995] Signal inference workers to stop experience collection... (19200 times) [2024-03-21 04:26:15,280][03995] Signal inference workers to resume experience collection... (19200 times) [2024-03-21 04:26:15,287][04017] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-03-21 04:26:15,352][04017] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-03-21 04:26:15,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45329.2, 300 sec: 46319.5). Total num frames: 955613184. Throughput: 0: 47755.6. Samples: 956966900. Policy #0 lag: (min: 0.0, avg: 44.6, max: 115.0) [2024-03-21 04:26:15,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 04:26:17,734][04017] Updated weights for policy 0, policy_version 29170 (0.0010) [2024-03-21 04:26:20,521][03784] Fps is (10 sec: 52427.9, 60 sec: 45875.1, 300 sec: 46319.5). Total num frames: 955940864. Throughput: 0: 47904.3. Samples: 957254300. 
Policy #0 lag: (min: 0.0, avg: 44.6, max: 115.0) [2024-03-21 04:26:20,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 04:26:23,780][04017] Updated weights for policy 0, policy_version 29180 (0.0014) [2024-03-21 04:26:25,521][03784] Fps is (10 sec: 68811.2, 60 sec: 46967.2, 300 sec: 46874.9). Total num frames: 956301312. Throughput: 0: 48319.7. Samples: 957538900. Policy #0 lag: (min: 2.0, avg: 31.0, max: 58.0) [2024-03-21 04:26:25,522][03784] Avg episode reward: [(0, '0.584')] [2024-03-21 04:26:28,024][04017] Updated weights for policy 0, policy_version 29190 (0.0014) [2024-03-21 04:26:30,521][03784] Fps is (10 sec: 65536.6, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 956596224. Throughput: 0: 48111.0. Samples: 957680100. Policy #0 lag: (min: 2.0, avg: 31.0, max: 58.0) [2024-03-21 04:26:30,522][03784] Avg episode reward: [(0, '0.584')] [2024-03-21 04:26:35,521][03784] Fps is (10 sec: 49153.1, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 956792832. Throughput: 0: 48640.1. Samples: 957977400. Policy #0 lag: (min: 2.0, avg: 31.0, max: 58.0) [2024-03-21 04:26:35,522][03784] Avg episode reward: [(0, '1.347')] [2024-03-21 04:26:35,552][04017] Updated weights for policy 0, policy_version 29200 (0.0011) [2024-03-21 04:26:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 47319.2). Total num frames: 956956672. Throughput: 0: 48848.8. Samples: 958265200. Policy #0 lag: (min: 2.0, avg: 31.0, max: 58.0) [2024-03-21 04:26:40,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 04:26:43,835][04017] Updated weights for policy 0, policy_version 29210 (0.0011) [2024-03-21 04:26:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 47874.6). Total num frames: 957218816. Throughput: 0: 48762.4. Samples: 958413200. 
Policy #0 lag: (min: 1.0, avg: 39.3, max: 73.0) [2024-03-21 04:26:45,522][03784] Avg episode reward: [(0, '1.291')] [2024-03-21 04:26:49,593][04017] Updated weights for policy 0, policy_version 29220 (0.0010) [2024-03-21 04:26:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48605.9, 300 sec: 47874.6). Total num frames: 957480960. Throughput: 0: 48789.0. Samples: 958706400. Policy #0 lag: (min: 1.0, avg: 39.3, max: 73.0) [2024-03-21 04:26:50,522][03784] Avg episode reward: [(0, '0.638')] [2024-03-21 04:26:55,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48059.7, 300 sec: 47763.6). Total num frames: 957644800. Throughput: 0: 47880.0. Samples: 958979300. Policy #0 lag: (min: 1.0, avg: 39.3, max: 73.0) [2024-03-21 04:26:55,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 04:27:00,348][04017] Updated weights for policy 0, policy_version 29230 (0.0010) [2024-03-21 04:27:00,521][03784] Fps is (10 sec: 32767.7, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 957808640. Throughput: 0: 47879.9. Samples: 959121500. Policy #0 lag: (min: 0.0, avg: 28.5, max: 70.0) [2024-03-21 04:27:00,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 04:27:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 47513.7, 300 sec: 46541.7). Total num frames: 958038016. Throughput: 0: 47509.0. Samples: 959392200. Policy #0 lag: (min: 0.0, avg: 28.5, max: 70.0) [2024-03-21 04:27:05,522][03784] Avg episode reward: [(0, '0.494')] [2024-03-21 04:27:06,780][04017] Updated weights for policy 0, policy_version 29240 (0.0014) [2024-03-21 04:27:08,056][03995] Signal inference workers to stop experience collection... (19250 times) [2024-03-21 04:27:08,057][03995] Signal inference workers to resume experience collection... 
(19250 times) [2024-03-21 04:27:08,134][04017] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-03-21 04:27:08,135][04017] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-03-21 04:27:10,521][03784] Fps is (10 sec: 58982.2, 60 sec: 49698.0, 300 sec: 46541.6). Total num frames: 958398464. Throughput: 0: 47793.4. Samples: 959689600. Policy #0 lag: (min: 0.0, avg: 28.5, max: 70.0) [2024-03-21 04:27:10,523][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 04:27:11,187][04017] Updated weights for policy 0, policy_version 29250 (0.0019) [2024-03-21 04:27:14,408][04017] Updated weights for policy 0, policy_version 29260 (0.0020) [2024-03-21 04:27:15,521][03784] Fps is (10 sec: 75366.0, 60 sec: 52974.9, 300 sec: 47541.4). Total num frames: 958791680. Throughput: 0: 47535.5. Samples: 959819200. Policy #0 lag: (min: 0.0, avg: 28.5, max: 70.0) [2024-03-21 04:27:15,522][03784] Avg episode reward: [(0, '0.798')] [2024-03-21 04:27:20,521][03784] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 46874.9). Total num frames: 958922752. Throughput: 0: 47739.9. Samples: 960125700. Policy #0 lag: (min: 1.0, avg: 36.5, max: 70.0) [2024-03-21 04:27:20,522][03784] Avg episode reward: [(0, '0.928')] [2024-03-21 04:27:25,521][03784] Fps is (10 sec: 22937.6, 60 sec: 45329.2, 300 sec: 46763.8). Total num frames: 959021056. Throughput: 0: 48211.1. Samples: 960434700. Policy #0 lag: (min: 1.0, avg: 36.5, max: 70.0) [2024-03-21 04:27:25,522][03784] Avg episode reward: [(0, '0.928')] [2024-03-21 04:27:27,452][04017] Updated weights for policy 0, policy_version 29270 (0.0014) [2024-03-21 04:27:30,521][03784] Fps is (10 sec: 45875.9, 60 sec: 46421.4, 300 sec: 47541.4). Total num frames: 959381504. Throughput: 0: 47717.9. Samples: 960560500. 
Policy #0 lag: (min: 1.0, avg: 36.5, max: 70.0) [2024-03-21 04:27:30,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 04:27:31,805][04017] Updated weights for policy 0, policy_version 29280 (0.0021) [2024-03-21 04:27:35,521][03784] Fps is (10 sec: 55705.4, 60 sec: 46421.3, 300 sec: 47874.6). Total num frames: 959578112. Throughput: 0: 47595.5. Samples: 960848200. Policy #0 lag: (min: 1.0, avg: 36.5, max: 70.0) [2024-03-21 04:27:35,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 04:27:37,431][04017] Updated weights for policy 0, policy_version 29290 (0.0019) [2024-03-21 04:27:40,521][03784] Fps is (10 sec: 39321.2, 60 sec: 46967.5, 300 sec: 47874.6). Total num frames: 959774720. Throughput: 0: 47973.3. Samples: 961138100. Policy #0 lag: (min: 1.0, avg: 36.5, max: 70.0) [2024-03-21 04:27:40,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 04:27:45,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 47430.3). Total num frames: 959938560. Throughput: 0: 47913.4. Samples: 961277600. Policy #0 lag: (min: 0.0, avg: 55.4, max: 129.0) [2024-03-21 04:27:45,522][03784] Avg episode reward: [(0, '1.606')] [2024-03-21 04:27:48,744][04017] Updated weights for policy 0, policy_version 29300 (0.0017) [2024-03-21 04:27:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 47097.1). Total num frames: 960200704. Throughput: 0: 47846.7. Samples: 961545300. Policy #0 lag: (min: 0.0, avg: 55.4, max: 129.0) [2024-03-21 04:27:50,522][03784] Avg episode reward: [(0, '1.172')] [2024-03-21 04:27:52,533][03995] Signal inference workers to stop experience collection... (19300 times) [2024-03-21 04:27:52,607][04017] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-03-21 04:27:52,800][03995] Signal inference workers to resume experience collection... 
(19300 times) [2024-03-21 04:27:52,800][04017] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-03-21 04:27:53,611][04017] Updated weights for policy 0, policy_version 29310 (0.0014) [2024-03-21 04:27:55,521][03784] Fps is (10 sec: 62258.6, 60 sec: 48605.8, 300 sec: 47097.0). Total num frames: 960561152. Throughput: 0: 46826.7. Samples: 961796800. Policy #0 lag: (min: 0.0, avg: 55.4, max: 129.0) [2024-03-21 04:27:55,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 04:27:59,685][04017] Updated weights for policy 0, policy_version 29320 (0.0012) [2024-03-21 04:28:00,521][03784] Fps is (10 sec: 55705.4, 60 sec: 49152.0, 300 sec: 46763.8). Total num frames: 960757760. Throughput: 0: 47131.1. Samples: 961940100. Policy #0 lag: (min: 0.0, avg: 55.4, max: 129.0) [2024-03-21 04:28:00,522][03784] Avg episode reward: [(0, '1.527')] [2024-03-21 04:28:00,825][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029322_960823296.pth... [2024-03-21 04:28:00,933][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000028976_949485568.pth [2024-03-21 04:28:05,521][03784] Fps is (10 sec: 45875.7, 60 sec: 49698.2, 300 sec: 46874.9). Total num frames: 961019904. Throughput: 0: 46235.7. Samples: 962206300. Policy #0 lag: (min: 0.0, avg: 42.2, max: 87.0) [2024-03-21 04:28:05,522][03784] Avg episode reward: [(0, '0.894')] [2024-03-21 04:28:06,422][04017] Updated weights for policy 0, policy_version 29330 (0.0011) [2024-03-21 04:28:10,521][03784] Fps is (10 sec: 58982.4, 60 sec: 49152.1, 300 sec: 47097.0). Total num frames: 961347584. Throughput: 0: 45935.6. Samples: 962501800. Policy #0 lag: (min: 0.0, avg: 42.2, max: 87.0) [2024-03-21 04:28:10,522][03784] Avg episode reward: [(0, '0.894')] [2024-03-21 04:28:11,371][04017] Updated weights for policy 0, policy_version 29340 (0.0011) [2024-03-21 04:28:15,521][03784] Fps is (10 sec: 65535.9, 60 sec: 48059.8, 300 sec: 47763.5). 
Total num frames: 961675264. Throughput: 0: 46242.2. Samples: 962641400. Policy #0 lag: (min: 0.0, avg: 42.2, max: 87.0) [2024-03-21 04:28:15,522][03784] Avg episode reward: [(0, '0.911')] [2024-03-21 04:28:16,578][04017] Updated weights for policy 0, policy_version 29350 (0.0018) [2024-03-21 04:28:20,521][03784] Fps is (10 sec: 42598.6, 60 sec: 47513.7, 300 sec: 47541.4). Total num frames: 961773568. Throughput: 0: 46377.9. Samples: 962935200. Policy #0 lag: (min: 0.0, avg: 42.2, max: 87.0) [2024-03-21 04:28:20,522][03784] Avg episode reward: [(0, '0.982')] [2024-03-21 04:28:25,521][03784] Fps is (10 sec: 29491.1, 60 sec: 49152.0, 300 sec: 47874.6). Total num frames: 961970176. Throughput: 0: 46437.8. Samples: 963227800. Policy #0 lag: (min: 0.0, avg: 39.1, max: 84.0) [2024-03-21 04:28:25,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 04:28:29,608][04017] Updated weights for policy 0, policy_version 29360 (0.0010) [2024-03-21 04:28:30,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44782.9, 300 sec: 47208.1). Total num frames: 962068480. Throughput: 0: 46837.8. Samples: 963385300. Policy #0 lag: (min: 0.0, avg: 39.1, max: 84.0) [2024-03-21 04:28:30,522][03784] Avg episode reward: [(0, '1.601')] [2024-03-21 04:28:35,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 962232320. Throughput: 0: 47428.8. Samples: 963679600. Policy #0 lag: (min: 0.0, avg: 39.1, max: 84.0) [2024-03-21 04:28:35,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 04:28:38,124][04017] Updated weights for policy 0, policy_version 29370 (0.0011) [2024-03-21 04:28:40,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 962560000. Throughput: 0: 47795.6. Samples: 963947600. 
Policy #0 lag: (min: 0.0, avg: 39.1, max: 84.0) [2024-03-21 04:28:40,522][03784] Avg episode reward: [(0, '0.915')] [2024-03-21 04:28:43,527][04017] Updated weights for policy 0, policy_version 29380 (0.0018) [2024-03-21 04:28:45,521][03784] Fps is (10 sec: 68813.7, 60 sec: 49698.2, 300 sec: 46874.9). Total num frames: 962920448. Throughput: 0: 47495.6. Samples: 964077400. Policy #0 lag: (min: 0.0, avg: 31.3, max: 76.0) [2024-03-21 04:28:45,521][03784] Avg episode reward: [(0, '0.970')] [2024-03-21 04:28:48,110][04017] Updated weights for policy 0, policy_version 29390 (0.0015) [2024-03-21 04:28:48,838][03995] Signal inference workers to stop experience collection... (19350 times) [2024-03-21 04:28:48,897][04017] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-03-21 04:28:48,905][03995] Signal inference workers to resume experience collection... (19350 times) [2024-03-21 04:28:48,948][04017] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-03-21 04:28:50,521][03784] Fps is (10 sec: 65535.5, 60 sec: 50244.2, 300 sec: 47208.1). Total num frames: 963215360. Throughput: 0: 47088.7. Samples: 964325300. Policy #0 lag: (min: 0.0, avg: 31.3, max: 76.0) [2024-03-21 04:28:50,522][03784] Avg episode reward: [(0, '1.161')] [2024-03-21 04:28:53,593][04017] Updated weights for policy 0, policy_version 29400 (0.0014) [2024-03-21 04:28:55,521][03784] Fps is (10 sec: 55705.2, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 963477504. Throughput: 0: 46497.8. Samples: 964594200. Policy #0 lag: (min: 0.0, avg: 31.3, max: 76.0) [2024-03-21 04:28:55,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 04:28:58,897][04017] Updated weights for policy 0, policy_version 29410 (0.0015) [2024-03-21 04:29:00,521][03784] Fps is (10 sec: 55706.0, 60 sec: 50244.3, 300 sec: 47430.3). Total num frames: 963772416. Throughput: 0: 46551.1. Samples: 964736200. 
Policy #0 lag: (min: 0.0, avg: 31.3, max: 76.0) [2024-03-21 04:29:00,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 04:29:05,521][03784] Fps is (10 sec: 45875.7, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 963936256. Throughput: 0: 46653.4. Samples: 965034600. Policy #0 lag: (min: 0.0, avg: 33.2, max: 70.0) [2024-03-21 04:29:05,522][03784] Avg episode reward: [(0, '1.415')] [2024-03-21 04:29:09,354][04017] Updated weights for policy 0, policy_version 29420 (0.0011) [2024-03-21 04:29:10,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45329.0, 300 sec: 47652.4). Total num frames: 964067328. Throughput: 0: 46868.8. Samples: 965336900. Policy #0 lag: (min: 0.0, avg: 33.2, max: 70.0) [2024-03-21 04:29:10,522][03784] Avg episode reward: [(0, '1.415')] [2024-03-21 04:29:15,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 47652.5). Total num frames: 964296704. Throughput: 0: 46491.2. Samples: 965477400. Policy #0 lag: (min: 0.0, avg: 33.2, max: 70.0) [2024-03-21 04:29:15,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 04:29:18,628][04017] Updated weights for policy 0, policy_version 29430 (0.0011) [2024-03-21 04:29:20,521][03784] Fps is (10 sec: 42598.9, 60 sec: 45329.1, 300 sec: 47208.2). Total num frames: 964493312. Throughput: 0: 46817.9. Samples: 965786400. Policy #0 lag: (min: 0.0, avg: 33.2, max: 70.0) [2024-03-21 04:29:20,522][03784] Avg episode reward: [(0, '0.871')] [2024-03-21 04:29:24,184][04017] Updated weights for policy 0, policy_version 29440 (0.0010) [2024-03-21 04:29:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 964755456. Throughput: 0: 47233.4. Samples: 966073100. Policy #0 lag: (min: 1.0, avg: 26.6, max: 63.0) [2024-03-21 04:29:25,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 04:29:30,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48059.7, 300 sec: 46541.7). Total num frames: 964952064. Throughput: 0: 47842.1. Samples: 966230300. 
Policy #0 lag: (min: 1.0, avg: 26.6, max: 63.0) [2024-03-21 04:29:30,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 04:29:31,170][04017] Updated weights for policy 0, policy_version 29450 (0.0011) [2024-03-21 04:29:35,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49698.2, 300 sec: 46874.9). Total num frames: 965214208. Throughput: 0: 48649.0. Samples: 966514500. Policy #0 lag: (min: 1.0, avg: 26.6, max: 63.0) [2024-03-21 04:29:35,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 04:29:37,205][04017] Updated weights for policy 0, policy_version 29460 (0.0014) [2024-03-21 04:29:40,521][03784] Fps is (10 sec: 55705.7, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 965509120. Throughput: 0: 49108.9. Samples: 966804100. Policy #0 lag: (min: 1.0, avg: 26.6, max: 63.0) [2024-03-21 04:29:40,522][03784] Avg episode reward: [(0, '0.961')] [2024-03-21 04:29:44,337][04017] Updated weights for policy 0, policy_version 29470 (0.0011) [2024-03-21 04:29:45,199][03995] Signal inference workers to stop experience collection... (19400 times) [2024-03-21 04:29:45,231][04017] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-03-21 04:29:45,495][03995] Signal inference workers to resume experience collection... (19400 times) [2024-03-21 04:29:45,495][04017] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-03-21 04:29:45,521][03784] Fps is (10 sec: 55705.3, 60 sec: 47513.5, 300 sec: 46986.0). Total num frames: 965771264. Throughput: 0: 48888.8. Samples: 966936200. Policy #0 lag: (min: 0.0, avg: 36.7, max: 69.0) [2024-03-21 04:29:45,522][03784] Avg episode reward: [(0, '0.928')] [2024-03-21 04:29:48,483][04017] Updated weights for policy 0, policy_version 29480 (0.0011) [2024-03-21 04:29:50,521][03784] Fps is (10 sec: 62259.1, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 966131712. Throughput: 0: 47884.3. Samples: 967189400. 
Policy #0 lag: (min: 0.0, avg: 36.7, max: 69.0) [2024-03-21 04:29:50,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 04:29:54,428][04017] Updated weights for policy 0, policy_version 29490 (0.0015) [2024-03-21 04:29:55,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 966328320. Throughput: 0: 47808.9. Samples: 967488300. Policy #0 lag: (min: 0.0, avg: 36.7, max: 69.0) [2024-03-21 04:29:55,522][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 04:30:00,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.7, 300 sec: 47652.4). Total num frames: 966426624. Throughput: 0: 48035.4. Samples: 967639000. Policy #0 lag: (min: 0.0, avg: 36.7, max: 69.0) [2024-03-21 04:30:00,522][03784] Avg episode reward: [(0, '1.057')] [2024-03-21 04:30:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029493_966426624.pth... [2024-03-21 04:30:00,671][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029144_954990592.pth [2024-03-21 04:30:04,904][04017] Updated weights for policy 0, policy_version 29500 (0.0017) [2024-03-21 04:30:05,521][03784] Fps is (10 sec: 36045.3, 60 sec: 45875.2, 300 sec: 48207.8). Total num frames: 966688768. Throughput: 0: 47204.5. Samples: 967910600. Policy #0 lag: (min: 0.0, avg: 29.3, max: 67.0) [2024-03-21 04:30:05,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 04:30:10,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48059.8, 300 sec: 47652.5). Total num frames: 966950912. Throughput: 0: 46613.2. Samples: 968170700. Policy #0 lag: (min: 0.0, avg: 29.3, max: 67.0) [2024-03-21 04:30:10,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 04:30:10,756][04017] Updated weights for policy 0, policy_version 29510 (0.0024) [2024-03-21 04:30:15,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46421.3, 300 sec: 47097.1). Total num frames: 967081984. Throughput: 0: 46360.0. Samples: 968316500. 
Policy #0 lag: (min: 0.0, avg: 29.3, max: 67.0) [2024-03-21 04:30:15,522][03784] Avg episode reward: [(0, '0.642')] [2024-03-21 04:30:20,382][04017] Updated weights for policy 0, policy_version 29520 (0.0010) [2024-03-21 04:30:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 967311360. Throughput: 0: 46706.6. Samples: 968616300. Policy #0 lag: (min: 0.0, avg: 29.3, max: 67.0) [2024-03-21 04:30:20,522][03784] Avg episode reward: [(0, '1.529')] [2024-03-21 04:30:24,790][04017] Updated weights for policy 0, policy_version 29530 (0.0020) [2024-03-21 04:30:25,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49151.9, 300 sec: 47541.4). Total num frames: 967704576. Throughput: 0: 46111.1. Samples: 968879100. Policy #0 lag: (min: 0.0, avg: 34.6, max: 74.0) [2024-03-21 04:30:25,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 04:30:30,521][03784] Fps is (10 sec: 55706.7, 60 sec: 48606.0, 300 sec: 47208.1). Total num frames: 967868416. Throughput: 0: 46306.8. Samples: 969020000. Policy #0 lag: (min: 0.0, avg: 34.6, max: 74.0) [2024-03-21 04:30:30,522][03784] Avg episode reward: [(0, '1.429')] [2024-03-21 04:30:31,234][04017] Updated weights for policy 0, policy_version 29540 (0.0011) [2024-03-21 04:30:35,396][04017] Updated weights for policy 0, policy_version 29550 (0.0015) [2024-03-21 04:30:35,521][03784] Fps is (10 sec: 58983.1, 60 sec: 51336.6, 300 sec: 47763.5). Total num frames: 968294400. Throughput: 0: 46362.3. Samples: 969275700. Policy #0 lag: (min: 0.0, avg: 34.6, max: 74.0) [2024-03-21 04:30:35,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 04:30:38,323][03995] Signal inference workers to stop experience collection... (19450 times) [2024-03-21 04:30:38,387][04017] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-03-21 04:30:38,630][03995] Signal inference workers to resume experience collection... 
(19450 times) [2024-03-21 04:30:38,630][04017] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-03-21 04:30:40,521][03784] Fps is (10 sec: 55705.0, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 968425472. Throughput: 0: 46168.9. Samples: 969565900. Policy #0 lag: (min: 0.0, avg: 34.6, max: 74.0) [2024-03-21 04:30:40,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 04:30:45,521][03784] Fps is (10 sec: 29490.9, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 968589312. Throughput: 0: 46137.8. Samples: 969715200. Policy #0 lag: (min: 0.0, avg: 51.5, max: 117.0) [2024-03-21 04:30:45,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 04:30:48,875][04017] Updated weights for policy 0, policy_version 29560 (0.0017) [2024-03-21 04:30:50,521][03784] Fps is (10 sec: 32767.8, 60 sec: 43690.7, 300 sec: 47430.3). Total num frames: 968753152. Throughput: 0: 46815.4. Samples: 970017300. Policy #0 lag: (min: 0.0, avg: 51.5, max: 117.0) [2024-03-21 04:30:50,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 04:30:55,521][03784] Fps is (10 sec: 29491.5, 60 sec: 42598.5, 300 sec: 47208.2). Total num frames: 968884224. Throughput: 0: 47477.9. Samples: 970307200. Policy #0 lag: (min: 0.0, avg: 51.5, max: 117.0) [2024-03-21 04:30:55,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 04:30:57,006][04017] Updated weights for policy 0, policy_version 29570 (0.0012) [2024-03-21 04:31:00,521][03784] Fps is (10 sec: 26214.6, 60 sec: 43144.6, 300 sec: 46874.9). Total num frames: 969015296. Throughput: 0: 47424.5. Samples: 970450600. Policy #0 lag: (min: 0.0, avg: 51.5, max: 117.0) [2024-03-21 04:31:00,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 04:31:05,329][04017] Updated weights for policy 0, policy_version 29580 (0.0011) [2024-03-21 04:31:05,521][03784] Fps is (10 sec: 39320.8, 60 sec: 43144.4, 300 sec: 46985.9). Total num frames: 969277440. Throughput: 0: 47022.2. Samples: 970732300. 
Policy #0 lag: (min: 0.0, avg: 31.3, max: 83.0) [2024-03-21 04:31:05,522][03784] Avg episode reward: [(0, '0.599')] [2024-03-21 04:31:10,492][04017] Updated weights for policy 0, policy_version 29590 (0.0012) [2024-03-21 04:31:10,521][03784] Fps is (10 sec: 58982.2, 60 sec: 44236.8, 300 sec: 47430.3). Total num frames: 969605120. Throughput: 0: 47353.4. Samples: 971010000. Policy #0 lag: (min: 0.0, avg: 31.3, max: 83.0) [2024-03-21 04:31:10,522][03784] Avg episode reward: [(0, '1.383')] [2024-03-21 04:31:14,634][04017] Updated weights for policy 0, policy_version 29600 (0.0011) [2024-03-21 04:31:15,521][03784] Fps is (10 sec: 72090.6, 60 sec: 48605.9, 300 sec: 47652.5). Total num frames: 969998336. Throughput: 0: 47239.9. Samples: 971145800. Policy #0 lag: (min: 0.0, avg: 31.3, max: 83.0) [2024-03-21 04:31:15,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 04:31:19,697][04017] Updated weights for policy 0, policy_version 29610 (0.0014) [2024-03-21 04:31:20,521][03784] Fps is (10 sec: 65536.5, 60 sec: 49152.1, 300 sec: 47319.3). Total num frames: 970260480. Throughput: 0: 47377.8. Samples: 971407700. Policy #0 lag: (min: 0.0, avg: 31.3, max: 83.0) [2024-03-21 04:31:20,522][03784] Avg episode reward: [(0, '0.573')] [2024-03-21 04:31:25,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 970522624. Throughput: 0: 47506.7. Samples: 971703700. Policy #0 lag: (min: 1.0, avg: 43.7, max: 79.0) [2024-03-21 04:31:25,522][03784] Avg episode reward: [(0, '0.573')] [2024-03-21 04:31:26,690][04017] Updated weights for policy 0, policy_version 29620 (0.0011) [2024-03-21 04:31:28,640][03995] Signal inference workers to stop experience collection... (19500 times) [2024-03-21 04:31:28,640][03995] Signal inference workers to resume experience collection... 
(19500 times) [2024-03-21 04:31:28,769][04017] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-03-21 04:31:28,770][04017] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-03-21 04:31:30,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 970784768. Throughput: 0: 47068.9. Samples: 971833300. Policy #0 lag: (min: 1.0, avg: 43.7, max: 79.0) [2024-03-21 04:31:30,522][03784] Avg episode reward: [(0, '1.028')] [2024-03-21 04:31:32,254][04017] Updated weights for policy 0, policy_version 29630 (0.0010) [2024-03-21 04:31:35,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 47541.4). Total num frames: 970981376. Throughput: 0: 46664.5. Samples: 972117200. Policy #0 lag: (min: 1.0, avg: 43.7, max: 79.0) [2024-03-21 04:31:35,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 04:31:40,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 971145216. Throughput: 0: 46699.8. Samples: 972408700. Policy #0 lag: (min: 1.0, avg: 43.7, max: 79.0) [2024-03-21 04:31:40,522][03784] Avg episode reward: [(0, '1.439')] [2024-03-21 04:31:41,528][04017] Updated weights for policy 0, policy_version 29640 (0.0011) [2024-03-21 04:31:45,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 971341824. Throughput: 0: 46726.7. Samples: 972553300. Policy #0 lag: (min: 0.0, avg: 30.6, max: 88.0) [2024-03-21 04:31:45,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 04:31:49,756][04017] Updated weights for policy 0, policy_version 29650 (0.0010) [2024-03-21 04:31:50,521][03784] Fps is (10 sec: 45875.8, 60 sec: 47513.7, 300 sec: 47319.2). Total num frames: 971603968. Throughput: 0: 47155.7. Samples: 972854300. 
Policy #0 lag: (min: 0.0, avg: 30.6, max: 88.0) [2024-03-21 04:31:50,522][03784] Avg episode reward: [(0, '1.234')] [2024-03-21 04:31:54,293][04017] Updated weights for policy 0, policy_version 29660 (0.0011) [2024-03-21 04:31:55,521][03784] Fps is (10 sec: 55704.7, 60 sec: 50244.1, 300 sec: 47763.5). Total num frames: 971898880. Throughput: 0: 47175.5. Samples: 973132900. Policy #0 lag: (min: 0.0, avg: 30.6, max: 88.0) [2024-03-21 04:31:55,522][03784] Avg episode reward: [(0, '0.585')] [2024-03-21 04:32:00,521][03784] Fps is (10 sec: 42598.2, 60 sec: 50244.2, 300 sec: 47430.3). Total num frames: 972029952. Throughput: 0: 47577.8. Samples: 973286800. Policy #0 lag: (min: 0.0, avg: 30.6, max: 88.0) [2024-03-21 04:32:00,522][03784] Avg episode reward: [(0, '1.081')] [2024-03-21 04:32:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029664_972029952.pth... [2024-03-21 04:32:00,664][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029322_960823296.pth [2024-03-21 04:32:05,521][03784] Fps is (10 sec: 19661.0, 60 sec: 46967.6, 300 sec: 46430.6). Total num frames: 972095488. Throughput: 0: 48455.5. Samples: 973588200. Policy #0 lag: (min: 0.0, avg: 26.4, max: 58.0) [2024-03-21 04:32:05,522][03784] Avg episode reward: [(0, '1.521')] [2024-03-21 04:32:06,675][04017] Updated weights for policy 0, policy_version 29670 (0.0015) [2024-03-21 04:32:10,521][03784] Fps is (10 sec: 49151.5, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 972521472. Throughput: 0: 48024.3. Samples: 973864800. Policy #0 lag: (min: 0.0, avg: 26.4, max: 58.0) [2024-03-21 04:32:10,523][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 04:32:10,785][04017] Updated weights for policy 0, policy_version 29680 (0.0020) [2024-03-21 04:32:14,574][04017] Updated weights for policy 0, policy_version 29690 (0.0012) [2024-03-21 04:32:15,521][03784] Fps is (10 sec: 78643.2, 60 sec: 48059.7, 300 sec: 47319.2). 
Total num frames: 972881920. Throughput: 0: 48191.1. Samples: 974001900. Policy #0 lag: (min: 0.0, avg: 26.4, max: 58.0) [2024-03-21 04:32:15,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 04:32:20,521][03784] Fps is (10 sec: 45875.8, 60 sec: 45329.0, 300 sec: 47319.2). Total num frames: 972980224. Throughput: 0: 48622.2. Samples: 974305200. Policy #0 lag: (min: 0.0, avg: 26.4, max: 58.0) [2024-03-21 04:32:20,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 04:32:24,322][03995] Signal inference workers to stop experience collection... (19550 times) [2024-03-21 04:32:24,436][04017] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-03-21 04:32:24,523][03995] Signal inference workers to resume experience collection... (19550 times) [2024-03-21 04:32:24,524][04017] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-03-21 04:32:24,878][04017] Updated weights for policy 0, policy_version 29700 (0.0014) [2024-03-21 04:32:25,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 973242368. Throughput: 0: 48517.9. Samples: 974592000. Policy #0 lag: (min: 0.0, avg: 26.2, max: 54.0) [2024-03-21 04:32:25,522][03784] Avg episode reward: [(0, '1.449')] [2024-03-21 04:32:30,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 973504512. Throughput: 0: 48362.2. Samples: 974729600. Policy #0 lag: (min: 0.0, avg: 26.2, max: 54.0) [2024-03-21 04:32:30,522][03784] Avg episode reward: [(0, '1.258')] [2024-03-21 04:32:30,773][04017] Updated weights for policy 0, policy_version 29710 (0.0012) [2024-03-21 04:32:35,140][04017] Updated weights for policy 0, policy_version 29720 (0.0012) [2024-03-21 04:32:35,521][03784] Fps is (10 sec: 62258.7, 60 sec: 48059.7, 300 sec: 47763.5). Total num frames: 973864960. Throughput: 0: 48155.5. Samples: 975021300. 
Policy #0 lag: (min: 0.0, avg: 26.2, max: 54.0) [2024-03-21 04:32:35,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 04:32:40,521][03784] Fps is (10 sec: 58982.3, 60 sec: 49152.1, 300 sec: 47985.7). Total num frames: 974094336. Throughput: 0: 48126.8. Samples: 975298600. Policy #0 lag: (min: 0.0, avg: 26.2, max: 54.0) [2024-03-21 04:32:40,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 04:32:41,392][04017] Updated weights for policy 0, policy_version 29730 (0.0014) [2024-03-21 04:32:45,521][03784] Fps is (10 sec: 42599.1, 60 sec: 49152.1, 300 sec: 47763.5). Total num frames: 974290944. Throughput: 0: 47920.2. Samples: 975443200. Policy #0 lag: (min: 0.0, avg: 39.1, max: 92.0) [2024-03-21 04:32:45,521][03784] Avg episode reward: [(0, '1.429')] [2024-03-21 04:32:48,625][04017] Updated weights for policy 0, policy_version 29740 (0.0017) [2024-03-21 04:32:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 47652.5). Total num frames: 974618624. Throughput: 0: 47444.5. Samples: 975723200. Policy #0 lag: (min: 0.0, avg: 39.1, max: 92.0) [2024-03-21 04:32:50,522][03784] Avg episode reward: [(0, '1.422')] [2024-03-21 04:32:55,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46967.6, 300 sec: 47319.2). Total num frames: 974716928. Throughput: 0: 47795.7. Samples: 976015600. Policy #0 lag: (min: 0.0, avg: 39.1, max: 92.0) [2024-03-21 04:32:55,530][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 04:32:57,015][04017] Updated weights for policy 0, policy_version 29750 (0.0015) [2024-03-21 04:33:00,521][03784] Fps is (10 sec: 42598.4, 60 sec: 50244.3, 300 sec: 47541.4). Total num frames: 975044608. Throughput: 0: 47533.4. Samples: 976140900. Policy #0 lag: (min: 0.0, avg: 39.1, max: 92.0) [2024-03-21 04:33:00,530][03784] Avg episode reward: [(0, '0.685')] [2024-03-21 04:33:03,432][04017] Updated weights for policy 0, policy_version 29760 (0.0011) [2024-03-21 04:33:05,521][03784] Fps is (10 sec: 52428.6, 60 sec: 52428.8, 300 sec: 47097.1). 
Total num frames: 975241216. Throughput: 0: 47582.2. Samples: 976446400. Policy #0 lag: (min: 0.0, avg: 37.6, max: 85.0) [2024-03-21 04:33:05,531][03784] Avg episode reward: [(0, '0.685')] [2024-03-21 04:33:10,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48606.0, 300 sec: 46652.7). Total num frames: 975437824. Throughput: 0: 47313.3. Samples: 976721100. Policy #0 lag: (min: 0.0, avg: 37.6, max: 85.0) [2024-03-21 04:33:10,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 04:33:13,309][04017] Updated weights for policy 0, policy_version 29770 (0.0013) [2024-03-21 04:33:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 975601664. Throughput: 0: 47697.7. Samples: 976876000. Policy #0 lag: (min: 0.0, avg: 37.6, max: 85.0) [2024-03-21 04:33:15,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 04:33:16,835][03995] Signal inference workers to stop experience collection... (19600 times) [2024-03-21 04:33:16,890][04017] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-03-21 04:33:16,903][03995] Signal inference workers to resume experience collection... (19600 times) [2024-03-21 04:33:16,935][04017] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-03-21 04:33:20,458][04017] Updated weights for policy 0, policy_version 29780 (0.0011) [2024-03-21 04:33:20,521][03784] Fps is (10 sec: 39320.9, 60 sec: 47513.5, 300 sec: 46986.0). Total num frames: 975831040. Throughput: 0: 47595.5. Samples: 977163100. Policy #0 lag: (min: 0.0, avg: 37.6, max: 85.0) [2024-03-21 04:33:20,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 04:33:25,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 47208.1). Total num frames: 975994880. Throughput: 0: 47573.2. Samples: 977439400. 
Policy #0 lag: (min: 0.0, avg: 37.6, max: 85.0) [2024-03-21 04:33:25,522][03784] Avg episode reward: [(0, '1.252')] [2024-03-21 04:33:26,862][04017] Updated weights for policy 0, policy_version 29790 (0.0021) [2024-03-21 04:33:30,521][03784] Fps is (10 sec: 62260.6, 60 sec: 49152.1, 300 sec: 48207.9). Total num frames: 976453632. Throughput: 0: 47042.2. Samples: 977560100. Policy #0 lag: (min: 0.0, avg: 34.2, max: 120.0) [2024-03-21 04:33:30,522][03784] Avg episode reward: [(0, '1.041')] [2024-03-21 04:33:30,579][04017] Updated weights for policy 0, policy_version 29800 (0.0022) [2024-03-21 04:33:35,521][03784] Fps is (10 sec: 58983.6, 60 sec: 45329.2, 300 sec: 47541.4). Total num frames: 976584704. Throughput: 0: 46926.8. Samples: 977834900. Policy #0 lag: (min: 0.0, avg: 34.2, max: 120.0) [2024-03-21 04:33:35,521][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 04:33:37,653][04017] Updated weights for policy 0, policy_version 29810 (0.0013) [2024-03-21 04:33:40,521][03784] Fps is (10 sec: 49151.6, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 976945152. Throughput: 0: 45880.0. Samples: 978080200. Policy #0 lag: (min: 0.0, avg: 34.2, max: 120.0) [2024-03-21 04:33:40,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 04:33:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 977043456. Throughput: 0: 46577.8. Samples: 978236900. Policy #0 lag: (min: 0.0, avg: 34.2, max: 120.0) [2024-03-21 04:33:45,522][03784] Avg episode reward: [(0, '1.508')] [2024-03-21 04:33:47,001][04017] Updated weights for policy 0, policy_version 29820 (0.0024) [2024-03-21 04:33:50,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 977272832. Throughput: 0: 46542.3. Samples: 978540800. 
Policy #0 lag: (min: 0.0, avg: 32.3, max: 68.0) [2024-03-21 04:33:50,522][03784] Avg episode reward: [(0, '1.336')] [2024-03-21 04:33:53,111][04017] Updated weights for policy 0, policy_version 29830 (0.0012) [2024-03-21 04:33:55,521][03784] Fps is (10 sec: 52428.5, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 977567744. Throughput: 0: 46326.6. Samples: 978805800. Policy #0 lag: (min: 0.0, avg: 32.3, max: 68.0) [2024-03-21 04:33:55,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 04:33:58,324][04017] Updated weights for policy 0, policy_version 29840 (0.0010) [2024-03-21 04:34:00,521][03784] Fps is (10 sec: 58982.2, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 977862656. Throughput: 0: 45771.1. Samples: 978935700. Policy #0 lag: (min: 0.0, avg: 32.3, max: 68.0) [2024-03-21 04:34:00,522][03784] Avg episode reward: [(0, '0.889')] [2024-03-21 04:34:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029842_977862656.pth... [2024-03-21 04:34:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029493_966426624.pth [2024-03-21 04:34:02,751][03995] Signal inference workers to stop experience collection... (19650 times) [2024-03-21 04:34:02,833][04017] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-03-21 04:34:02,975][03995] Signal inference workers to resume experience collection... (19650 times) [2024-03-21 04:34:02,975][04017] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-03-21 04:34:05,521][03784] Fps is (10 sec: 45874.7, 60 sec: 46421.3, 300 sec: 47319.2). Total num frames: 978026496. Throughput: 0: 45882.3. Samples: 979227800. 
Policy #0 lag: (min: 0.0, avg: 32.3, max: 68.0) [2024-03-21 04:34:05,522][03784] Avg episode reward: [(0, '0.971')] [2024-03-21 04:34:06,642][04017] Updated weights for policy 0, policy_version 29850 (0.0010) [2024-03-21 04:34:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 978223104. Throughput: 0: 46035.7. Samples: 979511000. Policy #0 lag: (min: 1.0, avg: 46.7, max: 108.0) [2024-03-21 04:34:10,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-21 04:34:15,464][04017] Updated weights for policy 0, policy_version 29860 (0.0012) [2024-03-21 04:34:15,521][03784] Fps is (10 sec: 42598.9, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 978452480. Throughput: 0: 46668.8. Samples: 979660200. Policy #0 lag: (min: 1.0, avg: 46.7, max: 108.0) [2024-03-21 04:34:15,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 04:34:20,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46967.5, 300 sec: 47097.0). Total num frames: 978649088. Throughput: 0: 46468.7. Samples: 979926000. Policy #0 lag: (min: 1.0, avg: 46.7, max: 108.0) [2024-03-21 04:34:20,522][03784] Avg episode reward: [(0, '1.258')] [2024-03-21 04:34:22,229][04017] Updated weights for policy 0, policy_version 29870 (0.0010) [2024-03-21 04:34:25,521][03784] Fps is (10 sec: 36044.4, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 978812928. Throughput: 0: 47613.2. Samples: 980222800. Policy #0 lag: (min: 1.0, avg: 46.7, max: 108.0) [2024-03-21 04:34:25,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 04:34:30,521][03784] Fps is (10 sec: 32768.3, 60 sec: 42052.2, 300 sec: 46652.8). Total num frames: 978976768. Throughput: 0: 47299.9. Samples: 980365400. 
Policy #0 lag: (min: 1.0, avg: 36.8, max: 71.0) [2024-03-21 04:34:30,522][03784] Avg episode reward: [(0, '1.462')] [2024-03-21 04:34:31,796][04017] Updated weights for policy 0, policy_version 29880 (0.0014) [2024-03-21 04:34:35,521][03784] Fps is (10 sec: 55706.3, 60 sec: 46421.3, 300 sec: 46986.0). Total num frames: 979369984. Throughput: 0: 46984.5. Samples: 980655100. Policy #0 lag: (min: 1.0, avg: 36.8, max: 71.0) [2024-03-21 04:34:35,522][03784] Avg episode reward: [(0, '1.638')] [2024-03-21 04:34:35,981][04017] Updated weights for policy 0, policy_version 29890 (0.0017) [2024-03-21 04:34:40,521][03784] Fps is (10 sec: 62258.8, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 979599360. Throughput: 0: 47599.9. Samples: 980947800. Policy #0 lag: (min: 1.0, avg: 36.8, max: 71.0) [2024-03-21 04:34:40,522][03784] Avg episode reward: [(0, '1.638')] [2024-03-21 04:34:43,295][04017] Updated weights for policy 0, policy_version 29900 (0.0019) [2024-03-21 04:34:45,521][03784] Fps is (10 sec: 58982.3, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 979959808. Throughput: 0: 47580.0. Samples: 981076800. Policy #0 lag: (min: 1.0, avg: 36.8, max: 71.0) [2024-03-21 04:34:45,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 04:34:46,484][04017] Updated weights for policy 0, policy_version 29910 (0.0011) [2024-03-21 04:34:50,521][03784] Fps is (10 sec: 65535.8, 60 sec: 49698.0, 300 sec: 47208.1). Total num frames: 980254720. Throughput: 0: 46364.5. Samples: 981314200. Policy #0 lag: (min: 4.0, avg: 45.1, max: 122.0) [2024-03-21 04:34:50,522][03784] Avg episode reward: [(0, '0.658')] [2024-03-21 04:34:54,671][03995] Signal inference workers to stop experience collection... (19700 times) [2024-03-21 04:34:54,671][03995] Signal inference workers to resume experience collection... 
(19700 times) [2024-03-21 04:34:54,677][04017] Updated weights for policy 0, policy_version 29920 (0.0013) [2024-03-21 04:34:54,712][04017] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-03-21 04:34:54,712][04017] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-03-21 04:34:55,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 980451328. Throughput: 0: 46673.3. Samples: 981611300. Policy #0 lag: (min: 4.0, avg: 45.1, max: 122.0) [2024-03-21 04:34:55,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 04:35:00,521][03784] Fps is (10 sec: 42598.8, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 980680704. Throughput: 0: 46453.3. Samples: 981750600. Policy #0 lag: (min: 4.0, avg: 45.1, max: 122.0) [2024-03-21 04:35:00,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 04:35:03,714][04017] Updated weights for policy 0, policy_version 29930 (0.0010) [2024-03-21 04:35:05,521][03784] Fps is (10 sec: 39321.0, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 980844544. Throughput: 0: 46962.1. Samples: 982039300. Policy #0 lag: (min: 4.0, avg: 45.1, max: 122.0) [2024-03-21 04:35:05,522][03784] Avg episode reward: [(0, '1.467')] [2024-03-21 04:35:10,521][03784] Fps is (10 sec: 36045.3, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 981041152. Throughput: 0: 46306.9. Samples: 982306600. Policy #0 lag: (min: 2.0, avg: 33.3, max: 66.0) [2024-03-21 04:35:10,521][03784] Avg episode reward: [(0, '0.798')] [2024-03-21 04:35:10,824][04017] Updated weights for policy 0, policy_version 29940 (0.0012) [2024-03-21 04:35:15,521][03784] Fps is (10 sec: 42598.7, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 981270528. Throughput: 0: 46331.0. Samples: 982450300. 
Policy #0 lag: (min: 2.0, avg: 33.3, max: 66.0) [2024-03-21 04:35:15,522][03784] Avg episode reward: [(0, '1.706')] [2024-03-21 04:35:16,852][04017] Updated weights for policy 0, policy_version 29950 (0.0032) [2024-03-21 04:35:20,521][03784] Fps is (10 sec: 49151.2, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 981532672. Throughput: 0: 46513.3. Samples: 982748200. Policy #0 lag: (min: 2.0, avg: 33.3, max: 66.0) [2024-03-21 04:35:20,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 04:35:25,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 981696512. Throughput: 0: 47037.8. Samples: 983064500. Policy #0 lag: (min: 2.0, avg: 33.3, max: 66.0) [2024-03-21 04:35:25,522][03784] Avg episode reward: [(0, '0.927')] [2024-03-21 04:35:26,727][04017] Updated weights for policy 0, policy_version 29960 (0.0012) [2024-03-21 04:35:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 49698.1, 300 sec: 46319.5). Total num frames: 981958656. Throughput: 0: 47433.3. Samples: 983211300. Policy #0 lag: (min: 0.0, avg: 34.5, max: 117.0) [2024-03-21 04:35:30,522][03784] Avg episode reward: [(0, '0.770')] [2024-03-21 04:35:33,110][04017] Updated weights for policy 0, policy_version 29970 (0.0010) [2024-03-21 04:35:35,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46421.3, 300 sec: 46541.7). Total num frames: 982155264. Throughput: 0: 48557.9. Samples: 983499300. Policy #0 lag: (min: 0.0, avg: 34.5, max: 117.0) [2024-03-21 04:35:35,522][03784] Avg episode reward: [(0, '0.770')] [2024-03-21 04:35:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 982319104. Throughput: 0: 48366.6. Samples: 983787800. 
Policy #0 lag: (min: 0.0, avg: 34.5, max: 117.0) [2024-03-21 04:35:40,522][03784] Avg episode reward: [(0, '0.614')] [2024-03-21 04:35:41,173][04017] Updated weights for policy 0, policy_version 29980 (0.0014) [2024-03-21 04:35:44,719][04017] Updated weights for policy 0, policy_version 29990 (0.0022) [2024-03-21 04:35:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 982777856. Throughput: 0: 48237.8. Samples: 983921300. Policy #0 lag: (min: 0.0, avg: 34.5, max: 117.0) [2024-03-21 04:35:45,522][03784] Avg episode reward: [(0, '0.971')] [2024-03-21 04:35:45,653][03995] Signal inference workers to stop experience collection... (19750 times) [2024-03-21 04:35:45,697][04017] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-03-21 04:35:45,956][03995] Signal inference workers to resume experience collection... (19750 times) [2024-03-21 04:35:45,956][04017] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-03-21 04:35:48,841][04017] Updated weights for policy 0, policy_version 30000 (0.0025) [2024-03-21 04:35:50,521][03784] Fps is (10 sec: 78643.7, 60 sec: 47513.7, 300 sec: 48207.8). Total num frames: 983105536. Throughput: 0: 47622.4. Samples: 984182300. Policy #0 lag: (min: 2.0, avg: 50.8, max: 105.0) [2024-03-21 04:35:50,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 04:35:55,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 47985.7). Total num frames: 983171072. Throughput: 0: 48699.9. Samples: 984498100. Policy #0 lag: (min: 2.0, avg: 50.8, max: 105.0) [2024-03-21 04:35:55,522][03784] Avg episode reward: [(0, '0.723')] [2024-03-21 04:35:59,307][04017] Updated weights for policy 0, policy_version 30010 (0.0019) [2024-03-21 04:36:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45875.2, 300 sec: 47985.7). Total num frames: 983433216. Throughput: 0: 48560.1. Samples: 984635500. 
Policy #0 lag: (min: 2.0, avg: 50.8, max: 105.0) [2024-03-21 04:36:00,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 04:36:00,857][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030013_983465984.pth... [2024-03-21 04:36:00,976][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029664_972029952.pth [2024-03-21 04:36:05,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.4, 300 sec: 47430.3). Total num frames: 983597056. Throughput: 0: 47742.3. Samples: 984896600. Policy #0 lag: (min: 2.0, avg: 50.8, max: 105.0) [2024-03-21 04:36:05,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 04:36:07,965][04017] Updated weights for policy 0, policy_version 30020 (0.0012) [2024-03-21 04:36:10,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46421.2, 300 sec: 46874.9). Total num frames: 983826432. Throughput: 0: 46662.2. Samples: 985164300. Policy #0 lag: (min: 0.0, avg: 48.1, max: 119.0) [2024-03-21 04:36:10,522][03784] Avg episode reward: [(0, '1.746')] [2024-03-21 04:36:11,039][03995] Saving new best policy, reward=1.746! [2024-03-21 04:36:12,954][04017] Updated weights for policy 0, policy_version 30030 (0.0011) [2024-03-21 04:36:15,521][03784] Fps is (10 sec: 65535.6, 60 sec: 49698.2, 300 sec: 47430.3). Total num frames: 984252416. Throughput: 0: 46128.9. Samples: 985287100. Policy #0 lag: (min: 0.0, avg: 48.1, max: 119.0) [2024-03-21 04:36:15,522][03784] Avg episode reward: [(0, '0.501')] [2024-03-21 04:36:18,874][04017] Updated weights for policy 0, policy_version 30040 (0.0010) [2024-03-21 04:36:20,521][03784] Fps is (10 sec: 65535.9, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 984481792. Throughput: 0: 46004.4. Samples: 985569500. Policy #0 lag: (min: 0.0, avg: 48.1, max: 119.0) [2024-03-21 04:36:20,522][03784] Avg episode reward: [(0, '1.227')] [2024-03-21 04:36:25,521][03784] Fps is (10 sec: 42598.6, 60 sec: 49698.2, 300 sec: 47097.1). Total num frames: 984678400. 
Throughput: 0: 46069.0. Samples: 985860900. Policy #0 lag: (min: 0.0, avg: 48.1, max: 119.0) [2024-03-21 04:36:25,522][03784] Avg episode reward: [(0, '1.429')] [2024-03-21 04:36:25,606][04017] Updated weights for policy 0, policy_version 30051 (0.0018) [2024-03-21 04:36:30,521][03784] Fps is (10 sec: 42598.8, 60 sec: 49152.1, 300 sec: 47208.1). Total num frames: 984907776. Throughput: 0: 46057.8. Samples: 985993900. Policy #0 lag: (min: 1.0, avg: 35.1, max: 81.0) [2024-03-21 04:36:30,522][03784] Avg episode reward: [(0, '0.920')] [2024-03-21 04:36:34,091][04017] Updated weights for policy 0, policy_version 30061 (0.0016) [2024-03-21 04:36:35,521][03784] Fps is (10 sec: 39321.4, 60 sec: 48605.8, 300 sec: 47208.2). Total num frames: 985071616. Throughput: 0: 46566.6. Samples: 986277800. Policy #0 lag: (min: 1.0, avg: 35.1, max: 81.0) [2024-03-21 04:36:35,522][03784] Avg episode reward: [(0, '1.426')] [2024-03-21 04:36:40,521][03784] Fps is (10 sec: 32767.6, 60 sec: 48605.8, 300 sec: 47097.0). Total num frames: 985235456. Throughput: 0: 46242.1. Samples: 986579000. Policy #0 lag: (min: 1.0, avg: 35.1, max: 81.0) [2024-03-21 04:36:40,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 04:36:42,745][04017] Updated weights for policy 0, policy_version 30071 (0.0020) [2024-03-21 04:36:45,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43144.5, 300 sec: 46652.7). Total num frames: 985366528. Throughput: 0: 46288.9. Samples: 986718500. Policy #0 lag: (min: 1.0, avg: 35.1, max: 81.0) [2024-03-21 04:36:45,522][03784] Avg episode reward: [(0, '1.256')] [2024-03-21 04:36:46,961][03995] Signal inference workers to stop experience collection... (19800 times) [2024-03-21 04:36:47,033][04017] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-03-21 04:36:47,038][03995] Signal inference workers to resume experience collection... 
(19800 times) [2024-03-21 04:36:47,091][04017] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-03-21 04:36:49,766][04017] Updated weights for policy 0, policy_version 30081 (0.0011) [2024-03-21 04:36:50,521][03784] Fps is (10 sec: 49152.1, 60 sec: 43690.6, 300 sec: 46874.9). Total num frames: 985726976. Throughput: 0: 46966.6. Samples: 987010100. Policy #0 lag: (min: 3.0, avg: 41.8, max: 107.0) [2024-03-21 04:36:50,522][03784] Avg episode reward: [(0, '0.486')] [2024-03-21 04:36:55,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 985956352. Throughput: 0: 47071.1. Samples: 987282500. Policy #0 lag: (min: 3.0, avg: 41.8, max: 107.0) [2024-03-21 04:36:55,522][03784] Avg episode reward: [(0, '1.135')] [2024-03-21 04:36:57,528][04017] Updated weights for policy 0, policy_version 30091 (0.0011) [2024-03-21 04:37:00,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45329.1, 300 sec: 47652.5). Total num frames: 986152960. Throughput: 0: 47524.5. Samples: 987425700. Policy #0 lag: (min: 3.0, avg: 41.8, max: 107.0) [2024-03-21 04:37:00,522][03784] Avg episode reward: [(0, '0.647')] [2024-03-21 04:37:04,229][04017] Updated weights for policy 0, policy_version 30101 (0.0011) [2024-03-21 04:37:05,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 986480640. Throughput: 0: 47684.5. Samples: 987715300. Policy #0 lag: (min: 3.0, avg: 41.8, max: 107.0) [2024-03-21 04:37:05,522][03784] Avg episode reward: [(0, '0.956')] [2024-03-21 04:37:08,880][04017] Updated weights for policy 0, policy_version 30111 (0.0019) [2024-03-21 04:37:10,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49152.0, 300 sec: 47097.1). Total num frames: 986775552. Throughput: 0: 47251.0. Samples: 987987200. 
Policy #0 lag: (min: 0.0, avg: 53.2, max: 128.0) [2024-03-21 04:37:10,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 04:37:15,122][04017] Updated weights for policy 0, policy_version 30121 (0.0015) [2024-03-21 04:37:15,521][03784] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 987004928. Throughput: 0: 47802.2. Samples: 988145000. Policy #0 lag: (min: 0.0, avg: 53.2, max: 128.0) [2024-03-21 04:37:15,522][03784] Avg episode reward: [(0, '1.211')] [2024-03-21 04:37:20,521][03784] Fps is (10 sec: 32767.5, 60 sec: 43690.6, 300 sec: 46985.9). Total num frames: 987103232. Throughput: 0: 47828.7. Samples: 988430100. Policy #0 lag: (min: 0.0, avg: 53.2, max: 128.0) [2024-03-21 04:37:20,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 04:37:24,421][04017] Updated weights for policy 0, policy_version 30131 (0.0019) [2024-03-21 04:37:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 987365376. Throughput: 0: 46749.0. Samples: 988682700. Policy #0 lag: (min: 0.0, avg: 53.2, max: 128.0) [2024-03-21 04:37:25,522][03784] Avg episode reward: [(0, '1.214')] [2024-03-21 04:37:30,521][03784] Fps is (10 sec: 52429.6, 60 sec: 45329.0, 300 sec: 46652.8). Total num frames: 987627520. Throughput: 0: 46848.8. Samples: 988826700. Policy #0 lag: (min: 2.0, avg: 37.4, max: 78.0) [2024-03-21 04:37:30,522][03784] Avg episode reward: [(0, '1.502')] [2024-03-21 04:37:31,078][04017] Updated weights for policy 0, policy_version 30141 (0.0010) [2024-03-21 04:37:35,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 46652.7). Total num frames: 987856896. Throughput: 0: 46817.8. Samples: 989116900. Policy #0 lag: (min: 2.0, avg: 37.4, max: 78.0) [2024-03-21 04:37:35,522][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 04:37:38,156][04017] Updated weights for policy 0, policy_version 30151 (0.0014) [2024-03-21 04:37:38,581][03995] Signal inference workers to stop experience collection... 
(19850 times) [2024-03-21 04:37:38,652][04017] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-03-21 04:37:38,813][03995] Signal inference workers to resume experience collection... (19850 times) [2024-03-21 04:37:38,813][04017] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-03-21 04:37:40,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 988119040. Throughput: 0: 47246.7. Samples: 989408600. Policy #0 lag: (min: 2.0, avg: 37.4, max: 78.0) [2024-03-21 04:37:40,522][03784] Avg episode reward: [(0, '1.009')] [2024-03-21 04:37:44,656][04017] Updated weights for policy 0, policy_version 30161 (0.0010) [2024-03-21 04:37:45,521][03784] Fps is (10 sec: 52428.0, 60 sec: 50244.1, 300 sec: 46652.7). Total num frames: 988381184. Throughput: 0: 47319.8. Samples: 989555100. Policy #0 lag: (min: 2.0, avg: 37.4, max: 78.0) [2024-03-21 04:37:45,531][03784] Avg episode reward: [(0, '0.955')] [2024-03-21 04:37:50,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 988545024. Throughput: 0: 47686.7. Samples: 989861200. Policy #0 lag: (min: 0.0, avg: 39.2, max: 116.0) [2024-03-21 04:37:50,522][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 04:37:52,553][04017] Updated weights for policy 0, policy_version 30171 (0.0014) [2024-03-21 04:37:55,521][03784] Fps is (10 sec: 42599.5, 60 sec: 47513.7, 300 sec: 46652.7). Total num frames: 988807168. Throughput: 0: 48184.5. Samples: 990155500. Policy #0 lag: (min: 0.0, avg: 39.2, max: 116.0) [2024-03-21 04:37:55,531][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 04:38:00,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 988905472. Throughput: 0: 47973.3. Samples: 990303800. 
Policy #0 lag: (min: 0.0, avg: 39.2, max: 116.0) [2024-03-21 04:38:00,530][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 04:38:00,543][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030179_988905472.pth... [2024-03-21 04:38:00,663][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000029842_977862656.pth [2024-03-21 04:38:02,997][04017] Updated weights for policy 0, policy_version 30181 (0.0019) [2024-03-21 04:38:05,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 989167616. Throughput: 0: 47713.5. Samples: 990577200. Policy #0 lag: (min: 0.0, avg: 39.2, max: 116.0) [2024-03-21 04:38:05,522][03784] Avg episode reward: [(0, '0.746')] [2024-03-21 04:38:08,489][04017] Updated weights for policy 0, policy_version 30191 (0.0012) [2024-03-21 04:38:10,521][03784] Fps is (10 sec: 55705.9, 60 sec: 44783.0, 300 sec: 46986.0). Total num frames: 989462528. Throughput: 0: 47826.7. Samples: 990834900. Policy #0 lag: (min: 1.0, avg: 46.6, max: 113.0) [2024-03-21 04:38:10,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 04:38:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 46652.8). Total num frames: 989593600. Throughput: 0: 47662.3. Samples: 990971500. Policy #0 lag: (min: 1.0, avg: 46.6, max: 113.0) [2024-03-21 04:38:15,522][03784] Avg episode reward: [(0, '1.415')] [2024-03-21 04:38:15,643][04017] Updated weights for policy 0, policy_version 30201 (0.0011) [2024-03-21 04:38:19,648][04017] Updated weights for policy 0, policy_version 30211 (0.0011) [2024-03-21 04:38:20,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 989986816. Throughput: 0: 47755.6. Samples: 991265900. Policy #0 lag: (min: 1.0, avg: 46.6, max: 113.0) [2024-03-21 04:38:20,522][03784] Avg episode reward: [(0, '1.153')] [2024-03-21 04:38:25,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46967.5, 300 sec: 46541.7). 
Total num frames: 990183424. Throughput: 0: 47728.9. Samples: 991556400. Policy #0 lag: (min: 1.0, avg: 46.6, max: 113.0) [2024-03-21 04:38:25,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 04:38:26,853][04017] Updated weights for policy 0, policy_version 30221 (0.0011) [2024-03-21 04:38:30,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48059.6, 300 sec: 47208.1). Total num frames: 990511104. Throughput: 0: 47626.8. Samples: 991698300. Policy #0 lag: (min: 0.0, avg: 40.5, max: 76.0) [2024-03-21 04:38:30,522][03784] Avg episode reward: [(0, '0.680')] [2024-03-21 04:38:31,518][04017] Updated weights for policy 0, policy_version 30231 (0.0020) [2024-03-21 04:38:34,081][03995] Signal inference workers to stop experience collection... (19900 times) [2024-03-21 04:38:34,128][04017] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-03-21 04:38:34,366][03995] Signal inference workers to resume experience collection... (19900 times) [2024-03-21 04:38:34,366][04017] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-03-21 04:38:35,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 990773248. Throughput: 0: 47140.1. Samples: 991982500. Policy #0 lag: (min: 0.0, avg: 40.5, max: 76.0) [2024-03-21 04:38:35,522][03784] Avg episode reward: [(0, '1.053')] [2024-03-21 04:38:37,544][04017] Updated weights for policy 0, policy_version 30241 (0.0018) [2024-03-21 04:38:40,521][03784] Fps is (10 sec: 49152.4, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 991002624. Throughput: 0: 46606.6. Samples: 992252800. Policy #0 lag: (min: 0.0, avg: 40.5, max: 76.0) [2024-03-21 04:38:40,522][03784] Avg episode reward: [(0, '1.315')] [2024-03-21 04:38:45,521][03784] Fps is (10 sec: 45874.9, 60 sec: 47513.7, 300 sec: 47319.2). Total num frames: 991232000. Throughput: 0: 46531.1. Samples: 992397700. 
Policy #0 lag: (min: 0.0, avg: 40.5, max: 76.0) [2024-03-21 04:38:45,522][03784] Avg episode reward: [(0, '1.315')] [2024-03-21 04:38:46,109][04017] Updated weights for policy 0, policy_version 30251 (0.0011) [2024-03-21 04:38:50,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 47097.0). Total num frames: 991461376. Throughput: 0: 47364.3. Samples: 992708600. Policy #0 lag: (min: 0.0, avg: 29.2, max: 71.0) [2024-03-21 04:38:50,522][03784] Avg episode reward: [(0, '1.138')] [2024-03-21 04:38:54,292][04017] Updated weights for policy 0, policy_version 30261 (0.0017) [2024-03-21 04:38:55,521][03784] Fps is (10 sec: 39321.9, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 991625216. Throughput: 0: 47720.0. Samples: 992982300. Policy #0 lag: (min: 0.0, avg: 29.2, max: 71.0) [2024-03-21 04:38:55,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 04:39:00,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46967.4, 300 sec: 46430.6). Total num frames: 991723520. Throughput: 0: 48297.6. Samples: 993144900. Policy #0 lag: (min: 0.0, avg: 29.2, max: 71.0) [2024-03-21 04:39:00,523][03784] Avg episode reward: [(0, '1.151')] [2024-03-21 04:39:03,287][04017] Updated weights for policy 0, policy_version 30271 (0.0018) [2024-03-21 04:39:05,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 992083968. Throughput: 0: 48153.3. Samples: 993432800. Policy #0 lag: (min: 0.0, avg: 29.2, max: 71.0) [2024-03-21 04:39:05,522][03784] Avg episode reward: [(0, '0.585')] [2024-03-21 04:39:10,521][03784] Fps is (10 sec: 42599.0, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 992149504. Throughput: 0: 48253.3. Samples: 993727800. Policy #0 lag: (min: 0.0, avg: 38.1, max: 111.0) [2024-03-21 04:39:10,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 04:39:11,302][04017] Updated weights for policy 0, policy_version 30281 (0.0012) [2024-03-21 04:39:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 47513.6, 300 sec: 46763.8). 
Total num frames: 992444416. Throughput: 0: 47737.9. Samples: 993846500. Policy #0 lag: (min: 0.0, avg: 38.1, max: 111.0) [2024-03-21 04:39:15,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 04:39:20,521][03784] Fps is (10 sec: 36044.7, 60 sec: 42052.3, 300 sec: 46430.6). Total num frames: 992509952. Throughput: 0: 47706.6. Samples: 994129300. Policy #0 lag: (min: 0.0, avg: 38.1, max: 111.0) [2024-03-21 04:39:20,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 04:39:22,496][04017] Updated weights for policy 0, policy_version 30291 (0.0016) [2024-03-21 04:39:25,345][03995] Signal inference workers to stop experience collection... (19950 times) [2024-03-21 04:39:25,417][04017] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-03-21 04:39:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 47097.1). Total num frames: 992870400. Throughput: 0: 46889.0. Samples: 994362800. Policy #0 lag: (min: 0.0, avg: 38.1, max: 111.0) [2024-03-21 04:39:25,522][03784] Avg episode reward: [(0, '0.823')] [2024-03-21 04:39:25,605][03995] Signal inference workers to resume experience collection... (19950 times) [2024-03-21 04:39:25,606][04017] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-03-21 04:39:25,611][04017] Updated weights for policy 0, policy_version 30301 (0.0025) [2024-03-21 04:39:28,825][04017] Updated weights for policy 0, policy_version 30311 (0.0018) [2024-03-21 04:39:30,521][03784] Fps is (10 sec: 88472.5, 60 sec: 48059.7, 300 sec: 47541.3). Total num frames: 993394688. Throughput: 0: 46126.6. Samples: 994473400. Policy #0 lag: (min: 9.0, avg: 54.8, max: 100.0) [2024-03-21 04:39:30,522][03784] Avg episode reward: [(0, '1.034')] [2024-03-21 04:39:35,348][04017] Updated weights for policy 0, policy_version 30321 (0.0017) [2024-03-21 04:39:35,521][03784] Fps is (10 sec: 68813.0, 60 sec: 46421.4, 300 sec: 47319.2). Total num frames: 993558528. Throughput: 0: 45484.6. Samples: 994755400. 
Policy #0 lag: (min: 9.0, avg: 54.8, max: 100.0) [2024-03-21 04:39:35,522][03784] Avg episode reward: [(0, '1.034')] [2024-03-21 04:39:39,625][04017] Updated weights for policy 0, policy_version 30331 (0.0016) [2024-03-21 04:39:40,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 993918976. Throughput: 0: 45271.1. Samples: 995019500. Policy #0 lag: (min: 9.0, avg: 54.8, max: 100.0) [2024-03-21 04:39:40,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 04:39:45,521][03784] Fps is (10 sec: 52428.7, 60 sec: 47513.7, 300 sec: 46874.9). Total num frames: 994082816. Throughput: 0: 45069.0. Samples: 995173000. Policy #0 lag: (min: 9.0, avg: 54.8, max: 100.0) [2024-03-21 04:39:45,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 04:39:47,573][04017] Updated weights for policy 0, policy_version 30341 (0.0016) [2024-03-21 04:39:50,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 994279424. Throughput: 0: 45566.6. Samples: 995483300. Policy #0 lag: (min: 0.0, avg: 38.2, max: 82.0) [2024-03-21 04:39:50,522][03784] Avg episode reward: [(0, '1.013')] [2024-03-21 04:39:55,521][03784] Fps is (10 sec: 42598.3, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 994508800. Throughput: 0: 45713.3. Samples: 995784900. Policy #0 lag: (min: 0.0, avg: 38.2, max: 82.0) [2024-03-21 04:39:55,522][03784] Avg episode reward: [(0, '1.013')] [2024-03-21 04:39:56,908][04017] Updated weights for policy 0, policy_version 30351 (0.0019) [2024-03-21 04:40:00,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 994639872. Throughput: 0: 46651.0. Samples: 995945800. Policy #0 lag: (min: 0.0, avg: 38.2, max: 82.0) [2024-03-21 04:40:00,522][03784] Avg episode reward: [(0, '0.842')] [2024-03-21 04:40:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030354_994639872.pth... 
[2024-03-21 04:40:00,697][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030013_983465984.pth [2024-03-21 04:40:05,521][03784] Fps is (10 sec: 22937.6, 60 sec: 44236.8, 300 sec: 46430.6). Total num frames: 994738176. Throughput: 0: 46815.5. Samples: 996236000. Policy #0 lag: (min: 0.0, avg: 38.2, max: 82.0) [2024-03-21 04:40:05,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 04:40:07,507][04017] Updated weights for policy 0, policy_version 30361 (0.0010) [2024-03-21 04:40:10,521][03784] Fps is (10 sec: 42599.1, 60 sec: 48605.9, 300 sec: 46763.9). Total num frames: 995065856. Throughput: 0: 47584.5. Samples: 996504100. Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 04:40:10,522][03784] Avg episode reward: [(0, '1.667')] [2024-03-21 04:40:11,590][04017] Updated weights for policy 0, policy_version 30371 (0.0011) [2024-03-21 04:40:14,876][03995] Signal inference workers to stop experience collection... (20000 times) [2024-03-21 04:40:14,939][03995] Signal inference workers to resume experience collection... (20000 times) [2024-03-21 04:40:14,948][04017] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-03-21 04:40:14,985][04017] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-03-21 04:40:15,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48059.7, 300 sec: 46763.8). Total num frames: 995328000. Throughput: 0: 48100.1. Samples: 996637900. Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 04:40:15,522][03784] Avg episode reward: [(0, '1.649')] [2024-03-21 04:40:18,505][04017] Updated weights for policy 0, policy_version 30381 (0.0021) [2024-03-21 04:40:20,521][03784] Fps is (10 sec: 52428.2, 60 sec: 51336.5, 300 sec: 47097.1). Total num frames: 995590144. Throughput: 0: 48108.8. Samples: 996920300. 
Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 04:40:20,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 04:40:25,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 995786752. Throughput: 0: 48702.3. Samples: 997211100. Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 04:40:25,522][03784] Avg episode reward: [(0, '0.841')] [2024-03-21 04:40:27,031][04017] Updated weights for policy 0, policy_version 30391 (0.0013) [2024-03-21 04:40:30,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44783.0, 300 sec: 47208.1). Total num frames: 996081664. Throughput: 0: 48528.8. Samples: 997356800. Policy #0 lag: (min: 1.0, avg: 33.4, max: 90.0) [2024-03-21 04:40:30,522][03784] Avg episode reward: [(0, '1.509')] [2024-03-21 04:40:31,587][04017] Updated weights for policy 0, policy_version 30401 (0.0021) [2024-03-21 04:40:35,521][03784] Fps is (10 sec: 62259.1, 60 sec: 47513.6, 300 sec: 47763.5). Total num frames: 996409344. Throughput: 0: 47653.4. Samples: 997627700. Policy #0 lag: (min: 1.0, avg: 33.4, max: 90.0) [2024-03-21 04:40:35,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 04:40:37,594][04017] Updated weights for policy 0, policy_version 30411 (0.0013) [2024-03-21 04:40:40,521][03784] Fps is (10 sec: 62258.9, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 996704256. Throughput: 0: 46477.7. Samples: 997876400. Policy #0 lag: (min: 1.0, avg: 33.4, max: 90.0) [2024-03-21 04:40:40,522][03784] Avg episode reward: [(0, '0.521')] [2024-03-21 04:40:43,381][04017] Updated weights for policy 0, policy_version 30421 (0.0011) [2024-03-21 04:40:45,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45875.3, 300 sec: 46541.7). Total num frames: 996835328. Throughput: 0: 46442.4. Samples: 998035700. Policy #0 lag: (min: 1.0, avg: 33.4, max: 90.0) [2024-03-21 04:40:45,521][03784] Avg episode reward: [(0, '0.521')] [2024-03-21 04:40:50,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45329.1, 300 sec: 46874.9). 
Total num frames: 996999168. Throughput: 0: 46891.0. Samples: 998346100. Policy #0 lag: (min: 0.0, avg: 45.5, max: 88.0) [2024-03-21 04:40:50,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 04:40:52,394][04017] Updated weights for policy 0, policy_version 30431 (0.0018) [2024-03-21 04:40:55,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46421.4, 300 sec: 46986.0). Total num frames: 997294080. Throughput: 0: 46993.3. Samples: 998618800. Policy #0 lag: (min: 0.0, avg: 45.5, max: 88.0) [2024-03-21 04:40:55,522][03784] Avg episode reward: [(0, '1.061')] [2024-03-21 04:40:58,916][04017] Updated weights for policy 0, policy_version 30441 (0.0014) [2024-03-21 04:41:00,521][03784] Fps is (10 sec: 65536.2, 60 sec: 50244.3, 300 sec: 47652.4). Total num frames: 997654528. Throughput: 0: 47006.7. Samples: 998753200. Policy #0 lag: (min: 0.0, avg: 45.5, max: 88.0) [2024-03-21 04:41:00,522][03784] Avg episode reward: [(0, '0.908')] [2024-03-21 04:41:04,366][04017] Updated weights for policy 0, policy_version 30451 (0.0010) [2024-03-21 04:41:05,521][03784] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 47430.3). Total num frames: 997818368. Throughput: 0: 46946.8. Samples: 999032900. Policy #0 lag: (min: 0.0, avg: 45.5, max: 88.0) [2024-03-21 04:41:05,522][03784] Avg episode reward: [(0, '0.908')] [2024-03-21 04:41:06,587][03995] Signal inference workers to stop experience collection... (20050 times) [2024-03-21 04:41:06,699][04017] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-03-21 04:41:06,783][03995] Signal inference workers to resume experience collection... (20050 times) [2024-03-21 04:41:06,784][04017] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-03-21 04:41:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 49151.9, 300 sec: 46652.7). Total num frames: 998014976. Throughput: 0: 47302.2. Samples: 999339700. 
Policy #0 lag: (min: 0.0, avg: 39.2, max: 92.0) [2024-03-21 04:41:10,522][03784] Avg episode reward: [(0, '0.908')] [2024-03-21 04:41:15,521][03784] Fps is (10 sec: 29490.9, 60 sec: 46421.4, 300 sec: 46208.4). Total num frames: 998113280. Throughput: 0: 47444.5. Samples: 999491800. Policy #0 lag: (min: 0.0, avg: 39.2, max: 92.0) [2024-03-21 04:41:15,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 04:41:16,107][04017] Updated weights for policy 0, policy_version 30461 (0.0015) [2024-03-21 04:41:20,394][04017] Updated weights for policy 0, policy_version 30471 (0.0014) [2024-03-21 04:41:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48059.7, 300 sec: 46763.8). Total num frames: 998473728. Throughput: 0: 47682.2. Samples: 999773400. Policy #0 lag: (min: 0.0, avg: 39.2, max: 92.0) [2024-03-21 04:41:20,522][03784] Avg episode reward: [(0, '0.524')] [2024-03-21 04:41:25,521][03784] Fps is (10 sec: 62259.8, 60 sec: 49152.0, 300 sec: 46874.9). Total num frames: 998735872. Throughput: 0: 48631.3. Samples: 1000064800. Policy #0 lag: (min: 0.0, avg: 39.2, max: 92.0) [2024-03-21 04:41:25,522][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 04:41:26,708][04017] Updated weights for policy 0, policy_version 30481 (0.0011) [2024-03-21 04:41:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 998899712. Throughput: 0: 48339.8. Samples: 1000211000. Policy #0 lag: (min: 0.0, avg: 39.2, max: 92.0) [2024-03-21 04:41:30,522][03784] Avg episode reward: [(0, '1.617')] [2024-03-21 04:41:35,521][03784] Fps is (10 sec: 26214.1, 60 sec: 43144.5, 300 sec: 46652.7). Total num frames: 998998016. Throughput: 0: 48042.2. Samples: 1000508000. Policy #0 lag: (min: 0.0, avg: 39.7, max: 94.0) [2024-03-21 04:41:35,522][03784] Avg episode reward: [(0, '0.503')] [2024-03-21 04:41:36,783][04017] Updated weights for policy 0, policy_version 30491 (0.0017) [2024-03-21 04:41:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 47652.4). 
Total num frames: 999424000. Throughput: 0: 47264.3. Samples: 1000745700. Policy #0 lag: (min: 0.0, avg: 39.7, max: 94.0) [2024-03-21 04:41:40,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 04:41:40,749][04017] Updated weights for policy 0, policy_version 30501 (0.0017) [2024-03-21 04:41:45,521][03784] Fps is (10 sec: 65536.4, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 999653376. Throughput: 0: 47433.4. Samples: 1000887700. Policy #0 lag: (min: 0.0, avg: 39.7, max: 94.0) [2024-03-21 04:41:45,522][03784] Avg episode reward: [(0, '0.699')] [2024-03-21 04:41:47,360][04017] Updated weights for policy 0, policy_version 30511 (0.0011) [2024-03-21 04:41:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 999948288. Throughput: 0: 47615.4. Samples: 1001175600. Policy #0 lag: (min: 0.0, avg: 39.7, max: 94.0) [2024-03-21 04:41:50,522][03784] Avg episode reward: [(0, '1.189')] [2024-03-21 04:41:53,800][04017] Updated weights for policy 0, policy_version 30521 (0.0019) [2024-03-21 04:41:55,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48605.8, 300 sec: 47652.4). Total num frames: 1000210432. Throughput: 0: 47237.7. Samples: 1001465400. Policy #0 lag: (min: 0.0, avg: 36.2, max: 70.0) [2024-03-21 04:41:55,522][03784] Avg episode reward: [(0, '0.957')] [2024-03-21 04:41:55,867][03995] Signal inference workers to stop experience collection... (20100 times) [2024-03-21 04:41:55,868][03995] Signal inference workers to resume experience collection... (20100 times) [2024-03-21 04:41:55,929][04017] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-03-21 04:41:55,929][04017] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-03-21 04:42:00,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 47097.0). Total num frames: 1000374272. Throughput: 0: 47206.6. Samples: 1001616100. 
Policy #0 lag: (min: 0.0, avg: 36.2, max: 70.0) [2024-03-21 04:42:00,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 04:42:00,689][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030530_1000407040.pth... [2024-03-21 04:42:00,805][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030179_988905472.pth [2024-03-21 04:42:01,146][04017] Updated weights for policy 0, policy_version 30531 (0.0011) [2024-03-21 04:42:05,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 1000701952. Throughput: 0: 47444.5. Samples: 1001908400. Policy #0 lag: (min: 0.0, avg: 36.2, max: 70.0) [2024-03-21 04:42:05,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 04:42:07,718][04017] Updated weights for policy 0, policy_version 30541 (0.0016) [2024-03-21 04:42:10,521][03784] Fps is (10 sec: 52429.5, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 1000898560. Throughput: 0: 47282.2. Samples: 1002192500. Policy #0 lag: (min: 0.0, avg: 36.2, max: 70.0) [2024-03-21 04:42:10,522][03784] Avg episode reward: [(0, '1.232')] [2024-03-21 04:42:15,521][03784] Fps is (10 sec: 26214.5, 60 sec: 47513.7, 300 sec: 46986.0). Total num frames: 1000964096. Throughput: 0: 47553.4. Samples: 1002350900. Policy #0 lag: (min: 0.0, avg: 36.2, max: 70.0) [2024-03-21 04:42:15,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 04:42:19,654][04017] Updated weights for policy 0, policy_version 30551 (0.0010) [2024-03-21 04:42:20,521][03784] Fps is (10 sec: 26214.6, 60 sec: 44783.0, 300 sec: 46763.8). Total num frames: 1001160704. Throughput: 0: 47309.0. Samples: 1002636900. Policy #0 lag: (min: 1.0, avg: 24.8, max: 71.0) [2024-03-21 04:42:20,522][03784] Avg episode reward: [(0, '1.151')] [2024-03-21 04:42:23,153][04017] Updated weights for policy 0, policy_version 30561 (0.0018) [2024-03-21 04:42:25,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46421.3, 300 sec: 47097.1). 
Total num frames: 1001521152. Throughput: 0: 47597.9. Samples: 1002887600. Policy #0 lag: (min: 1.0, avg: 24.8, max: 71.0) [2024-03-21 04:42:25,522][03784] Avg episode reward: [(0, '0.854')] [2024-03-21 04:42:29,105][04017] Updated weights for policy 0, policy_version 30571 (0.0012) [2024-03-21 04:42:30,521][03784] Fps is (10 sec: 65534.7, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 1001816064. Throughput: 0: 47546.5. Samples: 1003027300. Policy #0 lag: (min: 1.0, avg: 24.8, max: 71.0) [2024-03-21 04:42:30,522][03784] Avg episode reward: [(0, '1.446')] [2024-03-21 04:42:35,521][03784] Fps is (10 sec: 45875.1, 60 sec: 49698.2, 300 sec: 46986.0). Total num frames: 1001979904. Throughput: 0: 47191.2. Samples: 1003299200. Policy #0 lag: (min: 1.0, avg: 24.8, max: 71.0) [2024-03-21 04:42:35,522][03784] Avg episode reward: [(0, '1.135')] [2024-03-21 04:42:36,771][04017] Updated weights for policy 0, policy_version 30581 (0.0016) [2024-03-21 04:42:40,521][03784] Fps is (10 sec: 45875.6, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 1002274816. Throughput: 0: 46933.4. Samples: 1003577400. Policy #0 lag: (min: 0.0, avg: 30.8, max: 66.0) [2024-03-21 04:42:40,523][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 04:42:44,252][04017] Updated weights for policy 0, policy_version 30591 (0.0010) [2024-03-21 04:42:45,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 1002471424. Throughput: 0: 46706.8. Samples: 1003717900. Policy #0 lag: (min: 0.0, avg: 30.8, max: 66.0) [2024-03-21 04:42:45,522][03784] Avg episode reward: [(0, '0.994')] [2024-03-21 04:42:50,521][03784] Fps is (10 sec: 32767.5, 60 sec: 44236.7, 300 sec: 46763.8). Total num frames: 1002602496. Throughput: 0: 46891.0. Samples: 1004018500. 
Policy #0 lag: (min: 0.0, avg: 30.8, max: 66.0) [2024-03-21 04:42:50,523][03784] Avg episode reward: [(0, '0.844')] [2024-03-21 04:42:51,763][04017] Updated weights for policy 0, policy_version 30601 (0.0010) [2024-03-21 04:42:52,791][03995] Signal inference workers to stop experience collection... (20150 times) [2024-03-21 04:42:52,792][03995] Signal inference workers to resume experience collection... (20150 times) [2024-03-21 04:42:52,862][04017] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-03-21 04:42:52,862][04017] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-03-21 04:42:55,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45329.1, 300 sec: 47541.4). Total num frames: 1002930176. Throughput: 0: 46642.2. Samples: 1004291400. Policy #0 lag: (min: 0.0, avg: 30.8, max: 66.0) [2024-03-21 04:42:55,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 04:42:58,197][04017] Updated weights for policy 0, policy_version 30611 (0.0023) [2024-03-21 04:43:00,521][03784] Fps is (10 sec: 49152.7, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 1003094016. Throughput: 0: 46195.5. Samples: 1004429700. Policy #0 lag: (min: 0.0, avg: 39.1, max: 79.0) [2024-03-21 04:43:00,522][03784] Avg episode reward: [(0, '0.564')] [2024-03-21 04:43:05,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 47097.1). Total num frames: 1003356160. Throughput: 0: 46437.7. Samples: 1004726600. Policy #0 lag: (min: 0.0, avg: 39.1, max: 79.0) [2024-03-21 04:43:05,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 04:43:06,049][04017] Updated weights for policy 0, policy_version 30621 (0.0009) [2024-03-21 04:43:10,274][04017] Updated weights for policy 0, policy_version 30631 (0.0015) [2024-03-21 04:43:10,521][03784] Fps is (10 sec: 62259.3, 60 sec: 46967.4, 300 sec: 47874.6). Total num frames: 1003716608. Throughput: 0: 46937.7. Samples: 1004999800. 
Policy #0 lag: (min: 0.0, avg: 39.1, max: 79.0) [2024-03-21 04:43:10,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 04:43:14,541][04017] Updated weights for policy 0, policy_version 30641 (0.0011) [2024-03-21 04:43:15,521][03784] Fps is (10 sec: 72089.5, 60 sec: 51882.7, 300 sec: 47763.5). Total num frames: 1004077056. Throughput: 0: 46726.8. Samples: 1005130000. Policy #0 lag: (min: 0.0, avg: 39.1, max: 79.0) [2024-03-21 04:43:15,522][03784] Avg episode reward: [(0, '0.921')] [2024-03-21 04:43:20,498][04017] Updated weights for policy 0, policy_version 30651 (0.0010) [2024-03-21 04:43:20,521][03784] Fps is (10 sec: 65536.1, 60 sec: 53521.0, 300 sec: 48096.8). Total num frames: 1004371968. Throughput: 0: 46408.8. Samples: 1005387600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 04:43:20,522][03784] Avg episode reward: [(0, '1.465')] [2024-03-21 04:43:25,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48605.8, 300 sec: 47208.2). Total num frames: 1004437504. Throughput: 0: 47237.8. Samples: 1005703100. Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 04:43:25,522][03784] Avg episode reward: [(0, '1.342')] [2024-03-21 04:43:30,521][03784] Fps is (10 sec: 19660.6, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 1004568576. Throughput: 0: 47317.7. Samples: 1005847200. Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 04:43:30,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 04:43:35,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44782.9, 300 sec: 46319.5). Total num frames: 1004666880. Throughput: 0: 47366.9. Samples: 1006150000. Policy #0 lag: (min: 0.0, avg: 38.7, max: 77.0) [2024-03-21 04:43:35,522][03784] Avg episode reward: [(0, '1.748')] [2024-03-21 04:43:35,523][03995] Saving new best policy, reward=1.748! [2024-03-21 04:43:36,941][04017] Updated weights for policy 0, policy_version 30661 (0.0015) [2024-03-21 04:43:40,521][03784] Fps is (10 sec: 36045.2, 60 sec: 44236.8, 300 sec: 46430.6). 
Total num frames: 1004929024. Throughput: 0: 47308.9. Samples: 1006420300. Policy #0 lag: (min: 1.0, avg: 60.8, max: 113.0) [2024-03-21 04:43:40,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 04:43:41,852][04017] Updated weights for policy 0, policy_version 30671 (0.0017) [2024-03-21 04:43:45,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42598.5, 300 sec: 45986.3). Total num frames: 1005027328. Throughput: 0: 47115.7. Samples: 1006549900. Policy #0 lag: (min: 1.0, avg: 60.8, max: 113.0) [2024-03-21 04:43:45,522][03784] Avg episode reward: [(0, '1.428')] [2024-03-21 04:43:50,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44236.9, 300 sec: 46208.4). Total num frames: 1005256704. Throughput: 0: 46655.5. Samples: 1006826100. Policy #0 lag: (min: 1.0, avg: 60.8, max: 113.0) [2024-03-21 04:43:50,522][03784] Avg episode reward: [(0, '1.093')] [2024-03-21 04:43:51,895][03995] Signal inference workers to stop experience collection... (20200 times) [2024-03-21 04:43:51,971][04017] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-03-21 04:43:52,239][03995] Signal inference workers to resume experience collection... (20200 times) [2024-03-21 04:43:52,240][04017] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-03-21 04:43:52,241][04017] Updated weights for policy 0, policy_version 30681 (0.0016) [2024-03-21 04:43:55,521][03784] Fps is (10 sec: 62258.3, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 1005649920. Throughput: 0: 46339.9. Samples: 1007085100. Policy #0 lag: (min: 1.0, avg: 60.8, max: 113.0) [2024-03-21 04:43:55,522][03784] Avg episode reward: [(0, '1.459')] [2024-03-21 04:43:55,704][04017] Updated weights for policy 0, policy_version 30691 (0.0024) [2024-03-21 04:44:00,123][04017] Updated weights for policy 0, policy_version 30701 (0.0014) [2024-03-21 04:44:00,521][03784] Fps is (10 sec: 78642.2, 60 sec: 49151.9, 300 sec: 47319.2). Total num frames: 1006043136. Throughput: 0: 46375.4. 
Samples: 1007216900. Policy #0 lag: (min: 0.0, avg: 33.0, max: 70.0) [2024-03-21 04:44:00,522][03784] Avg episode reward: [(0, '0.624')] [2024-03-21 04:44:00,784][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030703_1006075904.pth... [2024-03-21 04:44:00,899][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030354_994639872.pth [2024-03-21 04:44:04,889][04017] Updated weights for policy 0, policy_version 30711 (0.0015) [2024-03-21 04:44:05,521][03784] Fps is (10 sec: 75366.0, 60 sec: 50790.2, 300 sec: 48318.9). Total num frames: 1006403584. Throughput: 0: 46591.0. Samples: 1007484200. Policy #0 lag: (min: 0.0, avg: 33.0, max: 70.0) [2024-03-21 04:44:05,522][03784] Avg episode reward: [(0, '0.751')] [2024-03-21 04:44:10,521][03784] Fps is (10 sec: 58983.2, 60 sec: 48605.9, 300 sec: 48096.8). Total num frames: 1006632960. Throughput: 0: 45577.8. Samples: 1007754100. Policy #0 lag: (min: 0.0, avg: 33.0, max: 70.0) [2024-03-21 04:44:10,522][03784] Avg episode reward: [(0, '1.674')] [2024-03-21 04:44:11,251][04017] Updated weights for policy 0, policy_version 30721 (0.0015) [2024-03-21 04:44:15,521][03784] Fps is (10 sec: 52429.6, 60 sec: 47513.6, 300 sec: 48874.3). Total num frames: 1006927872. Throughput: 0: 45626.8. Samples: 1007900400. Policy #0 lag: (min: 0.0, avg: 33.0, max: 70.0) [2024-03-21 04:44:15,522][03784] Avg episode reward: [(0, '1.000')] [2024-03-21 04:44:18,297][04017] Updated weights for policy 0, policy_version 30731 (0.0014) [2024-03-21 04:44:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 47985.7). Total num frames: 1007026176. Throughput: 0: 45168.9. Samples: 1008182600. Policy #0 lag: (min: 0.0, avg: 48.7, max: 98.0) [2024-03-21 04:44:20,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 04:44:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 1007288320. Throughput: 0: 45515.5. Samples: 1008468500. 
Policy #0 lag: (min: 0.0, avg: 48.7, max: 98.0) [2024-03-21 04:44:25,522][03784] Avg episode reward: [(0, '0.797')] [2024-03-21 04:44:26,893][04017] Updated weights for policy 0, policy_version 30741 (0.0016) [2024-03-21 04:44:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 1007353856. Throughput: 0: 46086.6. Samples: 1008623800. Policy #0 lag: (min: 0.0, avg: 48.7, max: 98.0) [2024-03-21 04:44:30,522][03784] Avg episode reward: [(0, '0.797')] [2024-03-21 04:44:35,521][03784] Fps is (10 sec: 19660.7, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 1007484928. Throughput: 0: 46644.4. Samples: 1008925100. Policy #0 lag: (min: 0.0, avg: 48.7, max: 98.0) [2024-03-21 04:44:35,522][03784] Avg episode reward: [(0, '0.920')] [2024-03-21 04:44:36,667][03995] Signal inference workers to stop experience collection... (20250 times) [2024-03-21 04:44:36,667][03995] Signal inference workers to resume experience collection... (20250 times) [2024-03-21 04:44:36,747][04017] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-03-21 04:44:36,748][04017] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-03-21 04:44:37,668][04017] Updated weights for policy 0, policy_version 30751 (0.0011) [2024-03-21 04:44:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.4, 300 sec: 46319.5). Total num frames: 1007747072. Throughput: 0: 47380.1. Samples: 1009217200. Policy #0 lag: (min: 3.0, avg: 28.6, max: 73.0) [2024-03-21 04:44:40,522][03784] Avg episode reward: [(0, '1.247')] [2024-03-21 04:44:45,506][04017] Updated weights for policy 0, policy_version 30761 (0.0010) [2024-03-21 04:44:45,521][03784] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 46430.6). Total num frames: 1007976448. Throughput: 0: 47444.5. Samples: 1009351900. 
Policy #0 lag: (min: 3.0, avg: 28.6, max: 73.0) [2024-03-21 04:44:45,522][03784] Avg episode reward: [(0, '1.078')] [2024-03-21 04:44:50,518][04017] Updated weights for policy 0, policy_version 30771 (0.0010) [2024-03-21 04:44:50,521][03784] Fps is (10 sec: 55705.7, 60 sec: 50790.4, 300 sec: 46763.8). Total num frames: 1008304128. Throughput: 0: 47737.9. Samples: 1009632400. Policy #0 lag: (min: 3.0, avg: 28.6, max: 73.0) [2024-03-21 04:44:50,522][03784] Avg episode reward: [(0, '0.446')] [2024-03-21 04:44:55,521][03784] Fps is (10 sec: 55705.4, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 1008533504. Throughput: 0: 48364.3. Samples: 1009930500. Policy #0 lag: (min: 3.0, avg: 28.6, max: 73.0) [2024-03-21 04:44:55,522][03784] Avg episode reward: [(0, '0.446')] [2024-03-21 04:44:56,838][04017] Updated weights for policy 0, policy_version 30781 (0.0020) [2024-03-21 04:45:00,521][03784] Fps is (10 sec: 58982.2, 60 sec: 47513.7, 300 sec: 47985.7). Total num frames: 1008893952. Throughput: 0: 47977.7. Samples: 1010059400. Policy #0 lag: (min: 1.0, avg: 46.4, max: 101.0) [2024-03-21 04:45:00,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 04:45:01,694][04017] Updated weights for policy 0, policy_version 30791 (0.0010) [2024-03-21 04:45:05,521][03784] Fps is (10 sec: 55706.6, 60 sec: 44783.1, 300 sec: 47541.4). Total num frames: 1009090560. Throughput: 0: 47955.6. Samples: 1010340600. Policy #0 lag: (min: 1.0, avg: 46.4, max: 101.0) [2024-03-21 04:45:05,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 04:45:10,315][04017] Updated weights for policy 0, policy_version 30801 (0.0015) [2024-03-21 04:45:10,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 47319.2). Total num frames: 1009287168. Throughput: 0: 47680.0. Samples: 1010614100. 
Policy #0 lag: (min: 1.0, avg: 46.4, max: 101.0) [2024-03-21 04:45:10,523][03784] Avg episode reward: [(0, '1.358')] [2024-03-21 04:45:15,521][03784] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 47097.1). Total num frames: 1009483776. Throughput: 0: 46982.2. Samples: 1010738000. Policy #0 lag: (min: 1.0, avg: 46.4, max: 101.0) [2024-03-21 04:45:15,522][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 04:45:17,250][04017] Updated weights for policy 0, policy_version 30811 (0.0016) [2024-03-21 04:45:20,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.4, 300 sec: 47652.4). Total num frames: 1009844224. Throughput: 0: 46966.6. Samples: 1011038600. Policy #0 lag: (min: 1.0, avg: 63.9, max: 115.0) [2024-03-21 04:45:20,522][03784] Avg episode reward: [(0, '0.499')] [2024-03-21 04:45:21,445][04017] Updated weights for policy 0, policy_version 30821 (0.0010) [2024-03-21 04:45:25,521][03784] Fps is (10 sec: 52428.6, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 1010008064. Throughput: 0: 46624.4. Samples: 1011315300. Policy #0 lag: (min: 1.0, avg: 63.9, max: 115.0) [2024-03-21 04:45:25,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 04:45:30,521][03784] Fps is (10 sec: 32768.3, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 1010171904. Throughput: 0: 46809.0. Samples: 1011458300. Policy #0 lag: (min: 1.0, avg: 63.9, max: 115.0) [2024-03-21 04:45:30,522][03784] Avg episode reward: [(0, '1.609')] [2024-03-21 04:45:31,583][03995] Signal inference workers to stop experience collection... (20300 times) [2024-03-21 04:45:31,584][03995] Signal inference workers to resume experience collection... 
(20300 times) [2024-03-21 04:45:31,792][04017] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-03-21 04:45:31,792][04017] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-03-21 04:45:31,937][04017] Updated weights for policy 0, policy_version 30831 (0.0015) [2024-03-21 04:45:35,521][03784] Fps is (10 sec: 45875.4, 60 sec: 49698.2, 300 sec: 46652.8). Total num frames: 1010466816. Throughput: 0: 46413.4. Samples: 1011721000. Policy #0 lag: (min: 1.0, avg: 63.9, max: 115.0) [2024-03-21 04:45:35,522][03784] Avg episode reward: [(0, '1.590')] [2024-03-21 04:45:39,512][04017] Updated weights for policy 0, policy_version 30841 (0.0017) [2024-03-21 04:45:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 1010630656. Throughput: 0: 46644.6. Samples: 1012029500. Policy #0 lag: (min: 0.0, avg: 52.6, max: 109.0) [2024-03-21 04:45:40,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 04:45:44,365][04017] Updated weights for policy 0, policy_version 30851 (0.0016) [2024-03-21 04:45:45,521][03784] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 1010958336. Throughput: 0: 46564.3. Samples: 1012154800. Policy #0 lag: (min: 0.0, avg: 52.6, max: 109.0) [2024-03-21 04:45:45,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 04:45:49,494][04017] Updated weights for policy 0, policy_version 30861 (0.0011) [2024-03-21 04:45:50,521][03784] Fps is (10 sec: 68812.8, 60 sec: 50244.3, 300 sec: 47541.4). Total num frames: 1011318784. Throughput: 0: 46753.3. Samples: 1012444500. Policy #0 lag: (min: 0.0, avg: 52.6, max: 109.0) [2024-03-21 04:45:50,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 04:45:55,521][03784] Fps is (10 sec: 49152.8, 60 sec: 48606.0, 300 sec: 46763.8). Total num frames: 1011449856. Throughput: 0: 47242.3. Samples: 1012740000. 
Policy #0 lag: (min: 0.0, avg: 52.6, max: 109.0) [2024-03-21 04:45:55,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 04:45:58,175][04017] Updated weights for policy 0, policy_version 30871 (0.0010) [2024-03-21 04:46:00,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44782.9, 300 sec: 46652.7). Total num frames: 1011580928. Throughput: 0: 47842.1. Samples: 1012890900. Policy #0 lag: (min: 0.0, avg: 52.6, max: 109.0) [2024-03-21 04:46:00,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 04:46:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030871_1011580928.pth... [2024-03-21 04:46:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030530_1000407040.pth [2024-03-21 04:46:05,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 46874.9). Total num frames: 1011843072. Throughput: 0: 48033.3. Samples: 1013200100. Policy #0 lag: (min: 0.0, avg: 39.4, max: 81.0) [2024-03-21 04:46:05,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 04:46:07,280][04017] Updated weights for policy 0, policy_version 30881 (0.0021) [2024-03-21 04:46:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 47097.1). Total num frames: 1012006912. Throughput: 0: 48260.0. Samples: 1013487000. Policy #0 lag: (min: 0.0, avg: 39.4, max: 81.0) [2024-03-21 04:46:10,522][03784] Avg episode reward: [(0, '1.586')] [2024-03-21 04:46:14,809][04017] Updated weights for policy 0, policy_version 30891 (0.0011) [2024-03-21 04:46:15,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 1012301824. Throughput: 0: 48004.4. Samples: 1013618500. Policy #0 lag: (min: 0.0, avg: 39.4, max: 81.0) [2024-03-21 04:46:15,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 04:46:20,521][03784] Fps is (10 sec: 52428.6, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 1012531200. Throughput: 0: 48657.7. Samples: 1013910600. 
Policy #0 lag: (min: 0.0, avg: 39.4, max: 81.0) [2024-03-21 04:46:20,522][03784] Avg episode reward: [(0, '1.298')] [2024-03-21 04:46:22,148][04017] Updated weights for policy 0, policy_version 30901 (0.0016) [2024-03-21 04:46:22,981][03995] Signal inference workers to stop experience collection... (20350 times) [2024-03-21 04:46:23,035][04017] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-03-21 04:46:23,049][03995] Signal inference workers to resume experience collection... (20350 times) [2024-03-21 04:46:23,079][04017] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-03-21 04:46:25,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 1012826112. Throughput: 0: 47715.5. Samples: 1014176700. Policy #0 lag: (min: 0.0, avg: 27.6, max: 60.0) [2024-03-21 04:46:25,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 04:46:27,396][04017] Updated weights for policy 0, policy_version 30911 (0.0012) [2024-03-21 04:46:30,521][03784] Fps is (10 sec: 58982.4, 60 sec: 49151.9, 300 sec: 47874.6). Total num frames: 1013121024. Throughput: 0: 48142.3. Samples: 1014321200. Policy #0 lag: (min: 0.0, avg: 27.6, max: 60.0) [2024-03-21 04:46:30,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 04:46:31,628][04017] Updated weights for policy 0, policy_version 30921 (0.0018) [2024-03-21 04:46:35,119][04017] Updated weights for policy 0, policy_version 30931 (0.0031) [2024-03-21 04:46:35,521][03784] Fps is (10 sec: 75366.5, 60 sec: 51882.6, 300 sec: 47985.7). Total num frames: 1013579776. Throughput: 0: 47220.0. Samples: 1014569400. Policy #0 lag: (min: 0.0, avg: 27.6, max: 60.0) [2024-03-21 04:46:35,522][03784] Avg episode reward: [(0, '0.905')] [2024-03-21 04:46:40,521][03784] Fps is (10 sec: 55705.7, 60 sec: 50790.3, 300 sec: 47541.4). Total num frames: 1013678080. Throughput: 0: 47113.2. Samples: 1014860100. 
Policy #0 lag: (min: 0.0, avg: 27.6, max: 60.0) [2024-03-21 04:46:40,522][03784] Avg episode reward: [(0, '0.905')] [2024-03-21 04:46:45,521][03784] Fps is (10 sec: 26214.1, 60 sec: 48059.7, 300 sec: 47097.0). Total num frames: 1013841920. Throughput: 0: 47255.5. Samples: 1015017400. Policy #0 lag: (min: 0.0, avg: 37.6, max: 77.0) [2024-03-21 04:46:45,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 04:46:45,818][04017] Updated weights for policy 0, policy_version 30941 (0.0015) [2024-03-21 04:46:50,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 1014005760. Throughput: 0: 46757.8. Samples: 1015304200. Policy #0 lag: (min: 0.0, avg: 37.6, max: 77.0) [2024-03-21 04:46:50,522][03784] Avg episode reward: [(0, '0.675')] [2024-03-21 04:46:54,691][04017] Updated weights for policy 0, policy_version 30951 (0.0011) [2024-03-21 04:46:55,521][03784] Fps is (10 sec: 42599.1, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 1014267904. Throughput: 0: 46482.3. Samples: 1015578700. Policy #0 lag: (min: 0.0, avg: 37.6, max: 77.0) [2024-03-21 04:46:55,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 04:47:00,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 1014398976. Throughput: 0: 47064.4. Samples: 1015736400. Policy #0 lag: (min: 0.0, avg: 37.6, max: 77.0) [2024-03-21 04:47:00,522][03784] Avg episode reward: [(0, '1.397')] [2024-03-21 04:47:05,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44236.9, 300 sec: 46097.4). Total num frames: 1014497280. Throughput: 0: 46746.8. Samples: 1016014200. Policy #0 lag: (min: 0.0, avg: 28.7, max: 69.0) [2024-03-21 04:47:05,522][03784] Avg episode reward: [(0, '1.443')] [2024-03-21 04:47:05,921][04017] Updated weights for policy 0, policy_version 30961 (0.0010) [2024-03-21 04:47:10,521][03784] Fps is (10 sec: 42597.8, 60 sec: 46967.4, 300 sec: 46985.9). Total num frames: 1014824960. Throughput: 0: 47455.4. Samples: 1016312200. 
Policy #0 lag: (min: 0.0, avg: 28.7, max: 69.0) [2024-03-21 04:47:10,522][03784] Avg episode reward: [(0, '1.443')] [2024-03-21 04:47:10,712][04017] Updated weights for policy 0, policy_version 30971 (0.0016) [2024-03-21 04:47:13,432][03995] Signal inference workers to stop experience collection... (20400 times) [2024-03-21 04:47:13,433][03995] Signal inference workers to resume experience collection... (20400 times) [2024-03-21 04:47:13,498][04017] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-03-21 04:47:13,498][04017] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-03-21 04:47:15,521][03784] Fps is (10 sec: 62259.0, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 1015119872. Throughput: 0: 47320.1. Samples: 1016450600. Policy #0 lag: (min: 0.0, avg: 28.7, max: 69.0) [2024-03-21 04:47:15,522][03784] Avg episode reward: [(0, '0.811')] [2024-03-21 04:47:15,923][04017] Updated weights for policy 0, policy_version 30981 (0.0010) [2024-03-21 04:47:20,482][04017] Updated weights for policy 0, policy_version 30991 (0.0012) [2024-03-21 04:47:20,521][03784] Fps is (10 sec: 68813.7, 60 sec: 49698.2, 300 sec: 47430.3). Total num frames: 1015513088. Throughput: 0: 47891.1. Samples: 1016724500. Policy #0 lag: (min: 0.0, avg: 28.7, max: 69.0) [2024-03-21 04:47:20,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 04:47:25,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.7, 300 sec: 46986.0). Total num frames: 1015676928. Throughput: 0: 47051.2. Samples: 1016977400. Policy #0 lag: (min: 0.0, avg: 28.7, max: 69.0) [2024-03-21 04:47:25,522][03784] Avg episode reward: [(0, '1.179')] [2024-03-21 04:47:27,611][04017] Updated weights for policy 0, policy_version 31001 (0.0017) [2024-03-21 04:47:30,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 1015939072. Throughput: 0: 46557.7. Samples: 1017112500. 
Policy #0 lag: (min: 0.0, avg: 33.4, max: 95.0) [2024-03-21 04:47:30,522][03784] Avg episode reward: [(0, '1.005')] [2024-03-21 04:47:34,679][04017] Updated weights for policy 0, policy_version 31011 (0.0010) [2024-03-21 04:47:35,521][03784] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 47208.1). Total num frames: 1016201216. Throughput: 0: 46348.9. Samples: 1017389900. Policy #0 lag: (min: 0.0, avg: 33.4, max: 95.0) [2024-03-21 04:47:35,522][03784] Avg episode reward: [(0, '1.408')] [2024-03-21 04:47:40,521][03784] Fps is (10 sec: 45876.0, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 1016397824. Throughput: 0: 46655.5. Samples: 1017678200. Policy #0 lag: (min: 0.0, avg: 33.4, max: 95.0) [2024-03-21 04:47:40,522][03784] Avg episode reward: [(0, '0.943')] [2024-03-21 04:47:42,027][04017] Updated weights for policy 0, policy_version 31021 (0.0015) [2024-03-21 04:47:45,521][03784] Fps is (10 sec: 42598.0, 60 sec: 46421.3, 300 sec: 47541.4). Total num frames: 1016627200. Throughput: 0: 46248.8. Samples: 1017817600. Policy #0 lag: (min: 0.0, avg: 33.4, max: 95.0) [2024-03-21 04:47:45,522][03784] Avg episode reward: [(0, '0.923')] [2024-03-21 04:47:49,732][04017] Updated weights for policy 0, policy_version 31031 (0.0011) [2024-03-21 04:47:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 1016889344. Throughput: 0: 46522.1. Samples: 1018107700. Policy #0 lag: (min: 0.0, avg: 52.4, max: 117.0) [2024-03-21 04:47:50,522][03784] Avg episode reward: [(0, '0.964')] [2024-03-21 04:47:55,521][03784] Fps is (10 sec: 49152.2, 60 sec: 47513.5, 300 sec: 47541.4). Total num frames: 1017118720. Throughput: 0: 46084.5. Samples: 1018386000. 
Policy #0 lag: (min: 0.0, avg: 52.4, max: 117.0) [2024-03-21 04:47:55,522][03784] Avg episode reward: [(0, '1.289')] [2024-03-21 04:47:55,984][04017] Updated weights for policy 0, policy_version 31041 (0.0011) [2024-03-21 04:48:00,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 1017217024. Throughput: 0: 46346.6. Samples: 1018536200. Policy #0 lag: (min: 0.0, avg: 52.4, max: 117.0) [2024-03-21 04:48:00,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 04:48:00,531][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031043_1017217024.pth... [2024-03-21 04:48:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030703_1006075904.pth [2024-03-21 04:48:05,327][04017] Updated weights for policy 0, policy_version 31051 (0.0015) [2024-03-21 04:48:05,372][03995] Signal inference workers to stop experience collection... (20450 times) [2024-03-21 04:48:05,372][03995] Signal inference workers to resume experience collection... (20450 times) [2024-03-21 04:48:05,464][04017] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-03-21 04:48:05,464][04017] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-03-21 04:48:05,521][03784] Fps is (10 sec: 36044.7, 60 sec: 49698.0, 300 sec: 46652.7). Total num frames: 1017479168. Throughput: 0: 46584.4. Samples: 1018820800. Policy #0 lag: (min: 0.0, avg: 52.4, max: 117.0) [2024-03-21 04:48:05,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 04:48:10,521][03784] Fps is (10 sec: 55705.7, 60 sec: 49152.2, 300 sec: 46430.6). Total num frames: 1017774080. Throughput: 0: 47551.1. Samples: 1019117200. 
Policy #0 lag: (min: 1.0, avg: 38.1, max: 106.0) [2024-03-21 04:48:10,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 04:48:11,240][04017] Updated weights for policy 0, policy_version 31061 (0.0015) [2024-03-21 04:48:15,521][03784] Fps is (10 sec: 52429.5, 60 sec: 48059.7, 300 sec: 46208.4). Total num frames: 1018003456. Throughput: 0: 47522.4. Samples: 1019251000. Policy #0 lag: (min: 1.0, avg: 38.1, max: 106.0) [2024-03-21 04:48:15,522][03784] Avg episode reward: [(0, '1.322')] [2024-03-21 04:48:19,252][04017] Updated weights for policy 0, policy_version 31071 (0.0010) [2024-03-21 04:48:20,521][03784] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 46763.8). Total num frames: 1018232832. Throughput: 0: 47355.5. Samples: 1019520900. Policy #0 lag: (min: 1.0, avg: 38.1, max: 106.0) [2024-03-21 04:48:20,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 04:48:25,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 1018396672. Throughput: 0: 46880.0. Samples: 1019787800. Policy #0 lag: (min: 1.0, avg: 38.1, max: 106.0) [2024-03-21 04:48:25,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 04:48:26,969][04017] Updated weights for policy 0, policy_version 31081 (0.0012) [2024-03-21 04:48:30,521][03784] Fps is (10 sec: 49152.7, 60 sec: 46421.5, 300 sec: 47652.5). Total num frames: 1018724352. Throughput: 0: 46646.9. Samples: 1019916700. Policy #0 lag: (min: 0.0, avg: 39.8, max: 86.0) [2024-03-21 04:48:30,522][03784] Avg episode reward: [(0, '1.093')] [2024-03-21 04:48:32,908][04017] Updated weights for policy 0, policy_version 31091 (0.0013) [2024-03-21 04:48:35,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1018953728. Throughput: 0: 46508.9. Samples: 1020200600. 
Policy #0 lag: (min: 0.0, avg: 39.8, max: 86.0) [2024-03-21 04:48:35,522][03784] Avg episode reward: [(0, '1.013')] [2024-03-21 04:48:39,943][04017] Updated weights for policy 0, policy_version 31101 (0.0011) [2024-03-21 04:48:40,521][03784] Fps is (10 sec: 42597.3, 60 sec: 45875.0, 300 sec: 47874.6). Total num frames: 1019150336. Throughput: 0: 46439.9. Samples: 1020475800. Policy #0 lag: (min: 0.0, avg: 39.8, max: 86.0) [2024-03-21 04:48:40,522][03784] Avg episode reward: [(0, '1.363')] [2024-03-21 04:48:45,521][03784] Fps is (10 sec: 42597.7, 60 sec: 45875.2, 300 sec: 47874.6). Total num frames: 1019379712. Throughput: 0: 46153.2. Samples: 1020613100. Policy #0 lag: (min: 0.0, avg: 39.8, max: 86.0) [2024-03-21 04:48:45,522][03784] Avg episode reward: [(0, '0.975')] [2024-03-21 04:48:47,486][04017] Updated weights for policy 0, policy_version 31111 (0.0016) [2024-03-21 04:48:50,521][03784] Fps is (10 sec: 52429.5, 60 sec: 46421.3, 300 sec: 47541.4). Total num frames: 1019674624. Throughput: 0: 46273.4. Samples: 1020903100. Policy #0 lag: (min: 1.0, avg: 49.3, max: 107.0) [2024-03-21 04:48:50,522][03784] Avg episode reward: [(0, '0.983')] [2024-03-21 04:48:52,067][04017] Updated weights for policy 0, policy_version 31121 (0.0010) [2024-03-21 04:48:55,521][03784] Fps is (10 sec: 42599.0, 60 sec: 44783.0, 300 sec: 46652.8). Total num frames: 1019805696. Throughput: 0: 46193.3. Samples: 1021195900. Policy #0 lag: (min: 1.0, avg: 49.3, max: 107.0) [2024-03-21 04:48:55,522][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 04:49:00,244][04017] Updated weights for policy 0, policy_version 31131 (0.0010) [2024-03-21 04:49:00,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 1020100608. Throughput: 0: 46388.8. Samples: 1021338500. 
Policy #0 lag: (min: 1.0, avg: 49.3, max: 107.0) [2024-03-21 04:49:00,522][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 04:49:02,925][03995] Signal inference workers to stop experience collection... (20500 times) [2024-03-21 04:49:02,990][03995] Signal inference workers to resume experience collection... (20500 times) [2024-03-21 04:49:03,010][04017] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-03-21 04:49:03,062][04017] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-03-21 04:49:05,521][03784] Fps is (10 sec: 58982.8, 60 sec: 48606.0, 300 sec: 46652.8). Total num frames: 1020395520. Throughput: 0: 46671.3. Samples: 1021621100. Policy #0 lag: (min: 1.0, avg: 49.3, max: 107.0) [2024-03-21 04:49:05,522][03784] Avg episode reward: [(0, '0.921')] [2024-03-21 04:49:06,406][04017] Updated weights for policy 0, policy_version 31141 (0.0014) [2024-03-21 04:49:10,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49151.9, 300 sec: 46763.8). Total num frames: 1020723200. Throughput: 0: 46757.7. Samples: 1021891900. Policy #0 lag: (min: 2.0, avg: 36.6, max: 73.0) [2024-03-21 04:49:10,522][03784] Avg episode reward: [(0, '1.241')] [2024-03-21 04:49:12,792][04017] Updated weights for policy 0, policy_version 31151 (0.0024) [2024-03-21 04:49:15,521][03784] Fps is (10 sec: 62258.8, 60 sec: 50244.2, 300 sec: 47430.3). Total num frames: 1021018112. Throughput: 0: 47366.6. Samples: 1022048200. Policy #0 lag: (min: 2.0, avg: 36.6, max: 73.0) [2024-03-21 04:49:15,522][03784] Avg episode reward: [(0, '1.349')] [2024-03-21 04:49:18,534][04017] Updated weights for policy 0, policy_version 31161 (0.0018) [2024-03-21 04:49:20,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 1021116416. Throughput: 0: 47408.8. Samples: 1022334000. 
Policy #0 lag: (min: 2.0, avg: 36.6, max: 73.0) [2024-03-21 04:49:20,522][03784] Avg episode reward: [(0, '1.614')] [2024-03-21 04:49:25,521][03784] Fps is (10 sec: 26214.1, 60 sec: 48059.6, 300 sec: 47208.1). Total num frames: 1021280256. Throughput: 0: 47991.2. Samples: 1022635400. Policy #0 lag: (min: 2.0, avg: 36.6, max: 73.0) [2024-03-21 04:49:25,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-21 04:49:30,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44236.8, 300 sec: 47097.1). Total num frames: 1021378560. Throughput: 0: 48204.6. Samples: 1022782300. Policy #0 lag: (min: 2.0, avg: 36.6, max: 73.0) [2024-03-21 04:49:30,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 04:49:30,557][04017] Updated weights for policy 0, policy_version 31171 (0.0016) [2024-03-21 04:49:35,521][03784] Fps is (10 sec: 32768.3, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 1021607936. Throughput: 0: 48548.9. Samples: 1023087800. Policy #0 lag: (min: 0.0, avg: 49.1, max: 105.0) [2024-03-21 04:49:35,522][03784] Avg episode reward: [(0, '0.818')] [2024-03-21 04:49:36,823][04017] Updated weights for policy 0, policy_version 31181 (0.0023) [2024-03-21 04:49:40,521][03784] Fps is (10 sec: 62258.9, 60 sec: 47513.7, 300 sec: 47541.4). Total num frames: 1022001152. Throughput: 0: 47420.0. Samples: 1023329800. Policy #0 lag: (min: 0.0, avg: 49.1, max: 105.0) [2024-03-21 04:49:40,522][03784] Avg episode reward: [(0, '1.024')] [2024-03-21 04:49:42,477][04017] Updated weights for policy 0, policy_version 31191 (0.0012) [2024-03-21 04:49:45,521][03784] Fps is (10 sec: 52428.5, 60 sec: 45875.3, 300 sec: 46874.9). Total num frames: 1022132224. Throughput: 0: 47335.5. Samples: 1023468600. 
Policy #0 lag: (min: 0.0, avg: 49.1, max: 105.0) [2024-03-21 04:49:45,522][03784] Avg episode reward: [(0, '0.602')] [2024-03-21 04:49:49,089][04017] Updated weights for policy 0, policy_version 31201 (0.0010) [2024-03-21 04:49:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 1022394368. Throughput: 0: 47275.4. Samples: 1023748500. Policy #0 lag: (min: 0.0, avg: 49.1, max: 105.0) [2024-03-21 04:49:50,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 04:49:52,045][03995] Signal inference workers to stop experience collection... (20550 times) [2024-03-21 04:49:52,117][03995] Signal inference workers to resume experience collection... (20550 times) [2024-03-21 04:49:52,125][04017] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-03-21 04:49:52,191][04017] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-03-21 04:49:55,051][04017] Updated weights for policy 0, policy_version 31211 (0.0010) [2024-03-21 04:49:55,521][03784] Fps is (10 sec: 62259.6, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 1022754816. Throughput: 0: 47555.6. Samples: 1024031900. Policy #0 lag: (min: 1.0, avg: 29.5, max: 59.0) [2024-03-21 04:49:55,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 04:49:59,627][04017] Updated weights for policy 0, policy_version 31221 (0.0010) [2024-03-21 04:50:00,521][03784] Fps is (10 sec: 65535.8, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 1023049728. Throughput: 0: 47099.9. Samples: 1024167700. Policy #0 lag: (min: 1.0, avg: 29.5, max: 59.0) [2024-03-21 04:50:00,522][03784] Avg episode reward: [(0, '1.541')] [2024-03-21 04:50:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031221_1023049728.pth... 
[2024-03-21 04:50:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000030871_1011580928.pth [2024-03-21 04:50:05,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48059.6, 300 sec: 47430.3). Total num frames: 1023279104. Throughput: 0: 47164.4. Samples: 1024456400. Policy #0 lag: (min: 1.0, avg: 29.5, max: 59.0) [2024-03-21 04:50:05,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 04:50:06,502][04017] Updated weights for policy 0, policy_version 31231 (0.0010) [2024-03-21 04:50:10,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44236.8, 300 sec: 47097.0). Total num frames: 1023377408. Throughput: 0: 46873.4. Samples: 1024744700. Policy #0 lag: (min: 1.0, avg: 29.5, max: 59.0) [2024-03-21 04:50:10,522][03784] Avg episode reward: [(0, '1.513')] [2024-03-21 04:50:15,521][03784] Fps is (10 sec: 22937.7, 60 sec: 41506.1, 300 sec: 46319.5). Total num frames: 1023508480. Throughput: 0: 46706.6. Samples: 1024884100. Policy #0 lag: (min: 2.0, avg: 34.6, max: 73.0) [2024-03-21 04:50:15,522][03784] Avg episode reward: [(0, '1.669')] [2024-03-21 04:50:18,582][04017] Updated weights for policy 0, policy_version 31241 (0.0011) [2024-03-21 04:50:20,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1023868928. Throughput: 0: 46482.3. Samples: 1025179500. Policy #0 lag: (min: 2.0, avg: 34.6, max: 73.0) [2024-03-21 04:50:20,522][03784] Avg episode reward: [(0, '1.619')] [2024-03-21 04:50:23,311][04017] Updated weights for policy 0, policy_version 31251 (0.0010) [2024-03-21 04:50:25,521][03784] Fps is (10 sec: 72090.2, 60 sec: 49152.1, 300 sec: 47652.4). Total num frames: 1024229376. Throughput: 0: 47322.3. Samples: 1025459300. 
Policy #0 lag: (min: 2.0, avg: 34.6, max: 73.0) [2024-03-21 04:50:25,522][03784] Avg episode reward: [(0, '1.619')] [2024-03-21 04:50:28,334][04017] Updated weights for policy 0, policy_version 31261 (0.0015) [2024-03-21 04:50:30,521][03784] Fps is (10 sec: 58982.4, 60 sec: 51336.5, 300 sec: 47430.3). Total num frames: 1024458752. Throughput: 0: 47537.9. Samples: 1025607800. Policy #0 lag: (min: 2.0, avg: 34.6, max: 73.0) [2024-03-21 04:50:30,522][03784] Avg episode reward: [(0, '1.048')] [2024-03-21 04:50:35,521][03784] Fps is (10 sec: 32768.1, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 1024557056. Throughput: 0: 48024.5. Samples: 1025909600. Policy #0 lag: (min: 1.0, avg: 37.8, max: 75.0) [2024-03-21 04:50:35,522][03784] Avg episode reward: [(0, '1.048')] [2024-03-21 04:50:38,696][04017] Updated weights for policy 0, policy_version 31271 (0.0015) [2024-03-21 04:50:39,344][03995] Signal inference workers to stop experience collection... (20600 times) [2024-03-21 04:50:39,345][03995] Signal inference workers to resume experience collection... (20600 times) [2024-03-21 04:50:39,417][04017] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-03-21 04:50:39,417][04017] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-03-21 04:50:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 1024819200. Throughput: 0: 47800.0. Samples: 1026182900. Policy #0 lag: (min: 1.0, avg: 37.8, max: 75.0) [2024-03-21 04:50:40,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 04:50:43,382][04017] Updated weights for policy 0, policy_version 31281 (0.0012) [2024-03-21 04:50:45,521][03784] Fps is (10 sec: 49152.4, 60 sec: 48606.0, 300 sec: 46541.7). Total num frames: 1025048576. Throughput: 0: 47664.6. Samples: 1026312600. 
Policy #0 lag: (min: 1.0, avg: 37.8, max: 75.0) [2024-03-21 04:50:45,522][03784] Avg episode reward: [(0, '0.674')] [2024-03-21 04:50:50,521][03784] Fps is (10 sec: 39321.1, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 1025212416. Throughput: 0: 47551.1. Samples: 1026596200. Policy #0 lag: (min: 1.0, avg: 37.8, max: 75.0) [2024-03-21 04:50:50,522][03784] Avg episode reward: [(0, '1.426')] [2024-03-21 04:50:52,117][04017] Updated weights for policy 0, policy_version 31291 (0.0011) [2024-03-21 04:50:55,521][03784] Fps is (10 sec: 45874.4, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 1025507328. Throughput: 0: 47524.4. Samples: 1026883300. Policy #0 lag: (min: 1.0, avg: 34.0, max: 71.0) [2024-03-21 04:50:55,522][03784] Avg episode reward: [(0, '1.268')] [2024-03-21 04:51:00,344][04017] Updated weights for policy 0, policy_version 31301 (0.0010) [2024-03-21 04:51:00,521][03784] Fps is (10 sec: 45875.9, 60 sec: 43690.7, 300 sec: 46874.9). Total num frames: 1025671168. Throughput: 0: 47586.8. Samples: 1027025500. Policy #0 lag: (min: 1.0, avg: 34.0, max: 71.0) [2024-03-21 04:51:00,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 04:51:05,521][03784] Fps is (10 sec: 45875.8, 60 sec: 44783.0, 300 sec: 47319.2). Total num frames: 1025966080. Throughput: 0: 47406.7. Samples: 1027312800. Policy #0 lag: (min: 1.0, avg: 34.0, max: 71.0) [2024-03-21 04:51:05,522][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 04:51:05,649][04017] Updated weights for policy 0, policy_version 31311 (0.0015) [2024-03-21 04:51:10,020][04017] Updated weights for policy 0, policy_version 31321 (0.0021) [2024-03-21 04:51:10,521][03784] Fps is (10 sec: 65536.3, 60 sec: 49152.1, 300 sec: 47541.4). Total num frames: 1026326528. Throughput: 0: 47262.3. Samples: 1027586100. 
Policy #0 lag: (min: 1.0, avg: 34.0, max: 71.0) [2024-03-21 04:51:10,521][03784] Avg episode reward: [(0, '1.106')] [2024-03-21 04:51:15,521][03784] Fps is (10 sec: 58982.3, 60 sec: 50790.5, 300 sec: 47541.4). Total num frames: 1026555904. Throughput: 0: 47124.5. Samples: 1027728400. Policy #0 lag: (min: 1.0, avg: 34.0, max: 71.0) [2024-03-21 04:51:15,522][03784] Avg episode reward: [(0, '0.996')] [2024-03-21 04:51:16,997][04017] Updated weights for policy 0, policy_version 31331 (0.0010) [2024-03-21 04:51:20,521][03784] Fps is (10 sec: 58981.9, 60 sec: 50790.4, 300 sec: 47763.5). Total num frames: 1026916352. Throughput: 0: 46748.9. Samples: 1028013300. Policy #0 lag: (min: 0.0, avg: 45.9, max: 123.0) [2024-03-21 04:51:20,522][03784] Avg episode reward: [(0, '0.701')] [2024-03-21 04:51:25,264][04017] Updated weights for policy 0, policy_version 31341 (0.0019) [2024-03-21 04:51:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1026981888. Throughput: 0: 47282.3. Samples: 1028310600. Policy #0 lag: (min: 0.0, avg: 45.9, max: 123.0) [2024-03-21 04:51:25,521][03784] Avg episode reward: [(0, '1.611')] [2024-03-21 04:51:30,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45875.1, 300 sec: 46208.4). Total num frames: 1027211264. Throughput: 0: 47090.9. Samples: 1028431700. Policy #0 lag: (min: 0.0, avg: 45.9, max: 123.0) [2024-03-21 04:51:30,522][03784] Avg episode reward: [(0, '1.402')] [2024-03-21 04:51:31,080][03995] Signal inference workers to stop experience collection... (20650 times) [2024-03-21 04:51:31,087][03995] Signal inference workers to resume experience collection... 
(20650 times) [2024-03-21 04:51:31,160][04017] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-03-21 04:51:31,161][04017] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-03-21 04:51:31,395][04017] Updated weights for policy 0, policy_version 31351 (0.0012) [2024-03-21 04:51:35,521][03784] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 46763.8). Total num frames: 1027473408. Throughput: 0: 46589.0. Samples: 1028692700. Policy #0 lag: (min: 0.0, avg: 45.9, max: 123.0) [2024-03-21 04:51:35,522][03784] Avg episode reward: [(0, '0.963')] [2024-03-21 04:51:38,278][04017] Updated weights for policy 0, policy_version 31361 (0.0009) [2024-03-21 04:51:40,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 1027670016. Throughput: 0: 46626.7. Samples: 1028981500. Policy #0 lag: (min: 0.0, avg: 37.6, max: 90.0) [2024-03-21 04:51:40,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 04:51:45,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.2, 300 sec: 46874.9). Total num frames: 1027833856. Throughput: 0: 46795.5. Samples: 1029131300. Policy #0 lag: (min: 0.0, avg: 37.6, max: 90.0) [2024-03-21 04:51:45,522][03784] Avg episode reward: [(0, '1.674')] [2024-03-21 04:51:50,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45329.2, 300 sec: 46319.5). Total num frames: 1027932160. Throughput: 0: 47466.6. Samples: 1029448800. Policy #0 lag: (min: 0.0, avg: 37.6, max: 90.0) [2024-03-21 04:51:50,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 04:51:51,563][04017] Updated weights for policy 0, policy_version 31371 (0.0033) [2024-03-21 04:51:55,057][04017] Updated weights for policy 0, policy_version 31381 (0.0016) [2024-03-21 04:51:55,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 1028325376. Throughput: 0: 47379.9. Samples: 1029718200. 
Policy #0 lag: (min: 0.0, avg: 37.6, max: 90.0) [2024-03-21 04:51:55,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 04:52:00,521][03784] Fps is (10 sec: 62258.6, 60 sec: 48059.6, 300 sec: 47652.4). Total num frames: 1028554752. Throughput: 0: 47035.4. Samples: 1029845000. Policy #0 lag: (min: 0.0, avg: 32.0, max: 67.0) [2024-03-21 04:52:00,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 04:52:00,812][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031390_1028587520.pth... [2024-03-21 04:52:00,965][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031043_1017217024.pth [2024-03-21 04:52:02,205][04017] Updated weights for policy 0, policy_version 31391 (0.0010) [2024-03-21 04:52:05,521][03784] Fps is (10 sec: 58981.8, 60 sec: 49151.9, 300 sec: 47763.5). Total num frames: 1028915200. Throughput: 0: 46986.5. Samples: 1030127700. Policy #0 lag: (min: 0.0, avg: 32.0, max: 67.0) [2024-03-21 04:52:05,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 04:52:05,668][04017] Updated weights for policy 0, policy_version 31401 (0.0013) [2024-03-21 04:52:10,521][03784] Fps is (10 sec: 72089.6, 60 sec: 49151.9, 300 sec: 47985.7). Total num frames: 1029275648. Throughput: 0: 45764.3. Samples: 1030370000. Policy #0 lag: (min: 0.0, avg: 32.0, max: 67.0) [2024-03-21 04:52:10,522][03784] Avg episode reward: [(0, '0.983')] [2024-03-21 04:52:10,720][04017] Updated weights for policy 0, policy_version 31412 (0.0014) [2024-03-21 04:52:15,521][03784] Fps is (10 sec: 49152.6, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 1029406720. Throughput: 0: 46100.1. Samples: 1030506200. Policy #0 lag: (min: 0.0, avg: 32.0, max: 67.0) [2024-03-21 04:52:15,522][03784] Avg episode reward: [(0, '0.801')] [2024-03-21 04:52:18,246][03995] Signal inference workers to stop experience collection... 
(20700 times) [2024-03-21 04:52:18,298][04017] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-03-21 04:52:18,374][03995] Signal inference workers to resume experience collection... (20700 times) [2024-03-21 04:52:18,375][04017] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-03-21 04:52:18,722][04017] Updated weights for policy 0, policy_version 31422 (0.0013) [2024-03-21 04:52:20,521][03784] Fps is (10 sec: 45875.8, 60 sec: 46967.5, 300 sec: 47652.5). Total num frames: 1029734400. Throughput: 0: 46802.3. Samples: 1030798800. Policy #0 lag: (min: 2.0, avg: 46.0, max: 114.0) [2024-03-21 04:52:20,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 04:52:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.5, 300 sec: 47097.1). Total num frames: 1029832704. Throughput: 0: 47351.1. Samples: 1031112300. Policy #0 lag: (min: 2.0, avg: 46.0, max: 114.0) [2024-03-21 04:52:25,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 04:52:27,818][04017] Updated weights for policy 0, policy_version 31432 (0.0010) [2024-03-21 04:52:30,521][03784] Fps is (10 sec: 39321.3, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 1030127616. Throughput: 0: 47000.0. Samples: 1031246300. Policy #0 lag: (min: 2.0, avg: 46.0, max: 114.0) [2024-03-21 04:52:30,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 04:52:35,521][03784] Fps is (10 sec: 39322.2, 60 sec: 45875.3, 300 sec: 46874.9). Total num frames: 1030225920. Throughput: 0: 46149.0. Samples: 1031525500. Policy #0 lag: (min: 2.0, avg: 46.0, max: 114.0) [2024-03-21 04:52:35,521][03784] Avg episode reward: [(0, '0.894')] [2024-03-21 04:52:37,780][04017] Updated weights for policy 0, policy_version 31442 (0.0012) [2024-03-21 04:52:40,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 1030356992. Throughput: 0: 46928.9. Samples: 1031830000. 
Policy #0 lag: (min: 0.0, avg: 25.8, max: 69.0) [2024-03-21 04:52:40,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 04:52:45,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 1030553600. Throughput: 0: 46960.1. Samples: 1031958200. Policy #0 lag: (min: 0.0, avg: 25.8, max: 69.0) [2024-03-21 04:52:45,522][03784] Avg episode reward: [(0, '1.285')] [2024-03-21 04:52:47,006][04017] Updated weights for policy 0, policy_version 31452 (0.0021) [2024-03-21 04:52:50,175][04017] Updated weights for policy 0, policy_version 31462 (0.0017) [2024-03-21 04:52:50,521][03784] Fps is (10 sec: 62258.9, 60 sec: 50790.4, 300 sec: 46986.0). Total num frames: 1030979584. Throughput: 0: 46857.9. Samples: 1032236300. Policy #0 lag: (min: 0.0, avg: 25.8, max: 69.0) [2024-03-21 04:52:50,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 04:52:55,399][04017] Updated weights for policy 0, policy_version 31472 (0.0011) [2024-03-21 04:52:55,521][03784] Fps is (10 sec: 72089.7, 60 sec: 49152.0, 300 sec: 47652.5). Total num frames: 1031274496. Throughput: 0: 47864.6. Samples: 1032523900. Policy #0 lag: (min: 0.0, avg: 25.8, max: 69.0) [2024-03-21 04:52:55,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 04:53:00,521][03784] Fps is (10 sec: 52429.1, 60 sec: 49152.1, 300 sec: 47541.4). Total num frames: 1031503872. Throughput: 0: 47666.7. Samples: 1032651200. Policy #0 lag: (min: 0.0, avg: 25.8, max: 69.0) [2024-03-21 04:53:00,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 04:53:04,913][04017] Updated weights for policy 0, policy_version 31482 (0.0019) [2024-03-21 04:53:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.2, 300 sec: 46986.0). Total num frames: 1031634944. Throughput: 0: 47535.5. Samples: 1032937900. 
Policy #0 lag: (min: 0.0, avg: 37.2, max: 84.0) [2024-03-21 04:53:05,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 04:53:10,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42598.5, 300 sec: 46874.9). Total num frames: 1031831552. Throughput: 0: 46886.7. Samples: 1033222200. Policy #0 lag: (min: 0.0, avg: 37.2, max: 84.0) [2024-03-21 04:53:10,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 04:53:11,815][04017] Updated weights for policy 0, policy_version 31492 (0.0012) [2024-03-21 04:53:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 42598.4, 300 sec: 46541.7). Total num frames: 1031962624. Throughput: 0: 47111.1. Samples: 1033366300. Policy #0 lag: (min: 0.0, avg: 37.2, max: 84.0) [2024-03-21 04:53:15,522][03784] Avg episode reward: [(0, '1.499')] [2024-03-21 04:53:15,562][03995] Signal inference workers to stop experience collection... (20750 times) [2024-03-21 04:53:15,626][04017] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-03-21 04:53:15,637][03995] Signal inference workers to resume experience collection... (20750 times) [2024-03-21 04:53:15,665][04017] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-03-21 04:53:19,888][04017] Updated weights for policy 0, policy_version 31502 (0.0025) [2024-03-21 04:53:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 47097.1). Total num frames: 1032290304. Throughput: 0: 47177.6. Samples: 1033648500. Policy #0 lag: (min: 0.0, avg: 37.2, max: 84.0) [2024-03-21 04:53:20,522][03784] Avg episode reward: [(0, '1.499')] [2024-03-21 04:53:25,521][03784] Fps is (10 sec: 55705.5, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 1032519680. Throughput: 0: 46277.7. Samples: 1033912500. 
Policy #0 lag: (min: 0.0, avg: 29.0, max: 73.0) [2024-03-21 04:53:25,522][03784] Avg episode reward: [(0, '0.932')] [2024-03-21 04:53:26,492][04017] Updated weights for policy 0, policy_version 31512 (0.0012) [2024-03-21 04:53:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 46763.8). Total num frames: 1032749056. Throughput: 0: 46479.9. Samples: 1034049800. Policy #0 lag: (min: 0.0, avg: 29.0, max: 73.0) [2024-03-21 04:53:30,522][03784] Avg episode reward: [(0, '0.626')] [2024-03-21 04:53:32,529][04017] Updated weights for policy 0, policy_version 31522 (0.0024) [2024-03-21 04:53:35,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.3, 300 sec: 47097.1). Total num frames: 1033043968. Throughput: 0: 46360.0. Samples: 1034322500. Policy #0 lag: (min: 0.0, avg: 29.0, max: 73.0) [2024-03-21 04:53:35,522][03784] Avg episode reward: [(0, '1.382')] [2024-03-21 04:53:38,377][04017] Updated weights for policy 0, policy_version 31532 (0.0024) [2024-03-21 04:53:40,521][03784] Fps is (10 sec: 65535.5, 60 sec: 50790.3, 300 sec: 47541.4). Total num frames: 1033404416. Throughput: 0: 46282.1. Samples: 1034606600. Policy #0 lag: (min: 0.0, avg: 29.0, max: 73.0) [2024-03-21 04:53:40,522][03784] Avg episode reward: [(0, '1.382')] [2024-03-21 04:53:43,781][04017] Updated weights for policy 0, policy_version 31542 (0.0016) [2024-03-21 04:53:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 51882.6, 300 sec: 47430.3). Total num frames: 1033666560. Throughput: 0: 46882.2. Samples: 1034760900. Policy #0 lag: (min: 1.0, avg: 39.1, max: 70.0) [2024-03-21 04:53:45,522][03784] Avg episode reward: [(0, '1.466')] [2024-03-21 04:53:50,521][03784] Fps is (10 sec: 45875.5, 60 sec: 48059.7, 300 sec: 47652.4). Total num frames: 1033863168. Throughput: 0: 46824.4. Samples: 1035045000. 
Policy #0 lag: (min: 1.0, avg: 39.1, max: 70.0) [2024-03-21 04:53:50,522][03784] Avg episode reward: [(0, '1.388')] [2024-03-21 04:53:51,392][04017] Updated weights for policy 0, policy_version 31552 (0.0011) [2024-03-21 04:53:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 1034125312. Throughput: 0: 46788.9. Samples: 1035327700. Policy #0 lag: (min: 1.0, avg: 39.1, max: 70.0) [2024-03-21 04:53:55,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 04:53:58,373][04017] Updated weights for policy 0, policy_version 31562 (0.0010) [2024-03-21 04:54:00,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 1034321920. Throughput: 0: 47166.7. Samples: 1035488800. Policy #0 lag: (min: 1.0, avg: 39.1, max: 70.0) [2024-03-21 04:54:00,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 04:54:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031565_1034321920.pth... [2024-03-21 04:54:00,652][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031221_1023049728.pth [2024-03-21 04:54:05,521][03784] Fps is (10 sec: 29491.0, 60 sec: 46421.2, 300 sec: 46430.6). Total num frames: 1034420224. Throughput: 0: 47277.7. Samples: 1035776000. Policy #0 lag: (min: 1.0, avg: 39.1, max: 70.0) [2024-03-21 04:54:05,523][03784] Avg episode reward: [(0, '0.820')] [2024-03-21 04:54:08,755][04017] Updated weights for policy 0, policy_version 31572 (0.0015) [2024-03-21 04:54:09,491][03995] Signal inference workers to stop experience collection... (20800 times) [2024-03-21 04:54:09,492][03995] Signal inference workers to resume experience collection... 
(20800 times) [2024-03-21 04:54:09,565][04017] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-03-21 04:54:09,565][04017] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-03-21 04:54:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 47513.5, 300 sec: 46319.5). Total num frames: 1034682368. Throughput: 0: 47868.8. Samples: 1036066600. Policy #0 lag: (min: 0.0, avg: 28.1, max: 70.0) [2024-03-21 04:54:10,522][03784] Avg episode reward: [(0, '1.158')] [2024-03-21 04:54:12,992][04017] Updated weights for policy 0, policy_version 31582 (0.0016) [2024-03-21 04:54:15,521][03784] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 46874.9). Total num frames: 1034944512. Throughput: 0: 47491.2. Samples: 1036186900. Policy #0 lag: (min: 0.0, avg: 28.1, max: 70.0) [2024-03-21 04:54:15,522][03784] Avg episode reward: [(0, '1.520')] [2024-03-21 04:54:20,521][03784] Fps is (10 sec: 45875.8, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 1035141120. Throughput: 0: 47775.6. Samples: 1036472400. Policy #0 lag: (min: 0.0, avg: 28.1, max: 70.0) [2024-03-21 04:54:20,522][03784] Avg episode reward: [(0, '1.291')] [2024-03-21 04:54:21,550][04017] Updated weights for policy 0, policy_version 31592 (0.0015) [2024-03-21 04:54:25,521][03784] Fps is (10 sec: 45875.5, 60 sec: 48059.8, 300 sec: 47541.4). Total num frames: 1035403264. Throughput: 0: 47504.6. Samples: 1036744300. Policy #0 lag: (min: 0.0, avg: 28.1, max: 70.0) [2024-03-21 04:54:25,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 04:54:28,783][04017] Updated weights for policy 0, policy_version 31602 (0.0011) [2024-03-21 04:54:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1035599872. Throughput: 0: 46860.0. Samples: 1036869600. 
Policy #0 lag: (min: 1.0, avg: 58.8, max: 110.0) [2024-03-21 04:54:30,522][03784] Avg episode reward: [(0, '0.515')] [2024-03-21 04:54:35,521][03784] Fps is (10 sec: 42597.0, 60 sec: 46421.2, 300 sec: 46874.9). Total num frames: 1035829248. Throughput: 0: 46813.1. Samples: 1037151600. Policy #0 lag: (min: 1.0, avg: 58.8, max: 110.0) [2024-03-21 04:54:35,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 04:54:35,811][04017] Updated weights for policy 0, policy_version 31612 (0.0011) [2024-03-21 04:54:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44236.9, 300 sec: 47208.1). Total num frames: 1036058624. Throughput: 0: 46746.7. Samples: 1037431300. Policy #0 lag: (min: 1.0, avg: 58.8, max: 110.0) [2024-03-21 04:54:40,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 04:54:44,326][04017] Updated weights for policy 0, policy_version 31622 (0.0010) [2024-03-21 04:54:45,521][03784] Fps is (10 sec: 45876.4, 60 sec: 43690.7, 300 sec: 47097.1). Total num frames: 1036288000. Throughput: 0: 46302.3. Samples: 1037572400. Policy #0 lag: (min: 1.0, avg: 58.8, max: 110.0) [2024-03-21 04:54:45,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 04:54:49,973][04017] Updated weights for policy 0, policy_version 31632 (0.0010) [2024-03-21 04:54:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 1036550144. Throughput: 0: 46302.2. Samples: 1037859600. Policy #0 lag: (min: 0.0, avg: 34.2, max: 65.0) [2024-03-21 04:54:50,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 04:54:53,588][04017] Updated weights for policy 0, policy_version 31642 (0.0010) [2024-03-21 04:54:55,521][03784] Fps is (10 sec: 65535.7, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 1036943360. Throughput: 0: 45713.4. Samples: 1038123700. 
Policy #0 lag: (min: 0.0, avg: 34.2, max: 65.0) [2024-03-21 04:54:55,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 04:55:00,521][03784] Fps is (10 sec: 52427.6, 60 sec: 45875.0, 300 sec: 46763.8). Total num frames: 1037074432. Throughput: 0: 46581.9. Samples: 1038283100. Policy #0 lag: (min: 0.0, avg: 34.2, max: 65.0) [2024-03-21 04:55:00,523][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 04:55:01,470][04017] Updated weights for policy 0, policy_version 31652 (0.0011) [2024-03-21 04:55:02,405][03995] Signal inference workers to stop experience collection... (20850 times) [2024-03-21 04:55:02,473][03995] Signal inference workers to resume experience collection... (20850 times) [2024-03-21 04:55:02,492][04017] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-03-21 04:55:02,527][04017] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-03-21 04:55:05,521][03784] Fps is (10 sec: 45874.7, 60 sec: 49698.1, 300 sec: 47541.4). Total num frames: 1037402112. Throughput: 0: 46302.0. Samples: 1038556000. Policy #0 lag: (min: 0.0, avg: 34.2, max: 65.0) [2024-03-21 04:55:05,522][03784] Avg episode reward: [(0, '0.928')] [2024-03-21 04:55:06,606][04017] Updated weights for policy 0, policy_version 31662 (0.0023) [2024-03-21 04:55:10,521][03784] Fps is (10 sec: 62260.9, 60 sec: 50244.3, 300 sec: 48096.8). Total num frames: 1037697024. Throughput: 0: 46655.5. Samples: 1038843800. Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 04:55:10,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 04:55:15,521][03784] Fps is (10 sec: 39322.2, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 1037795328. Throughput: 0: 47326.6. Samples: 1038999300. 
Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 04:55:15,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 04:55:16,197][04017] Updated weights for policy 0, policy_version 31672 (0.0022) [2024-03-21 04:55:20,521][03784] Fps is (10 sec: 22937.3, 60 sec: 46421.2, 300 sec: 46430.6). Total num frames: 1037926400. Throughput: 0: 47335.7. Samples: 1039281700. Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 04:55:20,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 04:55:23,935][04017] Updated weights for policy 0, policy_version 31682 (0.0015) [2024-03-21 04:55:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 1038221312. Throughput: 0: 47037.8. Samples: 1039548000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 04:55:25,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 04:55:30,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 1038385152. Throughput: 0: 46971.1. Samples: 1039686100. Policy #0 lag: (min: 0.0, avg: 31.2, max: 80.0) [2024-03-21 04:55:30,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 04:55:31,321][04017] Updated weights for policy 0, policy_version 31692 (0.0015) [2024-03-21 04:55:35,521][03784] Fps is (10 sec: 29491.3, 60 sec: 44783.1, 300 sec: 46430.6). Total num frames: 1038516224. Throughput: 0: 47724.5. Samples: 1040007200. Policy #0 lag: (min: 0.0, avg: 31.2, max: 80.0) [2024-03-21 04:55:35,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 04:55:39,425][04017] Updated weights for policy 0, policy_version 31702 (0.0018) [2024-03-21 04:55:40,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 1038876672. Throughput: 0: 48133.3. Samples: 1040289700. 
Policy #0 lag: (min: 0.0, avg: 31.2, max: 80.0) [2024-03-21 04:55:40,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 04:55:44,092][04017] Updated weights for policy 0, policy_version 31712 (0.0016) [2024-03-21 04:55:45,521][03784] Fps is (10 sec: 72089.0, 60 sec: 49151.9, 300 sec: 47541.4). Total num frames: 1039237120. Throughput: 0: 47482.4. Samples: 1040419800. Policy #0 lag: (min: 0.0, avg: 31.2, max: 80.0) [2024-03-21 04:55:45,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 04:55:50,020][04017] Updated weights for policy 0, policy_version 31722 (0.0011) [2024-03-21 04:55:50,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 1039499264. Throughput: 0: 47509.0. Samples: 1040693900. Policy #0 lag: (min: 1.0, avg: 35.7, max: 101.0) [2024-03-21 04:55:50,522][03784] Avg episode reward: [(0, '0.694')] [2024-03-21 04:55:55,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1039695872. Throughput: 0: 47331.1. Samples: 1040973700. Policy #0 lag: (min: 1.0, avg: 35.7, max: 101.0) [2024-03-21 04:55:55,522][03784] Avg episode reward: [(0, '1.227')] [2024-03-21 04:55:56,735][03995] Signal inference workers to stop experience collection... (20900 times) [2024-03-21 04:55:56,789][04017] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-03-21 04:55:56,806][03995] Signal inference workers to resume experience collection... (20900 times) [2024-03-21 04:55:56,840][04017] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-03-21 04:55:56,847][04017] Updated weights for policy 0, policy_version 31732 (0.0017) [2024-03-21 04:56:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48059.9, 300 sec: 47430.3). Total num frames: 1039958016. Throughput: 0: 46891.0. Samples: 1041109400. 
Policy #0 lag: (min: 1.0, avg: 35.7, max: 101.0) [2024-03-21 04:56:00,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 04:56:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031737_1039958016.pth... [2024-03-21 04:56:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031390_1028587520.pth [2024-03-21 04:56:02,975][04017] Updated weights for policy 0, policy_version 31742 (0.0012) [2024-03-21 04:56:05,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45329.2, 300 sec: 46763.8). Total num frames: 1040121856. Throughput: 0: 46849.1. Samples: 1041389900. Policy #0 lag: (min: 1.0, avg: 35.7, max: 101.0) [2024-03-21 04:56:05,522][03784] Avg episode reward: [(0, '0.952')] [2024-03-21 04:56:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 42598.3, 300 sec: 46430.6). Total num frames: 1040252928. Throughput: 0: 47613.2. Samples: 1041690600. Policy #0 lag: (min: 1.0, avg: 35.7, max: 101.0) [2024-03-21 04:56:10,522][03784] Avg episode reward: [(0, '1.720')] [2024-03-21 04:56:14,414][04017] Updated weights for policy 0, policy_version 31752 (0.0016) [2024-03-21 04:56:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 1040547840. Throughput: 0: 47626.7. Samples: 1041829300. Policy #0 lag: (min: 2.0, avg: 37.5, max: 97.0) [2024-03-21 04:56:15,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 04:56:20,330][04017] Updated weights for policy 0, policy_version 31762 (0.0011) [2024-03-21 04:56:20,521][03784] Fps is (10 sec: 52429.5, 60 sec: 47513.7, 300 sec: 46763.8). Total num frames: 1040777216. Throughput: 0: 47353.3. Samples: 1042138100. Policy #0 lag: (min: 2.0, avg: 37.5, max: 97.0) [2024-03-21 04:56:20,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 04:56:24,426][04017] Updated weights for policy 0, policy_version 31772 (0.0026) [2024-03-21 04:56:25,521][03784] Fps is (10 sec: 62258.7, 60 sec: 49152.0, 300 sec: 47319.2). 
Total num frames: 1041170432. Throughput: 0: 47137.8. Samples: 1042410900. Policy #0 lag: (min: 2.0, avg: 37.5, max: 97.0) [2024-03-21 04:56:25,522][03784] Avg episode reward: [(0, '0.966')] [2024-03-21 04:56:30,521][03784] Fps is (10 sec: 62258.7, 60 sec: 50244.2, 300 sec: 47208.1). Total num frames: 1041399808. Throughput: 0: 47320.0. Samples: 1042549200. Policy #0 lag: (min: 2.0, avg: 37.5, max: 97.0) [2024-03-21 04:56:30,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 04:56:30,818][04017] Updated weights for policy 0, policy_version 31782 (0.0017) [2024-03-21 04:56:35,521][03784] Fps is (10 sec: 32767.7, 60 sec: 49698.0, 300 sec: 46874.9). Total num frames: 1041498112. Throughput: 0: 47882.2. Samples: 1042848600. Policy #0 lag: (min: 2.0, avg: 37.5, max: 97.0) [2024-03-21 04:56:35,522][03784] Avg episode reward: [(0, '1.276')] [2024-03-21 04:56:39,384][04017] Updated weights for policy 0, policy_version 31792 (0.0017) [2024-03-21 04:56:40,521][03784] Fps is (10 sec: 42598.8, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 1041825792. Throughput: 0: 47662.3. Samples: 1043118500. Policy #0 lag: (min: 0.0, avg: 49.5, max: 108.0) [2024-03-21 04:56:40,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 04:56:45,521][03784] Fps is (10 sec: 52429.4, 60 sec: 46421.4, 300 sec: 47763.5). Total num frames: 1042022400. Throughput: 0: 47735.6. Samples: 1043257500. Policy #0 lag: (min: 0.0, avg: 49.5, max: 108.0) [2024-03-21 04:56:45,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 04:56:48,725][04017] Updated weights for policy 0, policy_version 31802 (0.0010) [2024-03-21 04:56:49,403][03995] Signal inference workers to stop experience collection... (20950 times) [2024-03-21 04:56:49,467][03995] Signal inference workers to resume experience collection... 
(20950 times) [2024-03-21 04:56:49,471][04017] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-03-21 04:56:49,530][04017] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-03-21 04:56:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.3, 300 sec: 47208.1). Total num frames: 1042251776. Throughput: 0: 47775.5. Samples: 1043539800. Policy #0 lag: (min: 0.0, avg: 49.5, max: 108.0) [2024-03-21 04:56:50,522][03784] Avg episode reward: [(0, '1.152')] [2024-03-21 04:56:54,842][04017] Updated weights for policy 0, policy_version 31812 (0.0011) [2024-03-21 04:56:55,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 47097.1). Total num frames: 1042448384. Throughput: 0: 47633.5. Samples: 1043834100. Policy #0 lag: (min: 0.0, avg: 49.5, max: 108.0) [2024-03-21 04:56:55,522][03784] Avg episode reward: [(0, '0.705')] [2024-03-21 04:57:00,521][03784] Fps is (10 sec: 39321.3, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 1042644992. Throughput: 0: 47897.7. Samples: 1043984700. Policy #0 lag: (min: 0.0, avg: 38.3, max: 86.0) [2024-03-21 04:57:00,522][03784] Avg episode reward: [(0, '1.126')] [2024-03-21 04:57:01,316][04017] Updated weights for policy 0, policy_version 31822 (0.0020) [2024-03-21 04:57:04,642][04017] Updated weights for policy 0, policy_version 31832 (0.0017) [2024-03-21 04:57:05,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49151.9, 300 sec: 46763.8). Total num frames: 1043070976. Throughput: 0: 46566.6. Samples: 1044233600. Policy #0 lag: (min: 0.0, avg: 38.3, max: 86.0) [2024-03-21 04:57:05,522][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 04:57:10,521][03784] Fps is (10 sec: 58982.5, 60 sec: 49698.2, 300 sec: 46874.9). Total num frames: 1043234816. Throughput: 0: 46633.4. Samples: 1044509400. 
Policy #0 lag: (min: 0.0, avg: 38.3, max: 86.0) [2024-03-21 04:57:10,522][03784] Avg episode reward: [(0, '1.024')] [2024-03-21 04:57:13,585][04017] Updated weights for policy 0, policy_version 31842 (0.0014) [2024-03-21 04:57:15,521][03784] Fps is (10 sec: 39321.3, 60 sec: 48605.8, 300 sec: 46541.6). Total num frames: 1043464192. Throughput: 0: 46660.0. Samples: 1044648900. Policy #0 lag: (min: 0.0, avg: 38.3, max: 86.0) [2024-03-21 04:57:15,522][03784] Avg episode reward: [(0, '0.809')] [2024-03-21 04:57:20,420][04017] Updated weights for policy 0, policy_version 31852 (0.0013) [2024-03-21 04:57:20,521][03784] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 47097.0). Total num frames: 1043726336. Throughput: 0: 46140.1. Samples: 1044924900. Policy #0 lag: (min: 0.0, avg: 52.6, max: 121.0) [2024-03-21 04:57:20,522][03784] Avg episode reward: [(0, '1.439')] [2024-03-21 04:57:25,521][03784] Fps is (10 sec: 39322.3, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 1043857408. Throughput: 0: 46577.8. Samples: 1045214500. Policy #0 lag: (min: 0.0, avg: 52.6, max: 121.0) [2024-03-21 04:57:25,522][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 04:57:27,753][04017] Updated weights for policy 0, policy_version 31862 (0.0014) [2024-03-21 04:57:30,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45329.0, 300 sec: 47097.0). Total num frames: 1044119552. Throughput: 0: 46551.0. Samples: 1045352300. Policy #0 lag: (min: 0.0, avg: 52.6, max: 121.0) [2024-03-21 04:57:30,523][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 04:57:35,521][03784] Fps is (10 sec: 49151.3, 60 sec: 47513.7, 300 sec: 47430.3). Total num frames: 1044348928. Throughput: 0: 46471.0. Samples: 1045631000. 
Policy #0 lag: (min: 0.0, avg: 52.6, max: 121.0) [2024-03-21 04:57:35,522][03784] Avg episode reward: [(0, '0.747')] [2024-03-21 04:57:36,015][04017] Updated weights for policy 0, policy_version 31872 (0.0015) [2024-03-21 04:57:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45875.1, 300 sec: 47541.3). Total num frames: 1044578304. Throughput: 0: 46064.3. Samples: 1045907000. Policy #0 lag: (min: 1.0, avg: 54.6, max: 104.0) [2024-03-21 04:57:40,522][03784] Avg episode reward: [(0, '0.793')] [2024-03-21 04:57:41,295][03995] Signal inference workers to stop experience collection... (21000 times) [2024-03-21 04:57:41,357][04017] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-03-21 04:57:41,371][03995] Signal inference workers to resume experience collection... (21000 times) [2024-03-21 04:57:41,393][04017] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-03-21 04:57:42,107][04017] Updated weights for policy 0, policy_version 31882 (0.0028) [2024-03-21 04:57:45,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 1044774912. Throughput: 0: 45415.6. Samples: 1046028400. Policy #0 lag: (min: 1.0, avg: 54.6, max: 104.0) [2024-03-21 04:57:45,522][03784] Avg episode reward: [(0, '1.651')] [2024-03-21 04:57:50,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 1045004288. Throughput: 0: 46442.2. Samples: 1046323500. Policy #0 lag: (min: 1.0, avg: 54.6, max: 104.0) [2024-03-21 04:57:50,522][03784] Avg episode reward: [(0, '0.852')] [2024-03-21 04:57:50,915][04017] Updated weights for policy 0, policy_version 31892 (0.0011) [2024-03-21 04:57:55,521][03784] Fps is (10 sec: 55705.2, 60 sec: 48059.6, 300 sec: 46874.9). Total num frames: 1045331968. Throughput: 0: 46404.3. Samples: 1046597600. 
Policy #0 lag: (min: 1.0, avg: 54.6, max: 104.0) [2024-03-21 04:57:55,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 04:57:55,874][04017] Updated weights for policy 0, policy_version 31902 (0.0016) [2024-03-21 04:58:00,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48059.7, 300 sec: 47097.0). Total num frames: 1045528576. Throughput: 0: 46168.8. Samples: 1046726500. Policy #0 lag: (min: 1.0, avg: 54.6, max: 104.0) [2024-03-21 04:58:00,522][03784] Avg episode reward: [(0, '0.815')] [2024-03-21 04:58:00,872][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031909_1045594112.pth... [2024-03-21 04:58:00,986][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031565_1034321920.pth [2024-03-21 04:58:02,818][04017] Updated weights for policy 0, policy_version 31912 (0.0010) [2024-03-21 04:58:05,521][03784] Fps is (10 sec: 45875.9, 60 sec: 45329.1, 300 sec: 47319.2). Total num frames: 1045790720. Throughput: 0: 46193.4. Samples: 1047003600. Policy #0 lag: (min: 0.0, avg: 38.8, max: 77.0) [2024-03-21 04:58:05,522][03784] Avg episode reward: [(0, '0.966')] [2024-03-21 04:58:10,521][03784] Fps is (10 sec: 42598.9, 60 sec: 45329.1, 300 sec: 47430.3). Total num frames: 1045954560. Throughput: 0: 46299.9. Samples: 1047298000. Policy #0 lag: (min: 0.0, avg: 38.8, max: 77.0) [2024-03-21 04:58:10,522][03784] Avg episode reward: [(0, '1.128')] [2024-03-21 04:58:12,953][04017] Updated weights for policy 0, policy_version 31922 (0.0015) [2024-03-21 04:58:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44783.0, 300 sec: 46986.0). Total num frames: 1046151168. Throughput: 0: 46524.6. Samples: 1047445900. Policy #0 lag: (min: 0.0, avg: 38.8, max: 77.0) [2024-03-21 04:58:15,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 04:58:19,843][04017] Updated weights for policy 0, policy_version 31932 (0.0010) [2024-03-21 04:58:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 47097.1). 
Total num frames: 1046413312. Throughput: 0: 46904.5. Samples: 1047741700. Policy #0 lag: (min: 0.0, avg: 38.8, max: 77.0) [2024-03-21 04:58:20,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 04:58:23,361][04017] Updated weights for policy 0, policy_version 31942 (0.0012) [2024-03-21 04:58:25,521][03784] Fps is (10 sec: 65536.5, 60 sec: 49152.0, 300 sec: 47652.5). Total num frames: 1046806528. Throughput: 0: 46762.4. Samples: 1048011300. Policy #0 lag: (min: 4.0, avg: 55.0, max: 114.0) [2024-03-21 04:58:25,522][03784] Avg episode reward: [(0, '1.442')] [2024-03-21 04:58:29,099][04017] Updated weights for policy 0, policy_version 31952 (0.0011) [2024-03-21 04:58:29,111][03995] Signal inference workers to stop experience collection... (21050 times) [2024-03-21 04:58:29,111][03995] Signal inference workers to resume experience collection... (21050 times) [2024-03-21 04:58:29,177][04017] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-03-21 04:58:29,177][04017] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-03-21 04:58:30,521][03784] Fps is (10 sec: 62259.1, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 1047035904. Throughput: 0: 47284.4. Samples: 1048156200. Policy #0 lag: (min: 4.0, avg: 55.0, max: 114.0) [2024-03-21 04:58:30,522][03784] Avg episode reward: [(0, '1.329')] [2024-03-21 04:58:35,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 1047166976. Throughput: 0: 47280.0. Samples: 1048451100. Policy #0 lag: (min: 4.0, avg: 55.0, max: 114.0) [2024-03-21 04:58:35,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 04:58:38,753][04017] Updated weights for policy 0, policy_version 31962 (0.0011) [2024-03-21 04:58:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 47513.7, 300 sec: 46652.7). Total num frames: 1047429120. Throughput: 0: 47826.8. Samples: 1048749800. 
Policy #0 lag: (min: 4.0, avg: 55.0, max: 114.0) [2024-03-21 04:58:40,522][03784] Avg episode reward: [(0, '0.751')] [2024-03-21 04:58:44,243][04017] Updated weights for policy 0, policy_version 31972 (0.0019) [2024-03-21 04:58:45,521][03784] Fps is (10 sec: 55705.5, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 1047724032. Throughput: 0: 48204.5. Samples: 1048895700. Policy #0 lag: (min: 1.0, avg: 41.0, max: 81.0) [2024-03-21 04:58:45,522][03784] Avg episode reward: [(0, '0.751')] [2024-03-21 04:58:48,771][04017] Updated weights for policy 0, policy_version 31982 (0.0016) [2024-03-21 04:58:50,521][03784] Fps is (10 sec: 65535.8, 60 sec: 51336.6, 300 sec: 47319.2). Total num frames: 1048084480. Throughput: 0: 48037.7. Samples: 1049165300. Policy #0 lag: (min: 1.0, avg: 41.0, max: 81.0) [2024-03-21 04:58:50,522][03784] Avg episode reward: [(0, '1.204')] [2024-03-21 04:58:55,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 1048150016. Throughput: 0: 47966.6. Samples: 1049456500. Policy #0 lag: (min: 1.0, avg: 41.0, max: 81.0) [2024-03-21 04:58:55,522][03784] Avg episode reward: [(0, '0.689')] [2024-03-21 04:58:58,899][04017] Updated weights for policy 0, policy_version 31992 (0.0010) [2024-03-21 04:59:00,521][03784] Fps is (10 sec: 22937.7, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 1048313856. Throughput: 0: 47944.5. Samples: 1049603400. Policy #0 lag: (min: 1.0, avg: 41.0, max: 81.0) [2024-03-21 04:59:00,522][03784] Avg episode reward: [(0, '1.715')] [2024-03-21 04:59:05,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 1048510464. Throughput: 0: 48226.7. Samples: 1049911900. 
Policy #0 lag: (min: 0.0, avg: 35.3, max: 81.0) [2024-03-21 04:59:05,522][03784] Avg episode reward: [(0, '0.928')] [2024-03-21 04:59:08,376][04017] Updated weights for policy 0, policy_version 32002 (0.0020) [2024-03-21 04:59:10,521][03784] Fps is (10 sec: 52428.6, 60 sec: 48059.7, 300 sec: 47097.0). Total num frames: 1048838144. Throughput: 0: 47926.6. Samples: 1050168000. Policy #0 lag: (min: 0.0, avg: 35.3, max: 81.0) [2024-03-21 04:59:10,522][03784] Avg episode reward: [(0, '1.408')] [2024-03-21 04:59:11,847][04017] Updated weights for policy 0, policy_version 32012 (0.0018) [2024-03-21 04:59:15,521][03784] Fps is (10 sec: 62259.2, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 1049133056. Throughput: 0: 48008.9. Samples: 1050316600. Policy #0 lag: (min: 0.0, avg: 35.3, max: 81.0) [2024-03-21 04:59:15,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 04:59:20,521][03784] Fps is (10 sec: 42598.6, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 1049264128. Throughput: 0: 48128.9. Samples: 1050616900. Policy #0 lag: (min: 0.0, avg: 35.3, max: 81.0) [2024-03-21 04:59:20,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 04:59:21,758][04017] Updated weights for policy 0, policy_version 32022 (0.0009) [2024-03-21 04:59:25,521][03784] Fps is (10 sec: 29491.5, 60 sec: 43690.7, 300 sec: 46874.9). Total num frames: 1049427968. Throughput: 0: 48237.8. Samples: 1050920500. Policy #0 lag: (min: 0.0, avg: 35.3, max: 81.0) [2024-03-21 04:59:25,522][03784] Avg episode reward: [(0, '0.704')] [2024-03-21 04:59:25,531][03995] Signal inference workers to stop experience collection... (21100 times) [2024-03-21 04:59:25,531][03995] Signal inference workers to resume experience collection... 
(21100 times) [2024-03-21 04:59:25,742][04017] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-03-21 04:59:25,747][04017] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-03-21 04:59:27,468][04017] Updated weights for policy 0, policy_version 32032 (0.0011) [2024-03-21 04:59:30,521][03784] Fps is (10 sec: 58981.9, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 1049853952. Throughput: 0: 48022.2. Samples: 1051056700. Policy #0 lag: (min: 2.0, avg: 31.8, max: 71.0) [2024-03-21 04:59:30,522][03784] Avg episode reward: [(0, '0.791')] [2024-03-21 04:59:31,366][04017] Updated weights for policy 0, policy_version 32042 (0.0019) [2024-03-21 04:59:35,521][03784] Fps is (10 sec: 75365.7, 60 sec: 50244.3, 300 sec: 47874.6). Total num frames: 1050181632. Throughput: 0: 48108.9. Samples: 1051330200. Policy #0 lag: (min: 2.0, avg: 31.8, max: 71.0) [2024-03-21 04:59:35,522][03784] Avg episode reward: [(0, '0.873')] [2024-03-21 04:59:36,219][04017] Updated weights for policy 0, policy_version 32052 (0.0012) [2024-03-21 04:59:40,521][03784] Fps is (10 sec: 55706.0, 60 sec: 49698.1, 300 sec: 47874.6). Total num frames: 1050411008. Throughput: 0: 48302.3. Samples: 1051630100. Policy #0 lag: (min: 2.0, avg: 31.8, max: 71.0) [2024-03-21 04:59:40,522][03784] Avg episode reward: [(0, '0.952')] [2024-03-21 04:59:45,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46421.3, 300 sec: 47319.2). Total num frames: 1050509312. Throughput: 0: 48366.6. Samples: 1051779900. Policy #0 lag: (min: 2.0, avg: 31.8, max: 71.0) [2024-03-21 04:59:45,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 04:59:46,595][04017] Updated weights for policy 0, policy_version 32062 (0.0011) [2024-03-21 04:59:50,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 46986.0). Total num frames: 1050804224. Throughput: 0: 47935.6. Samples: 1052069000. 
Policy #0 lag: (min: 3.0, avg: 36.1, max: 71.0) [2024-03-21 04:59:50,522][03784] Avg episode reward: [(0, '1.376')] [2024-03-21 04:59:52,801][04017] Updated weights for policy 0, policy_version 32072 (0.0016) [2024-03-21 04:59:55,521][03784] Fps is (10 sec: 55706.0, 60 sec: 48606.0, 300 sec: 47430.3). Total num frames: 1051066368. Throughput: 0: 48989.0. Samples: 1052372500. Policy #0 lag: (min: 3.0, avg: 36.1, max: 71.0) [2024-03-21 04:59:55,522][03784] Avg episode reward: [(0, '1.376')] [2024-03-21 05:00:00,118][04017] Updated weights for policy 0, policy_version 32082 (0.0019) [2024-03-21 05:00:00,521][03784] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 47097.1). Total num frames: 1051295744. Throughput: 0: 48817.7. Samples: 1052513400. Policy #0 lag: (min: 3.0, avg: 36.1, max: 71.0) [2024-03-21 05:00:00,522][03784] Avg episode reward: [(0, '1.376')] [2024-03-21 05:00:00,782][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032084_1051328512.pth... [2024-03-21 05:00:00,891][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031737_1039958016.pth [2024-03-21 05:00:05,521][03784] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 46986.0). Total num frames: 1051557888. Throughput: 0: 48191.2. Samples: 1052785500. Policy #0 lag: (min: 3.0, avg: 36.1, max: 71.0) [2024-03-21 05:00:05,522][03784] Avg episode reward: [(0, '0.731')] [2024-03-21 05:00:05,685][04017] Updated weights for policy 0, policy_version 32092 (0.0015) [2024-03-21 05:00:10,521][03784] Fps is (10 sec: 39321.9, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 1051688960. Throughput: 0: 48159.9. Samples: 1053087700. Policy #0 lag: (min: 0.0, avg: 41.3, max: 84.0) [2024-03-21 05:00:10,522][03784] Avg episode reward: [(0, '1.415')] [2024-03-21 05:00:12,926][03995] Signal inference workers to stop experience collection... 
(21150 times) [2024-03-21 05:00:12,997][04017] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-03-21 05:00:13,203][03995] Signal inference workers to resume experience collection... (21150 times) [2024-03-21 05:00:13,204][04017] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-03-21 05:00:15,242][04017] Updated weights for policy 0, policy_version 32102 (0.0010) [2024-03-21 05:00:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.4, 300 sec: 47430.3). Total num frames: 1051918336. Throughput: 0: 48289.0. Samples: 1053229700. Policy #0 lag: (min: 0.0, avg: 41.3, max: 84.0) [2024-03-21 05:00:15,522][03784] Avg episode reward: [(0, '0.996')] [2024-03-21 05:00:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 1052147712. Throughput: 0: 48657.9. Samples: 1053519800. Policy #0 lag: (min: 0.0, avg: 41.3, max: 84.0) [2024-03-21 05:00:20,522][03784] Avg episode reward: [(0, '0.987')] [2024-03-21 05:00:22,046][04017] Updated weights for policy 0, policy_version 32112 (0.0020) [2024-03-21 05:00:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49151.9, 300 sec: 47430.3). Total num frames: 1052377088. Throughput: 0: 47119.9. Samples: 1053750500. Policy #0 lag: (min: 0.0, avg: 41.3, max: 84.0) [2024-03-21 05:00:25,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 05:00:30,224][04017] Updated weights for policy 0, policy_version 32122 (0.0012) [2024-03-21 05:00:30,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.2, 300 sec: 47652.5). Total num frames: 1052573696. Throughput: 0: 46891.2. Samples: 1053890000. Policy #0 lag: (min: 0.0, avg: 54.5, max: 122.0) [2024-03-21 05:00:30,522][03784] Avg episode reward: [(0, '1.266')] [2024-03-21 05:00:35,521][03784] Fps is (10 sec: 49152.0, 60 sec: 44782.9, 300 sec: 47430.3). Total num frames: 1052868608. Throughput: 0: 46499.9. Samples: 1054161500. 
Policy #0 lag: (min: 0.0, avg: 54.5, max: 122.0) [2024-03-21 05:00:35,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 05:00:35,886][04017] Updated weights for policy 0, policy_version 32132 (0.0016) [2024-03-21 05:00:40,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 1053163520. Throughput: 0: 46144.4. Samples: 1054449000. Policy #0 lag: (min: 0.0, avg: 54.5, max: 122.0) [2024-03-21 05:00:40,522][03784] Avg episode reward: [(0, '1.363')] [2024-03-21 05:00:41,978][04017] Updated weights for policy 0, policy_version 32142 (0.0011) [2024-03-21 05:00:45,521][03784] Fps is (10 sec: 55706.0, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 1053425664. Throughput: 0: 46282.3. Samples: 1054596100. Policy #0 lag: (min: 0.0, avg: 54.5, max: 122.0) [2024-03-21 05:00:45,522][03784] Avg episode reward: [(0, '1.207')] [2024-03-21 05:00:49,783][04017] Updated weights for policy 0, policy_version 32152 (0.0017) [2024-03-21 05:00:50,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 47097.1). Total num frames: 1053589504. Throughput: 0: 47059.9. Samples: 1054903200. Policy #0 lag: (min: 0.0, avg: 54.5, max: 122.0) [2024-03-21 05:00:50,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 05:00:54,525][04017] Updated weights for policy 0, policy_version 32162 (0.0016) [2024-03-21 05:00:55,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 1053982720. Throughput: 0: 46464.5. Samples: 1055178600. Policy #0 lag: (min: 0.0, avg: 33.7, max: 76.0) [2024-03-21 05:00:55,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 05:01:00,379][04017] Updated weights for policy 0, policy_version 32172 (0.0010) [2024-03-21 05:01:00,435][03995] Signal inference workers to stop experience collection... 
(21200 times) [2024-03-21 05:01:00,477][04017] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-03-21 05:01:00,522][03784] Fps is (10 sec: 62257.1, 60 sec: 48605.6, 300 sec: 47763.4). Total num frames: 1054212096. Throughput: 0: 46604.1. Samples: 1055326900. Policy #0 lag: (min: 0.0, avg: 33.7, max: 76.0) [2024-03-21 05:01:00,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 05:01:00,689][03995] Signal inference workers to resume experience collection... (21200 times) [2024-03-21 05:01:00,689][04017] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-03-21 05:01:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.7, 300 sec: 48096.8). Total num frames: 1054441472. Throughput: 0: 46260.0. Samples: 1055601500. Policy #0 lag: (min: 0.0, avg: 33.7, max: 76.0) [2024-03-21 05:01:05,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 05:01:09,445][04017] Updated weights for policy 0, policy_version 32182 (0.0015) [2024-03-21 05:01:10,521][03784] Fps is (10 sec: 32769.3, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1054539776. Throughput: 0: 47324.5. Samples: 1055880100. Policy #0 lag: (min: 0.0, avg: 33.7, max: 76.0) [2024-03-21 05:01:10,522][03784] Avg episode reward: [(0, '0.886')] [2024-03-21 05:01:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 48059.8, 300 sec: 47541.4). Total num frames: 1054801920. Throughput: 0: 47204.4. Samples: 1056014200. Policy #0 lag: (min: 1.0, avg: 39.6, max: 91.0) [2024-03-21 05:01:15,522][03784] Avg episode reward: [(0, '0.838')] [2024-03-21 05:01:16,853][04017] Updated weights for policy 0, policy_version 32192 (0.0016) [2024-03-21 05:01:20,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48605.8, 300 sec: 47097.1). Total num frames: 1055064064. Throughput: 0: 47355.6. Samples: 1056292500. 
Policy #0 lag: (min: 1.0, avg: 39.6, max: 91.0) [2024-03-21 05:01:20,522][03784] Avg episode reward: [(0, '0.717')] [2024-03-21 05:01:25,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45875.3, 300 sec: 46541.7). Total num frames: 1055129600. Throughput: 0: 47444.5. Samples: 1056584000. Policy #0 lag: (min: 1.0, avg: 39.6, max: 91.0) [2024-03-21 05:01:25,522][03784] Avg episode reward: [(0, '0.947')] [2024-03-21 05:01:26,808][04017] Updated weights for policy 0, policy_version 32202 (0.0010) [2024-03-21 05:01:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 47513.5, 300 sec: 47208.1). Total num frames: 1055424512. Throughput: 0: 47606.5. Samples: 1056738400. Policy #0 lag: (min: 1.0, avg: 39.6, max: 91.0) [2024-03-21 05:01:30,522][03784] Avg episode reward: [(0, '1.298')] [2024-03-21 05:01:31,409][04017] Updated weights for policy 0, policy_version 32212 (0.0016) [2024-03-21 05:01:35,521][03784] Fps is (10 sec: 68812.6, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 1055817728. Throughput: 0: 46568.9. Samples: 1056998800. Policy #0 lag: (min: 0.0, avg: 36.3, max: 71.0) [2024-03-21 05:01:35,522][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 05:01:36,514][04017] Updated weights for policy 0, policy_version 32222 (0.0011) [2024-03-21 05:01:40,521][03784] Fps is (10 sec: 55706.4, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 1055981568. Throughput: 0: 47040.0. Samples: 1057295400. Policy #0 lag: (min: 0.0, avg: 36.3, max: 71.0) [2024-03-21 05:01:40,522][03784] Avg episode reward: [(0, '0.553')] [2024-03-21 05:01:44,592][04017] Updated weights for policy 0, policy_version 32232 (0.0012) [2024-03-21 05:01:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 1056178176. Throughput: 0: 47218.2. Samples: 1057451700. 
Policy #0 lag: (min: 0.0, avg: 36.3, max: 71.0) [2024-03-21 05:01:45,522][03784] Avg episode reward: [(0, '0.553')] [2024-03-21 05:01:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1056440320. Throughput: 0: 47533.3. Samples: 1057740500. Policy #0 lag: (min: 0.0, avg: 36.3, max: 71.0) [2024-03-21 05:01:50,522][03784] Avg episode reward: [(0, '1.366')] [2024-03-21 05:01:51,108][04017] Updated weights for policy 0, policy_version 32242 (0.0015) [2024-03-21 05:01:55,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45329.0, 300 sec: 47652.5). Total num frames: 1056702464. Throughput: 0: 47655.5. Samples: 1058024600. Policy #0 lag: (min: 0.0, avg: 36.3, max: 71.0) [2024-03-21 05:01:55,522][03784] Avg episode reward: [(0, '0.578')] [2024-03-21 05:01:56,588][03995] Signal inference workers to stop experience collection... (21250 times) [2024-03-21 05:01:56,651][03995] Signal inference workers to resume experience collection... (21250 times) [2024-03-21 05:01:56,660][04017] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-03-21 05:01:56,662][04017] Updated weights for policy 0, policy_version 32252 (0.0015) [2024-03-21 05:01:56,701][04017] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-03-21 05:02:00,521][03784] Fps is (10 sec: 62259.2, 60 sec: 47513.9, 300 sec: 47430.3). Total num frames: 1057062912. Throughput: 0: 47571.0. Samples: 1058154900. Policy #0 lag: (min: 2.0, avg: 39.1, max: 89.0) [2024-03-21 05:02:00,522][03784] Avg episode reward: [(0, '1.144')] [2024-03-21 05:02:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032259_1057062912.pth... 
[2024-03-21 05:02:00,701][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000031909_1045594112.pth [2024-03-21 05:02:02,443][04017] Updated weights for policy 0, policy_version 32262 (0.0023) [2024-03-21 05:02:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.3, 300 sec: 47430.3). Total num frames: 1057226752. Throughput: 0: 47657.8. Samples: 1058437100. Policy #0 lag: (min: 2.0, avg: 39.1, max: 89.0) [2024-03-21 05:02:05,522][03784] Avg episode reward: [(0, '0.665')] [2024-03-21 05:02:10,521][03784] Fps is (10 sec: 39321.4, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 1057456128. Throughput: 0: 47533.2. Samples: 1058723000. Policy #0 lag: (min: 2.0, avg: 39.1, max: 89.0) [2024-03-21 05:02:10,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 05:02:10,670][04017] Updated weights for policy 0, policy_version 32272 (0.0037) [2024-03-21 05:02:15,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 1057619968. Throughput: 0: 47037.9. Samples: 1058855100. Policy #0 lag: (min: 2.0, avg: 39.1, max: 89.0) [2024-03-21 05:02:15,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 05:02:18,688][04017] Updated weights for policy 0, policy_version 32282 (0.0014) [2024-03-21 05:02:20,521][03784] Fps is (10 sec: 42598.9, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 1057882112. Throughput: 0: 47971.2. Samples: 1059157500. Policy #0 lag: (min: 0.0, avg: 36.8, max: 107.0) [2024-03-21 05:02:20,522][03784] Avg episode reward: [(0, '0.855')] [2024-03-21 05:02:25,201][04017] Updated weights for policy 0, policy_version 32292 (0.0011) [2024-03-21 05:02:25,521][03784] Fps is (10 sec: 52428.4, 60 sec: 50244.2, 300 sec: 47541.4). Total num frames: 1058144256. Throughput: 0: 47557.7. Samples: 1059435500. 
Policy #0 lag: (min: 0.0, avg: 36.8, max: 107.0) [2024-03-21 05:02:25,522][03784] Avg episode reward: [(0, '1.344')] [2024-03-21 05:02:30,521][03784] Fps is (10 sec: 39321.4, 60 sec: 47513.7, 300 sec: 47208.1). Total num frames: 1058275328. Throughput: 0: 47406.6. Samples: 1059585000. Policy #0 lag: (min: 0.0, avg: 36.8, max: 107.0) [2024-03-21 05:02:30,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 05:02:34,621][04017] Updated weights for policy 0, policy_version 32302 (0.0011) [2024-03-21 05:02:35,521][03784] Fps is (10 sec: 36045.2, 60 sec: 44783.0, 300 sec: 47208.2). Total num frames: 1058504704. Throughput: 0: 47600.1. Samples: 1059882500. Policy #0 lag: (min: 0.0, avg: 36.8, max: 107.0) [2024-03-21 05:02:35,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 05:02:40,521][03784] Fps is (10 sec: 49152.2, 60 sec: 46421.3, 300 sec: 47430.3). Total num frames: 1058766848. Throughput: 0: 47415.6. Samples: 1060158300. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 05:02:40,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 05:02:40,612][04017] Updated weights for policy 0, policy_version 32312 (0.0020) [2024-03-21 05:02:44,871][04017] Updated weights for policy 0, policy_version 32322 (0.0030) [2024-03-21 05:02:45,521][03784] Fps is (10 sec: 68812.0, 60 sec: 50244.2, 300 sec: 48096.8). Total num frames: 1059192832. Throughput: 0: 47362.2. Samples: 1060286200. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 05:02:45,522][03784] Avg episode reward: [(0, '1.385')] [2024-03-21 05:02:50,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 1059356672. Throughput: 0: 47644.5. Samples: 1060581100. 
Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 05:02:50,522][03784] Avg episode reward: [(0, '1.420')] [2024-03-21 05:02:53,760][04017] Updated weights for policy 0, policy_version 32332 (0.0022) [2024-03-21 05:02:54,284][03995] Signal inference workers to stop experience collection... (21300 times) [2024-03-21 05:02:54,285][03995] Signal inference workers to resume experience collection... (21300 times) [2024-03-21 05:02:54,361][04017] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-03-21 05:02:54,363][04017] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-03-21 05:02:55,521][03784] Fps is (10 sec: 32768.2, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1059520512. Throughput: 0: 47726.8. Samples: 1060870700. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 05:02:55,522][03784] Avg episode reward: [(0, '1.179')] [2024-03-21 05:02:59,533][04017] Updated weights for policy 0, policy_version 32342 (0.0015) [2024-03-21 05:03:00,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1059815424. Throughput: 0: 48048.9. Samples: 1061017300. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 05:03:00,522][03784] Avg episode reward: [(0, '0.947')] [2024-03-21 05:03:05,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45875.1, 300 sec: 47541.4). Total num frames: 1059979264. Throughput: 0: 48017.7. Samples: 1061318300. Policy #0 lag: (min: 1.0, avg: 43.8, max: 88.0) [2024-03-21 05:03:05,522][03784] Avg episode reward: [(0, '0.947')] [2024-03-21 05:03:07,528][04017] Updated weights for policy 0, policy_version 32352 (0.0014) [2024-03-21 05:03:10,521][03784] Fps is (10 sec: 55705.1, 60 sec: 48605.9, 300 sec: 48207.8). Total num frames: 1060372480. Throughput: 0: 47926.6. Samples: 1061592200. 
Policy #0 lag: (min: 1.0, avg: 43.8, max: 88.0) [2024-03-21 05:03:10,522][03784] Avg episode reward: [(0, '0.525')] [2024-03-21 05:03:10,946][04017] Updated weights for policy 0, policy_version 32362 (0.0021) [2024-03-21 05:03:15,521][03784] Fps is (10 sec: 65536.6, 60 sec: 50244.2, 300 sec: 48207.8). Total num frames: 1060634624. Throughput: 0: 47355.6. Samples: 1061716000. Policy #0 lag: (min: 1.0, avg: 43.8, max: 88.0) [2024-03-21 05:03:15,522][03784] Avg episode reward: [(0, '1.522')] [2024-03-21 05:03:18,845][04017] Updated weights for policy 0, policy_version 32372 (0.0010) [2024-03-21 05:03:20,521][03784] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 47652.4). Total num frames: 1060864000. Throughput: 0: 47031.0. Samples: 1061998900. Policy #0 lag: (min: 1.0, avg: 43.8, max: 88.0) [2024-03-21 05:03:20,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 05:03:25,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 1060962304. Throughput: 0: 47315.5. Samples: 1062287500. Policy #0 lag: (min: 0.0, avg: 36.7, max: 74.0) [2024-03-21 05:03:25,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 05:03:28,521][04017] Updated weights for policy 0, policy_version 32382 (0.0013) [2024-03-21 05:03:30,521][03784] Fps is (10 sec: 32767.9, 60 sec: 48605.8, 300 sec: 47541.4). Total num frames: 1061191680. Throughput: 0: 47533.3. Samples: 1062425200. Policy #0 lag: (min: 0.0, avg: 36.7, max: 74.0) [2024-03-21 05:03:30,522][03784] Avg episode reward: [(0, '1.256')] [2024-03-21 05:03:33,371][04017] Updated weights for policy 0, policy_version 32392 (0.0012) [2024-03-21 05:03:35,521][03784] Fps is (10 sec: 45875.5, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 1061421056. Throughput: 0: 47231.2. Samples: 1062706500. 
Policy #0 lag: (min: 0.0, avg: 36.7, max: 74.0) [2024-03-21 05:03:35,522][03784] Avg episode reward: [(0, '1.191')] [2024-03-21 05:03:40,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 1061683200. Throughput: 0: 47313.3. Samples: 1062999800. Policy #0 lag: (min: 0.0, avg: 36.7, max: 74.0) [2024-03-21 05:03:40,522][03784] Avg episode reward: [(0, '0.989')] [2024-03-21 05:03:42,260][04017] Updated weights for policy 0, policy_version 32402 (0.0017) [2024-03-21 05:03:45,521][03784] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 1061814272. Throughput: 0: 46997.8. Samples: 1063132200. Policy #0 lag: (min: 0.0, avg: 36.7, max: 74.0) [2024-03-21 05:03:45,522][03784] Avg episode reward: [(0, '0.925')] [2024-03-21 05:03:49,034][04017] Updated weights for policy 0, policy_version 32412 (0.0011) [2024-03-21 05:03:50,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 47208.2). Total num frames: 1062076416. Throughput: 0: 46580.1. Samples: 1063414400. Policy #0 lag: (min: 0.0, avg: 36.0, max: 73.0) [2024-03-21 05:03:50,522][03784] Avg episode reward: [(0, '1.151')] [2024-03-21 05:03:53,139][03995] Signal inference workers to stop experience collection... (21350 times) [2024-03-21 05:03:53,184][04017] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-03-21 05:03:53,420][03995] Signal inference workers to resume experience collection... (21350 times) [2024-03-21 05:03:53,420][04017] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-03-21 05:03:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 1062240256. Throughput: 0: 47320.1. Samples: 1063721600. 
Policy #0 lag: (min: 0.0, avg: 36.0, max: 73.0) [2024-03-21 05:03:55,522][03784] Avg episode reward: [(0, '0.922')] [2024-03-21 05:03:58,446][04017] Updated weights for policy 0, policy_version 32422 (0.0014) [2024-03-21 05:04:00,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 47541.4). Total num frames: 1062535168. Throughput: 0: 47733.3. Samples: 1063864000. Policy #0 lag: (min: 0.0, avg: 36.0, max: 73.0) [2024-03-21 05:04:00,522][03784] Avg episode reward: [(0, '1.050')] [2024-03-21 05:04:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032426_1062535168.pth... [2024-03-21 05:04:00,663][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032084_1051328512.pth [2024-03-21 05:04:02,584][04017] Updated weights for policy 0, policy_version 32432 (0.0012) [2024-03-21 05:04:05,521][03784] Fps is (10 sec: 72088.7, 60 sec: 49698.1, 300 sec: 47874.6). Total num frames: 1062961152. Throughput: 0: 47195.4. Samples: 1064122700. Policy #0 lag: (min: 0.0, avg: 36.0, max: 73.0) [2024-03-21 05:04:05,523][03784] Avg episode reward: [(0, '1.431')] [2024-03-21 05:04:08,756][04017] Updated weights for policy 0, policy_version 32442 (0.0016) [2024-03-21 05:04:10,521][03784] Fps is (10 sec: 58982.3, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 1063124992. Throughput: 0: 47319.9. Samples: 1064416900. Policy #0 lag: (min: 0.0, avg: 48.3, max: 106.0) [2024-03-21 05:04:10,522][03784] Avg episode reward: [(0, '1.431')] [2024-03-21 05:04:15,035][04017] Updated weights for policy 0, policy_version 32452 (0.0011) [2024-03-21 05:04:15,521][03784] Fps is (10 sec: 45876.0, 60 sec: 46421.4, 300 sec: 47985.7). Total num frames: 1063419904. Throughput: 0: 47417.9. Samples: 1064559000. 
Policy #0 lag: (min: 0.0, avg: 48.3, max: 106.0) [2024-03-21 05:04:15,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 05:04:19,318][04017] Updated weights for policy 0, policy_version 32462 (0.0014) [2024-03-21 05:04:20,521][03784] Fps is (10 sec: 58983.1, 60 sec: 47513.6, 300 sec: 48430.0). Total num frames: 1063714816. Throughput: 0: 46991.1. Samples: 1064821100. Policy #0 lag: (min: 0.0, avg: 48.3, max: 106.0) [2024-03-21 05:04:20,522][03784] Avg episode reward: [(0, '1.119')] [2024-03-21 05:04:25,521][03784] Fps is (10 sec: 58981.6, 60 sec: 50790.3, 300 sec: 47985.7). Total num frames: 1064009728. Throughput: 0: 47195.5. Samples: 1065123600. Policy #0 lag: (min: 0.0, avg: 48.3, max: 106.0) [2024-03-21 05:04:25,522][03784] Avg episode reward: [(0, '1.516')] [2024-03-21 05:04:25,771][04017] Updated weights for policy 0, policy_version 32472 (0.0020) [2024-03-21 05:04:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 49152.1, 300 sec: 47319.2). Total num frames: 1064140800. Throughput: 0: 47462.3. Samples: 1065268000. Policy #0 lag: (min: 0.0, avg: 48.3, max: 106.0) [2024-03-21 05:04:30,522][03784] Avg episode reward: [(0, '1.516')] [2024-03-21 05:04:35,521][03784] Fps is (10 sec: 22938.0, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 1064239104. Throughput: 0: 47997.8. Samples: 1065574300. Policy #0 lag: (min: 0.0, avg: 36.6, max: 74.0) [2024-03-21 05:04:35,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 05:04:39,919][04017] Updated weights for policy 0, policy_version 32482 (0.0022) [2024-03-21 05:04:40,521][03784] Fps is (10 sec: 22937.4, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 1064370176. Throughput: 0: 47415.5. Samples: 1065855300. Policy #0 lag: (min: 0.0, avg: 36.6, max: 74.0) [2024-03-21 05:04:40,522][03784] Avg episode reward: [(0, '0.553')] [2024-03-21 05:04:41,191][03995] Signal inference workers to stop experience collection... 
(21400 times) [2024-03-21 05:04:41,191][03995] Signal inference workers to resume experience collection... (21400 times) [2024-03-21 05:04:41,280][04017] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-03-21 05:04:41,281][04017] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-03-21 05:04:44,798][04017] Updated weights for policy 0, policy_version 32492 (0.0035) [2024-03-21 05:04:45,521][03784] Fps is (10 sec: 52428.4, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 1064763392. Throughput: 0: 47111.2. Samples: 1065984000. Policy #0 lag: (min: 0.0, avg: 36.6, max: 74.0) [2024-03-21 05:04:45,522][03784] Avg episode reward: [(0, '0.787')] [2024-03-21 05:04:49,512][04017] Updated weights for policy 0, policy_version 32502 (0.0014) [2024-03-21 05:04:50,521][03784] Fps is (10 sec: 68812.9, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 1065058304. Throughput: 0: 46924.5. Samples: 1066234300. Policy #0 lag: (min: 0.0, avg: 36.6, max: 74.0) [2024-03-21 05:04:50,522][03784] Avg episode reward: [(0, '1.165')] [2024-03-21 05:04:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 47319.2). Total num frames: 1065254912. Throughput: 0: 47131.2. Samples: 1066537800. Policy #0 lag: (min: 2.0, avg: 37.2, max: 80.0) [2024-03-21 05:04:55,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 05:04:57,807][04017] Updated weights for policy 0, policy_version 32512 (0.0026) [2024-03-21 05:05:00,521][03784] Fps is (10 sec: 45875.6, 60 sec: 49698.2, 300 sec: 47319.2). Total num frames: 1065517056. Throughput: 0: 47375.6. Samples: 1066690900. Policy #0 lag: (min: 2.0, avg: 37.2, max: 80.0) [2024-03-21 05:05:00,522][03784] Avg episode reward: [(0, '1.439')] [2024-03-21 05:05:04,494][04017] Updated weights for policy 0, policy_version 32522 (0.0014) [2024-03-21 05:05:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46967.6, 300 sec: 47763.5). Total num frames: 1065779200. Throughput: 0: 48097.7. 
Samples: 1066985500. Policy #0 lag: (min: 2.0, avg: 37.2, max: 80.0) [2024-03-21 05:05:05,522][03784] Avg episode reward: [(0, '1.439')] [2024-03-21 05:05:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45875.3, 300 sec: 47319.2). Total num frames: 1065877504. Throughput: 0: 47637.9. Samples: 1067267300. Policy #0 lag: (min: 2.0, avg: 37.2, max: 80.0) [2024-03-21 05:05:10,522][03784] Avg episode reward: [(0, '1.383')] [2024-03-21 05:05:12,444][04017] Updated weights for policy 0, policy_version 32532 (0.0020) [2024-03-21 05:05:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1066172416. Throughput: 0: 47431.1. Samples: 1067402400. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:05:15,522][03784] Avg episode reward: [(0, '1.101')] [2024-03-21 05:05:18,129][04017] Updated weights for policy 0, policy_version 32542 (0.0011) [2024-03-21 05:05:20,521][03784] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 47319.2). Total num frames: 1066336256. Throughput: 0: 47382.1. Samples: 1067706500. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:05:20,522][03784] Avg episode reward: [(0, '1.101')] [2024-03-21 05:05:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 42052.3, 300 sec: 47319.2). Total num frames: 1066532864. Throughput: 0: 47531.1. Samples: 1067994200. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:05:25,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 05:05:28,249][04017] Updated weights for policy 0, policy_version 32552 (0.0016) [2024-03-21 05:05:29,195][03995] Signal inference workers to stop experience collection... (21450 times) [2024-03-21 05:05:29,256][04017] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-03-21 05:05:29,450][03995] Signal inference workers to resume experience collection... 
(21450 times) [2024-03-21 05:05:29,450][04017] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-03-21 05:05:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1066893312. Throughput: 0: 47600.0. Samples: 1068126000. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:05:30,522][03784] Avg episode reward: [(0, '0.699')] [2024-03-21 05:05:33,245][04017] Updated weights for policy 0, policy_version 32562 (0.0010) [2024-03-21 05:05:35,521][03784] Fps is (10 sec: 65536.2, 60 sec: 49151.9, 300 sec: 47541.4). Total num frames: 1067188224. Throughput: 0: 48037.8. Samples: 1068396000. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:05:35,524][03784] Avg episode reward: [(0, '0.699')] [2024-03-21 05:05:37,173][04017] Updated weights for policy 0, policy_version 32572 (0.0011) [2024-03-21 05:05:40,521][03784] Fps is (10 sec: 72089.3, 60 sec: 54067.2, 300 sec: 48096.8). Total num frames: 1067614208. Throughput: 0: 47453.3. Samples: 1068673200. Policy #0 lag: (min: 2.0, avg: 39.9, max: 88.0) [2024-03-21 05:05:40,522][03784] Avg episode reward: [(0, '0.699')] [2024-03-21 05:05:40,734][04017] Updated weights for policy 0, policy_version 32582 (0.0011) [2024-03-21 05:05:45,521][03784] Fps is (10 sec: 62259.4, 60 sec: 50790.4, 300 sec: 48207.8). Total num frames: 1067810816. Throughput: 0: 47251.1. Samples: 1068817200. Policy #0 lag: (min: 2.0, avg: 39.9, max: 88.0) [2024-03-21 05:05:45,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 05:05:50,521][03784] Fps is (10 sec: 19660.8, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 1067810816. Throughput: 0: 47575.5. Samples: 1069126400. 
Policy #0 lag: (min: 2.0, avg: 39.9, max: 88.0) [2024-03-21 05:05:50,522][03784] Avg episode reward: [(0, '1.483')] [2024-03-21 05:05:51,951][04017] Updated weights for policy 0, policy_version 32592 (0.0011) [2024-03-21 05:05:55,521][03784] Fps is (10 sec: 29491.2, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 1068105728. Throughput: 0: 47531.1. Samples: 1069406200. Policy #0 lag: (min: 2.0, avg: 39.9, max: 88.0) [2024-03-21 05:05:55,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 05:05:58,813][04017] Updated weights for policy 0, policy_version 32602 (0.0019) [2024-03-21 05:06:00,521][03784] Fps is (10 sec: 55705.5, 60 sec: 47513.5, 300 sec: 47208.1). Total num frames: 1068367872. Throughput: 0: 47262.2. Samples: 1069529200. Policy #0 lag: (min: 0.0, avg: 46.4, max: 120.0) [2024-03-21 05:06:00,522][03784] Avg episode reward: [(0, '1.253')] [2024-03-21 05:06:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032604_1068367872.pth... [2024-03-21 05:06:00,673][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032259_1057062912.pth [2024-03-21 05:06:05,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 47319.2). Total num frames: 1068498944. Throughput: 0: 46971.2. Samples: 1069820200. Policy #0 lag: (min: 0.0, avg: 46.4, max: 120.0) [2024-03-21 05:06:05,522][03784] Avg episode reward: [(0, '1.283')] [2024-03-21 05:06:06,746][04017] Updated weights for policy 0, policy_version 32612 (0.0013) [2024-03-21 05:06:10,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 1068695552. Throughput: 0: 46393.3. Samples: 1070081900. Policy #0 lag: (min: 0.0, avg: 46.4, max: 120.0) [2024-03-21 05:06:10,522][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 05:06:15,521][03784] Fps is (10 sec: 29491.0, 60 sec: 43690.7, 300 sec: 46541.7). Total num frames: 1068793856. Throughput: 0: 46677.8. Samples: 1070226500. 
Policy #0 lag: (min: 0.0, avg: 46.4, max: 120.0) [2024-03-21 05:06:15,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 05:06:17,002][03995] Signal inference workers to stop experience collection... (21500 times) [2024-03-21 05:06:17,076][04017] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-03-21 05:06:17,257][03995] Signal inference workers to resume experience collection... (21500 times) [2024-03-21 05:06:17,257][04017] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-03-21 05:06:17,260][04017] Updated weights for policy 0, policy_version 32622 (0.0011) [2024-03-21 05:06:20,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 1069154304. Throughput: 0: 46620.0. Samples: 1070493900. Policy #0 lag: (min: 1.0, avg: 30.1, max: 75.0) [2024-03-21 05:06:20,522][03784] Avg episode reward: [(0, '1.444')] [2024-03-21 05:06:24,709][04017] Updated weights for policy 0, policy_version 32632 (0.0011) [2024-03-21 05:06:25,521][03784] Fps is (10 sec: 55706.1, 60 sec: 46967.6, 300 sec: 47208.2). Total num frames: 1069350912. Throughput: 0: 47044.6. Samples: 1070790200. Policy #0 lag: (min: 1.0, avg: 30.1, max: 75.0) [2024-03-21 05:06:25,522][03784] Avg episode reward: [(0, '1.375')] [2024-03-21 05:06:27,712][04017] Updated weights for policy 0, policy_version 32642 (0.0017) [2024-03-21 05:06:30,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 1069711360. Throughput: 0: 46464.5. Samples: 1070908100. Policy #0 lag: (min: 1.0, avg: 30.1, max: 75.0) [2024-03-21 05:06:30,522][03784] Avg episode reward: [(0, '1.261')] [2024-03-21 05:06:35,038][04017] Updated weights for policy 0, policy_version 32652 (0.0015) [2024-03-21 05:06:35,521][03784] Fps is (10 sec: 62258.7, 60 sec: 46421.3, 300 sec: 47430.3). Total num frames: 1069973504. Throughput: 0: 45766.7. Samples: 1071185900. 
Policy #0 lag: (min: 1.0, avg: 30.1, max: 75.0) [2024-03-21 05:06:35,522][03784] Avg episode reward: [(0, '1.261')] [2024-03-21 05:06:39,952][04017] Updated weights for policy 0, policy_version 32662 (0.0012) [2024-03-21 05:06:40,521][03784] Fps is (10 sec: 58982.0, 60 sec: 44783.0, 300 sec: 47874.6). Total num frames: 1070301184. Throughput: 0: 45493.3. Samples: 1071453400. Policy #0 lag: (min: 1.0, avg: 30.1, max: 75.0) [2024-03-21 05:06:40,522][03784] Avg episode reward: [(0, '1.353')] [2024-03-21 05:06:43,721][04017] Updated weights for policy 0, policy_version 32672 (0.0012) [2024-03-21 05:06:45,521][03784] Fps is (10 sec: 65536.2, 60 sec: 46967.5, 300 sec: 48096.8). Total num frames: 1070628864. Throughput: 0: 45284.5. Samples: 1071567000. Policy #0 lag: (min: 2.0, avg: 41.2, max: 76.0) [2024-03-21 05:06:45,522][03784] Avg episode reward: [(0, '0.581')] [2024-03-21 05:06:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 47874.6). Total num frames: 1070825472. Throughput: 0: 45577.7. Samples: 1071871200. Policy #0 lag: (min: 2.0, avg: 41.2, max: 76.0) [2024-03-21 05:06:50,522][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 05:06:55,521][03784] Fps is (10 sec: 19660.7, 60 sec: 45329.0, 300 sec: 46652.8). Total num frames: 1070825472. Throughput: 0: 46566.7. Samples: 1072177400. Policy #0 lag: (min: 2.0, avg: 41.2, max: 76.0) [2024-03-21 05:06:55,522][03784] Avg episode reward: [(0, '1.187')] [2024-03-21 05:07:00,521][03784] Fps is (10 sec: 6553.6, 60 sec: 42052.3, 300 sec: 46319.5). Total num frames: 1070891008. Throughput: 0: 46955.5. Samples: 1072339500. Policy #0 lag: (min: 2.0, avg: 41.2, max: 76.0) [2024-03-21 05:07:00,522][03784] Avg episode reward: [(0, '0.779')] [2024-03-21 05:07:00,934][04017] Updated weights for policy 0, policy_version 32682 (0.0015) [2024-03-21 05:07:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.6, 300 sec: 46319.5). Total num frames: 1071120384. Throughput: 0: 47504.3. Samples: 1072631600. 
Policy #0 lag: (min: 1.0, avg: 27.4, max: 66.0) [2024-03-21 05:07:05,522][03784] Avg episode reward: [(0, '0.522')] [2024-03-21 05:07:06,277][03995] Signal inference workers to stop experience collection... (21550 times) [2024-03-21 05:07:06,295][03995] Signal inference workers to resume experience collection... (21550 times) [2024-03-21 05:07:06,322][04017] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-03-21 05:07:06,360][04017] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-03-21 05:07:06,617][04017] Updated weights for policy 0, policy_version 32692 (0.0023) [2024-03-21 05:07:10,521][03784] Fps is (10 sec: 49152.5, 60 sec: 44783.0, 300 sec: 46652.8). Total num frames: 1071382528. Throughput: 0: 46635.6. Samples: 1072888800. Policy #0 lag: (min: 1.0, avg: 27.4, max: 66.0) [2024-03-21 05:07:10,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 05:07:13,666][04017] Updated weights for policy 0, policy_version 32702 (0.0010) [2024-03-21 05:07:15,521][03784] Fps is (10 sec: 58982.1, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 1071710208. Throughput: 0: 46759.8. Samples: 1073012300. Policy #0 lag: (min: 1.0, avg: 27.4, max: 66.0) [2024-03-21 05:07:15,522][03784] Avg episode reward: [(0, '0.565')] [2024-03-21 05:07:17,518][04017] Updated weights for policy 0, policy_version 32712 (0.0018) [2024-03-21 05:07:20,521][03784] Fps is (10 sec: 81918.6, 60 sec: 50790.3, 300 sec: 47652.4). Total num frames: 1072201728. Throughput: 0: 45364.4. Samples: 1073227300. Policy #0 lag: (min: 1.0, avg: 27.4, max: 66.0) [2024-03-21 05:07:20,522][03784] Avg episode reward: [(0, '0.724')] [2024-03-21 05:07:20,661][04017] Updated weights for policy 0, policy_version 32722 (0.0011) [2024-03-21 05:07:25,521][03784] Fps is (10 sec: 72089.9, 60 sec: 51336.4, 300 sec: 47985.7). Total num frames: 1072431104. Throughput: 0: 45837.7. Samples: 1073516100. 
Policy #0 lag: (min: 2.0, avg: 45.8, max: 98.0) [2024-03-21 05:07:25,522][03784] Avg episode reward: [(0, '1.176')] [2024-03-21 05:07:27,999][04017] Updated weights for policy 0, policy_version 32732 (0.0010) [2024-03-21 05:07:30,521][03784] Fps is (10 sec: 42599.0, 60 sec: 48605.9, 300 sec: 47874.6). Total num frames: 1072627712. Throughput: 0: 46733.4. Samples: 1073670000. Policy #0 lag: (min: 2.0, avg: 45.8, max: 98.0) [2024-03-21 05:07:30,522][03784] Avg episode reward: [(0, '0.863')] [2024-03-21 05:07:35,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 1072791552. Throughput: 0: 47106.7. Samples: 1073991000. Policy #0 lag: (min: 2.0, avg: 45.8, max: 98.0) [2024-03-21 05:07:35,522][03784] Avg episode reward: [(0, '0.609')] [2024-03-21 05:07:40,306][04017] Updated weights for policy 0, policy_version 32742 (0.0010) [2024-03-21 05:07:40,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43144.6, 300 sec: 46430.6). Total num frames: 1072889856. Throughput: 0: 47000.1. Samples: 1074292400. Policy #0 lag: (min: 2.0, avg: 45.8, max: 98.0) [2024-03-21 05:07:40,522][03784] Avg episode reward: [(0, '1.243')] [2024-03-21 05:07:45,387][04017] Updated weights for policy 0, policy_version 32752 (0.0018) [2024-03-21 05:07:45,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 46986.0). Total num frames: 1073217536. Throughput: 0: 46335.5. Samples: 1074424600. Policy #0 lag: (min: 3.0, avg: 48.2, max: 106.0) [2024-03-21 05:07:45,522][03784] Avg episode reward: [(0, '1.115')] [2024-03-21 05:07:50,521][03784] Fps is (10 sec: 49151.7, 60 sec: 42598.4, 300 sec: 46986.0). Total num frames: 1073381376. Throughput: 0: 46046.7. Samples: 1074703700. Policy #0 lag: (min: 3.0, avg: 48.2, max: 106.0) [2024-03-21 05:07:50,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 05:07:53,959][03995] Signal inference workers to stop experience collection... 
(21600 times) [2024-03-21 05:07:54,027][03995] Signal inference workers to resume experience collection... (21600 times) [2024-03-21 05:07:54,051][04017] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-03-21 05:07:54,095][04017] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-03-21 05:07:54,394][04017] Updated weights for policy 0, policy_version 32762 (0.0011) [2024-03-21 05:07:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 1073643520. Throughput: 0: 46657.6. Samples: 1074988400. Policy #0 lag: (min: 3.0, avg: 48.2, max: 106.0) [2024-03-21 05:07:55,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 05:07:58,910][04017] Updated weights for policy 0, policy_version 32772 (0.0013) [2024-03-21 05:08:00,521][03784] Fps is (10 sec: 55705.3, 60 sec: 50790.4, 300 sec: 47319.2). Total num frames: 1073938432. Throughput: 0: 46682.3. Samples: 1075113000. Policy #0 lag: (min: 3.0, avg: 48.2, max: 106.0) [2024-03-21 05:08:00,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 05:08:00,709][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032775_1073971200.pth... [2024-03-21 05:08:00,819][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032426_1062535168.pth [2024-03-21 05:08:04,787][04017] Updated weights for policy 0, policy_version 32782 (0.0016) [2024-03-21 05:08:05,521][03784] Fps is (10 sec: 58982.4, 60 sec: 51882.6, 300 sec: 46986.0). Total num frames: 1074233344. Throughput: 0: 48302.2. Samples: 1075400900. Policy #0 lag: (min: 3.0, avg: 48.2, max: 106.0) [2024-03-21 05:08:05,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 05:08:10,521][03784] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 46763.8). Total num frames: 1074429952. Throughput: 0: 48277.8. Samples: 1075688600. 
Policy #0 lag: (min: 1.0, avg: 57.1, max: 109.0) [2024-03-21 05:08:10,522][03784] Avg episode reward: [(0, '0.726')] [2024-03-21 05:08:14,162][04017] Updated weights for policy 0, policy_version 32792 (0.0020) [2024-03-21 05:08:15,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48605.9, 300 sec: 46652.7). Total num frames: 1074626560. Throughput: 0: 48559.9. Samples: 1075855200. Policy #0 lag: (min: 1.0, avg: 57.1, max: 109.0) [2024-03-21 05:08:15,522][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 05:08:18,852][04017] Updated weights for policy 0, policy_version 32802 (0.0017) [2024-03-21 05:08:20,521][03784] Fps is (10 sec: 49152.5, 60 sec: 45329.2, 300 sec: 47319.2). Total num frames: 1074921472. Throughput: 0: 47220.0. Samples: 1076115900. Policy #0 lag: (min: 1.0, avg: 57.1, max: 109.0) [2024-03-21 05:08:20,522][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 05:08:25,521][03784] Fps is (10 sec: 49152.3, 60 sec: 44783.0, 300 sec: 47208.1). Total num frames: 1075118080. Throughput: 0: 46617.7. Samples: 1076390200. Policy #0 lag: (min: 1.0, avg: 57.1, max: 109.0) [2024-03-21 05:08:25,522][03784] Avg episode reward: [(0, '1.302')] [2024-03-21 05:08:26,889][04017] Updated weights for policy 0, policy_version 32812 (0.0011) [2024-03-21 05:08:30,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 1075281920. Throughput: 0: 47046.7. Samples: 1076541700. Policy #0 lag: (min: 0.0, avg: 35.9, max: 73.0) [2024-03-21 05:08:30,522][03784] Avg episode reward: [(0, '0.656')] [2024-03-21 05:08:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 1075478528. Throughput: 0: 47408.9. Samples: 1076837100. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 73.0) [2024-03-21 05:08:35,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 05:08:36,954][04017] Updated weights for policy 0, policy_version 32822 (0.0017) [2024-03-21 05:08:40,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 1075707904. Throughput: 0: 47997.9. Samples: 1077148300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 73.0) [2024-03-21 05:08:40,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 05:08:41,950][04017] Updated weights for policy 0, policy_version 32832 (0.0011) [2024-03-21 05:08:45,061][03995] Signal inference workers to stop experience collection... (21650 times) [2024-03-21 05:08:45,127][04017] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-03-21 05:08:45,311][03995] Signal inference workers to resume experience collection... (21650 times) [2024-03-21 05:08:45,311][04017] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-03-21 05:08:45,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 1076002816. Throughput: 0: 48186.8. Samples: 1077281400. Policy #0 lag: (min: 0.0, avg: 35.9, max: 73.0) [2024-03-21 05:08:45,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 05:08:47,423][04017] Updated weights for policy 0, policy_version 32842 (0.0011) [2024-03-21 05:08:50,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49152.0, 300 sec: 47763.5). Total num frames: 1076330496. Throughput: 0: 48302.3. Samples: 1077574500. Policy #0 lag: (min: 0.0, avg: 35.9, max: 73.0) [2024-03-21 05:08:50,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 05:08:54,356][04017] Updated weights for policy 0, policy_version 32852 (0.0017) [2024-03-21 05:08:55,521][03784] Fps is (10 sec: 58981.9, 60 sec: 49152.0, 300 sec: 47652.5). Total num frames: 1076592640. Throughput: 0: 48526.7. Samples: 1077872300. 
Policy #0 lag: (min: 0.0, avg: 39.5, max: 92.0) [2024-03-21 05:08:55,522][03784] Avg episode reward: [(0, '0.862')] [2024-03-21 05:08:58,424][04017] Updated weights for policy 0, policy_version 32862 (0.0016) [2024-03-21 05:09:00,521][03784] Fps is (10 sec: 58982.1, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 1076920320. Throughput: 0: 47919.9. Samples: 1078011600. Policy #0 lag: (min: 0.0, avg: 39.5, max: 92.0) [2024-03-21 05:09:00,522][03784] Avg episode reward: [(0, '0.526')] [2024-03-21 05:09:04,024][04017] Updated weights for policy 0, policy_version 32872 (0.0017) [2024-03-21 05:09:05,521][03784] Fps is (10 sec: 58982.5, 60 sec: 49152.1, 300 sec: 47652.5). Total num frames: 1077182464. Throughput: 0: 48386.6. Samples: 1078293300. Policy #0 lag: (min: 0.0, avg: 39.5, max: 92.0) [2024-03-21 05:09:05,522][03784] Avg episode reward: [(0, '0.650')] [2024-03-21 05:09:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 1077379072. Throughput: 0: 48642.1. Samples: 1078579100. Policy #0 lag: (min: 0.0, avg: 39.5, max: 92.0) [2024-03-21 05:09:10,523][03784] Avg episode reward: [(0, '0.977')] [2024-03-21 05:09:11,324][04017] Updated weights for policy 0, policy_version 32882 (0.0011) [2024-03-21 05:09:15,521][03784] Fps is (10 sec: 45875.3, 60 sec: 50244.3, 300 sec: 47208.1). Total num frames: 1077641216. Throughput: 0: 47773.3. Samples: 1078691500. Policy #0 lag: (min: 0.0, avg: 39.5, max: 92.0) [2024-03-21 05:09:15,522][03784] Avg episode reward: [(0, '1.583')] [2024-03-21 05:09:20,306][04017] Updated weights for policy 0, policy_version 32892 (0.0013) [2024-03-21 05:09:20,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48059.7, 300 sec: 46763.8). Total num frames: 1077805056. Throughput: 0: 47191.1. Samples: 1078960700. 
Policy #0 lag: (min: 0.0, avg: 40.6, max: 90.0) [2024-03-21 05:09:20,522][03784] Avg episode reward: [(0, '1.283')] [2024-03-21 05:09:25,521][03784] Fps is (10 sec: 29491.0, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 1077936128. Throughput: 0: 46428.8. Samples: 1079237600. Policy #0 lag: (min: 0.0, avg: 40.6, max: 90.0) [2024-03-21 05:09:25,522][03784] Avg episode reward: [(0, '1.510')] [2024-03-21 05:09:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 1078067200. Throughput: 0: 47011.1. Samples: 1079396900. Policy #0 lag: (min: 0.0, avg: 40.6, max: 90.0) [2024-03-21 05:09:30,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 05:09:32,621][04017] Updated weights for policy 0, policy_version 32902 (0.0010) [2024-03-21 05:09:35,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46967.5, 300 sec: 47208.1). Total num frames: 1078296576. Throughput: 0: 47206.7. Samples: 1079698800. Policy #0 lag: (min: 0.0, avg: 40.6, max: 90.0) [2024-03-21 05:09:35,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 05:09:39,784][04017] Updated weights for policy 0, policy_version 32912 (0.0012) [2024-03-21 05:09:39,811][03995] Signal inference workers to stop experience collection... (21700 times) [2024-03-21 05:09:39,811][03995] Signal inference workers to resume experience collection... (21700 times) [2024-03-21 05:09:39,850][04017] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-03-21 05:09:39,851][04017] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-03-21 05:09:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 1078525952. Throughput: 0: 46804.5. Samples: 1079978500. 
Policy #0 lag: (min: 1.0, avg: 30.5, max: 71.0) [2024-03-21 05:09:40,522][03784] Avg episode reward: [(0, '1.536')] [2024-03-21 05:09:44,579][04017] Updated weights for policy 0, policy_version 32922 (0.0020) [2024-03-21 05:09:45,521][03784] Fps is (10 sec: 55705.7, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 1078853632. Throughput: 0: 46915.7. Samples: 1080122800. Policy #0 lag: (min: 1.0, avg: 30.5, max: 71.0) [2024-03-21 05:09:45,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 05:09:49,282][04017] Updated weights for policy 0, policy_version 32932 (0.0011) [2024-03-21 05:09:50,521][03784] Fps is (10 sec: 58982.5, 60 sec: 46421.4, 300 sec: 46986.0). Total num frames: 1079115776. Throughput: 0: 46495.6. Samples: 1080385600. Policy #0 lag: (min: 1.0, avg: 30.5, max: 71.0) [2024-03-21 05:09:50,522][03784] Avg episode reward: [(0, '0.860')] [2024-03-21 05:09:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 1079312384. Throughput: 0: 46566.7. Samples: 1080674600. Policy #0 lag: (min: 1.0, avg: 30.5, max: 71.0) [2024-03-21 05:09:55,522][03784] Avg episode reward: [(0, '1.542')] [2024-03-21 05:09:56,698][04017] Updated weights for policy 0, policy_version 32942 (0.0013) [2024-03-21 05:10:00,521][03784] Fps is (10 sec: 62259.2, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 1079738368. Throughput: 0: 46875.5. Samples: 1080800900. Policy #0 lag: (min: 1.0, avg: 30.5, max: 71.0) [2024-03-21 05:10:00,522][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 05:10:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032951_1079738368.pth... [2024-03-21 05:10:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032604_1068367872.pth [2024-03-21 05:10:02,112][04017] Updated weights for policy 0, policy_version 32952 (0.0017) [2024-03-21 05:10:05,521][03784] Fps is (10 sec: 68812.4, 60 sec: 46967.4, 300 sec: 47874.6). 
Total num frames: 1080000512. Throughput: 0: 47142.2. Samples: 1081082100. Policy #0 lag: (min: 0.0, avg: 40.8, max: 72.0) [2024-03-21 05:10:05,522][03784] Avg episode reward: [(0, '1.530')] [2024-03-21 05:10:06,488][04017] Updated weights for policy 0, policy_version 32962 (0.0010) [2024-03-21 05:10:10,521][03784] Fps is (10 sec: 45874.7, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 1080197120. Throughput: 0: 47513.3. Samples: 1081375700. Policy #0 lag: (min: 0.0, avg: 40.8, max: 72.0) [2024-03-21 05:10:10,522][03784] Avg episode reward: [(0, '1.227')] [2024-03-21 05:10:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 47652.4). Total num frames: 1080393728. Throughput: 0: 47128.8. Samples: 1081517700. Policy #0 lag: (min: 0.0, avg: 40.8, max: 72.0) [2024-03-21 05:10:15,522][03784] Avg episode reward: [(0, '0.728')] [2024-03-21 05:10:17,649][04017] Updated weights for policy 0, policy_version 32972 (0.0016) [2024-03-21 05:10:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.1, 300 sec: 47541.4). Total num frames: 1080557568. Throughput: 0: 46877.6. Samples: 1081808300. Policy #0 lag: (min: 0.0, avg: 40.8, max: 72.0) [2024-03-21 05:10:20,522][03784] Avg episode reward: [(0, '1.345')] [2024-03-21 05:10:25,521][03784] Fps is (10 sec: 32768.3, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 1080721408. Throughput: 0: 47435.6. Samples: 1082113100. Policy #0 lag: (min: 0.0, avg: 40.8, max: 72.0) [2024-03-21 05:10:25,522][03784] Avg episode reward: [(0, '1.345')] [2024-03-21 05:10:26,650][04017] Updated weights for policy 0, policy_version 32982 (0.0011) [2024-03-21 05:10:30,521][03784] Fps is (10 sec: 36045.3, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 1080918016. Throughput: 0: 47446.6. Samples: 1082257900. Policy #0 lag: (min: 0.0, avg: 32.1, max: 80.0) [2024-03-21 05:10:30,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 05:10:31,423][03995] Signal inference workers to stop experience collection... 
(21750 times) [2024-03-21 05:10:31,491][03995] Signal inference workers to resume experience collection... (21750 times) [2024-03-21 05:10:31,496][04017] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-03-21 05:10:31,538][04017] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-03-21 05:10:33,305][04017] Updated weights for policy 0, policy_version 32992 (0.0015) [2024-03-21 05:10:35,521][03784] Fps is (10 sec: 55705.2, 60 sec: 49698.1, 300 sec: 46319.5). Total num frames: 1081278464. Throughput: 0: 47464.4. Samples: 1082521500. Policy #0 lag: (min: 0.0, avg: 32.1, max: 80.0) [2024-03-21 05:10:35,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 05:10:36,816][04017] Updated weights for policy 0, policy_version 33002 (0.0028) [2024-03-21 05:10:40,521][03784] Fps is (10 sec: 65535.2, 60 sec: 50790.3, 300 sec: 46652.7). Total num frames: 1081573376. Throughput: 0: 46013.2. Samples: 1082745200. Policy #0 lag: (min: 0.0, avg: 32.1, max: 80.0) [2024-03-21 05:10:40,522][03784] Avg episode reward: [(0, '1.062')] [2024-03-21 05:10:43,368][04017] Updated weights for policy 0, policy_version 33012 (0.0013) [2024-03-21 05:10:45,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 1081802752. Throughput: 0: 46635.6. Samples: 1082899500. Policy #0 lag: (min: 0.0, avg: 32.1, max: 80.0) [2024-03-21 05:10:45,522][03784] Avg episode reward: [(0, '1.127')] [2024-03-21 05:10:50,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48059.7, 300 sec: 47097.0). Total num frames: 1081999360. Throughput: 0: 47151.1. Samples: 1083203900. Policy #0 lag: (min: 0.0, avg: 38.9, max: 86.0) [2024-03-21 05:10:50,530][03784] Avg episode reward: [(0, '1.461')] [2024-03-21 05:10:55,055][04017] Updated weights for policy 0, policy_version 33022 (0.0010) [2024-03-21 05:10:55,521][03784] Fps is (10 sec: 29491.2, 60 sec: 46421.3, 300 sec: 46541.7). Total num frames: 1082097664. Throughput: 0: 47620.1. 
Samples: 1083518600. Policy #0 lag: (min: 0.0, avg: 38.9, max: 86.0) [2024-03-21 05:10:55,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 05:11:00,311][04017] Updated weights for policy 0, policy_version 33032 (0.0011) [2024-03-21 05:11:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.7, 300 sec: 47097.0). Total num frames: 1082392576. Throughput: 0: 47648.9. Samples: 1083661900. Policy #0 lag: (min: 0.0, avg: 38.9, max: 86.0) [2024-03-21 05:11:00,522][03784] Avg episode reward: [(0, '1.528')] [2024-03-21 05:11:05,251][04017] Updated weights for policy 0, policy_version 33042 (0.0011) [2024-03-21 05:11:05,521][03784] Fps is (10 sec: 62259.4, 60 sec: 45329.1, 300 sec: 47541.4). Total num frames: 1082720256. Throughput: 0: 47560.2. Samples: 1083948500. Policy #0 lag: (min: 0.0, avg: 38.9, max: 86.0) [2024-03-21 05:11:05,522][03784] Avg episode reward: [(0, '1.528')] [2024-03-21 05:11:10,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44236.9, 300 sec: 47652.4). Total num frames: 1082851328. Throughput: 0: 47113.2. Samples: 1084233200. Policy #0 lag: (min: 0.0, avg: 38.9, max: 86.0) [2024-03-21 05:11:10,522][03784] Avg episode reward: [(0, '1.528')] [2024-03-21 05:11:15,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.7, 300 sec: 46986.0). Total num frames: 1083015168. Throughput: 0: 46877.8. Samples: 1084367400. Policy #0 lag: (min: 0.0, avg: 41.6, max: 84.0) [2024-03-21 05:11:15,522][03784] Avg episode reward: [(0, '1.785')] [2024-03-21 05:11:15,522][03995] Saving new best policy, reward=1.785! [2024-03-21 05:11:16,444][04017] Updated weights for policy 0, policy_version 33052 (0.0011) [2024-03-21 05:11:19,961][04017] Updated weights for policy 0, policy_version 33062 (0.0011) [2024-03-21 05:11:20,521][03784] Fps is (10 sec: 55706.2, 60 sec: 47513.7, 300 sec: 47652.4). Total num frames: 1083408384. Throughput: 0: 46960.1. Samples: 1084634700. 
Policy #0 lag: (min: 0.0, avg: 41.6, max: 84.0) [2024-03-21 05:11:20,522][03784] Avg episode reward: [(0, '0.717')] [2024-03-21 05:11:21,473][03995] Signal inference workers to stop experience collection... (21800 times) [2024-03-21 05:11:21,475][03995] Signal inference workers to resume experience collection... (21800 times) [2024-03-21 05:11:21,536][04017] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-03-21 05:11:21,537][04017] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-03-21 05:11:22,931][04017] Updated weights for policy 0, policy_version 33072 (0.0014) [2024-03-21 05:11:25,521][03784] Fps is (10 sec: 72088.7, 60 sec: 50244.1, 300 sec: 47541.3). Total num frames: 1083736064. Throughput: 0: 48197.8. Samples: 1084914100. Policy #0 lag: (min: 0.0, avg: 41.6, max: 84.0) [2024-03-21 05:11:25,523][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 05:11:30,276][04017] Updated weights for policy 0, policy_version 33082 (0.0011) [2024-03-21 05:11:30,521][03784] Fps is (10 sec: 62258.9, 60 sec: 51882.7, 300 sec: 47652.4). Total num frames: 1084030976. Throughput: 0: 48280.0. Samples: 1085072100. Policy #0 lag: (min: 0.0, avg: 41.6, max: 84.0) [2024-03-21 05:11:30,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 05:11:35,521][03784] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 1084260352. Throughput: 0: 47464.4. Samples: 1085339800. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:11:35,522][03784] Avg episode reward: [(0, '1.072')] [2024-03-21 05:11:40,198][04017] Updated weights for policy 0, policy_version 33092 (0.0023) [2024-03-21 05:11:40,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.4, 300 sec: 46541.7). Total num frames: 1084358656. Throughput: 0: 47037.8. Samples: 1085635300. 
Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:11:40,522][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 05:11:45,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45875.2, 300 sec: 46541.7). Total num frames: 1084555264. Throughput: 0: 47140.1. Samples: 1085783200. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:11:45,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 05:11:49,032][04017] Updated weights for policy 0, policy_version 33102 (0.0023) [2024-03-21 05:11:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 1084751872. Throughput: 0: 46899.9. Samples: 1086059000. Policy #0 lag: (min: 0.0, avg: 43.3, max: 89.0) [2024-03-21 05:11:50,522][03784] Avg episode reward: [(0, '1.618')] [2024-03-21 05:11:55,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 1084915712. Throughput: 0: 47075.7. Samples: 1086351600. Policy #0 lag: (min: 1.0, avg: 23.2, max: 57.0) [2024-03-21 05:11:55,522][03784] Avg episode reward: [(0, '1.383')] [2024-03-21 05:11:56,728][04017] Updated weights for policy 0, policy_version 33112 (0.0014) [2024-03-21 05:12:00,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 1085210624. Throughput: 0: 47420.0. Samples: 1086501300. Policy #0 lag: (min: 1.0, avg: 23.2, max: 57.0) [2024-03-21 05:12:00,522][03784] Avg episode reward: [(0, '1.288')] [2024-03-21 05:12:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033119_1085243392.pth... [2024-03-21 05:12:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032775_1073971200.pth [2024-03-21 05:12:02,881][04017] Updated weights for policy 0, policy_version 33122 (0.0011) [2024-03-21 05:12:05,521][03784] Fps is (10 sec: 52428.3, 60 sec: 45329.0, 300 sec: 47652.4). Total num frames: 1085440000. Throughput: 0: 47744.3. Samples: 1086783200. 
Policy #0 lag: (min: 1.0, avg: 23.2, max: 57.0) [2024-03-21 05:12:05,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 05:12:08,071][04017] Updated weights for policy 0, policy_version 33132 (0.0009) [2024-03-21 05:12:10,521][03784] Fps is (10 sec: 52428.6, 60 sec: 48059.8, 300 sec: 47541.4). Total num frames: 1085734912. Throughput: 0: 47897.9. Samples: 1087069500. Policy #0 lag: (min: 1.0, avg: 23.2, max: 57.0) [2024-03-21 05:12:10,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 05:12:14,794][03995] Signal inference workers to stop experience collection... (21850 times) [2024-03-21 05:12:14,905][04017] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-03-21 05:12:15,024][03995] Signal inference workers to resume experience collection... (21850 times) [2024-03-21 05:12:15,024][04017] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-03-21 05:12:15,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49152.0, 300 sec: 46652.8). Total num frames: 1085964288. Throughput: 0: 48086.6. Samples: 1087236000. Policy #0 lag: (min: 1.0, avg: 23.2, max: 57.0) [2024-03-21 05:12:15,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 05:12:15,653][04017] Updated weights for policy 0, policy_version 33142 (0.0010) [2024-03-21 05:12:20,521][03784] Fps is (10 sec: 52429.0, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 1086259200. Throughput: 0: 48446.7. Samples: 1087519900. Policy #0 lag: (min: 0.0, avg: 32.3, max: 84.0) [2024-03-21 05:12:20,522][03784] Avg episode reward: [(0, '1.408')] [2024-03-21 05:12:21,771][04017] Updated weights for policy 0, policy_version 33152 (0.0011) [2024-03-21 05:12:24,832][04017] Updated weights for policy 0, policy_version 33162 (0.0011) [2024-03-21 05:12:25,521][03784] Fps is (10 sec: 75367.1, 60 sec: 49698.3, 300 sec: 47763.5). Total num frames: 1086717952. Throughput: 0: 46851.2. Samples: 1087743600. 
Policy #0 lag: (min: 0.0, avg: 32.3, max: 84.0) [2024-03-21 05:12:25,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 05:12:30,521][03784] Fps is (10 sec: 65535.2, 60 sec: 48059.6, 300 sec: 47874.6). Total num frames: 1086914560. Throughput: 0: 46971.0. Samples: 1087896900. Policy #0 lag: (min: 0.0, avg: 32.3, max: 84.0) [2024-03-21 05:12:30,522][03784] Avg episode reward: [(0, '0.890')] [2024-03-21 05:12:34,523][04017] Updated weights for policy 0, policy_version 33172 (0.0015) [2024-03-21 05:12:35,521][03784] Fps is (10 sec: 32767.6, 60 sec: 46421.3, 300 sec: 47985.7). Total num frames: 1087045632. Throughput: 0: 47264.4. Samples: 1088185900. Policy #0 lag: (min: 0.0, avg: 32.3, max: 84.0) [2024-03-21 05:12:35,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 05:12:40,521][03784] Fps is (10 sec: 29491.4, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1087209472. Throughput: 0: 46999.9. Samples: 1088466600. Policy #0 lag: (min: 0.0, avg: 46.2, max: 112.0) [2024-03-21 05:12:40,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 05:12:44,053][04017] Updated weights for policy 0, policy_version 33182 (0.0017) [2024-03-21 05:12:45,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 1087373312. Throughput: 0: 46926.6. Samples: 1088613000. Policy #0 lag: (min: 0.0, avg: 46.2, max: 112.0) [2024-03-21 05:12:45,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 05:12:50,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46967.5, 300 sec: 47208.2). Total num frames: 1087569920. Throughput: 0: 47126.8. Samples: 1088903900. Policy #0 lag: (min: 0.0, avg: 46.2, max: 112.0) [2024-03-21 05:12:50,522][03784] Avg episode reward: [(0, '0.893')] [2024-03-21 05:12:54,550][04017] Updated weights for policy 0, policy_version 33192 (0.0010) [2024-03-21 05:12:55,521][03784] Fps is (10 sec: 32768.3, 60 sec: 46421.3, 300 sec: 46652.8). Total num frames: 1087700992. Throughput: 0: 46842.3. Samples: 1089177400. 
Policy #0 lag: (min: 0.0, avg: 46.2, max: 112.0) [2024-03-21 05:12:55,522][03784] Avg episode reward: [(0, '1.242')] [2024-03-21 05:13:00,230][04017] Updated weights for policy 0, policy_version 33202 (0.0010) [2024-03-21 05:13:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 1087963136. Throughput: 0: 46346.6. Samples: 1089321600. Policy #0 lag: (min: 0.0, avg: 28.6, max: 66.0) [2024-03-21 05:13:00,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 05:13:05,313][04017] Updated weights for policy 0, policy_version 33212 (0.0011) [2024-03-21 05:13:05,521][03784] Fps is (10 sec: 58982.0, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 1088290816. Throughput: 0: 46173.3. Samples: 1089597700. Policy #0 lag: (min: 0.0, avg: 28.6, max: 66.0) [2024-03-21 05:13:05,522][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 05:13:08,713][03995] Signal inference workers to stop experience collection... (21900 times) [2024-03-21 05:13:08,719][03995] Signal inference workers to resume experience collection... (21900 times) [2024-03-21 05:13:08,790][04017] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-03-21 05:13:08,790][04017] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-03-21 05:13:10,521][03784] Fps is (10 sec: 58982.2, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 1088552960. Throughput: 0: 47744.3. Samples: 1089892100. Policy #0 lag: (min: 0.0, avg: 28.6, max: 66.0) [2024-03-21 05:13:10,522][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 05:13:11,229][04017] Updated weights for policy 0, policy_version 33222 (0.0021) [2024-03-21 05:13:15,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 1088847872. Throughput: 0: 47473.5. Samples: 1090033200. 
Policy #0 lag: (min: 0.0, avg: 28.6, max: 66.0) [2024-03-21 05:13:15,522][03784] Avg episode reward: [(0, '1.520')] [2024-03-21 05:13:17,009][04017] Updated weights for policy 0, policy_version 33232 (0.0010) [2024-03-21 05:13:20,521][03784] Fps is (10 sec: 55706.6, 60 sec: 47513.7, 300 sec: 47430.3). Total num frames: 1089110016. Throughput: 0: 47502.4. Samples: 1090323500. Policy #0 lag: (min: 0.0, avg: 28.6, max: 66.0) [2024-03-21 05:13:20,522][03784] Avg episode reward: [(0, '1.520')] [2024-03-21 05:13:24,389][04017] Updated weights for policy 0, policy_version 33242 (0.0010) [2024-03-21 05:13:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 47541.4). Total num frames: 1089306624. Throughput: 0: 47922.3. Samples: 1090623100. Policy #0 lag: (min: 0.0, avg: 44.6, max: 92.0) [2024-03-21 05:13:25,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 05:13:30,034][04017] Updated weights for policy 0, policy_version 33252 (0.0022) [2024-03-21 05:13:30,521][03784] Fps is (10 sec: 52427.9, 60 sec: 45329.1, 300 sec: 47985.7). Total num frames: 1089634304. Throughput: 0: 48048.8. Samples: 1090775200. Policy #0 lag: (min: 0.0, avg: 44.6, max: 92.0) [2024-03-21 05:13:30,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 05:13:35,521][03784] Fps is (10 sec: 52428.5, 60 sec: 46421.4, 300 sec: 47874.6). Total num frames: 1089830912. Throughput: 0: 47662.1. Samples: 1091048700. Policy #0 lag: (min: 0.0, avg: 44.6, max: 92.0) [2024-03-21 05:13:35,522][03784] Avg episode reward: [(0, '1.527')] [2024-03-21 05:13:37,093][04017] Updated weights for policy 0, policy_version 33262 (0.0015) [2024-03-21 05:13:40,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.5, 300 sec: 47652.4). Total num frames: 1090060288. Throughput: 0: 47895.4. Samples: 1091332700. 
Policy #0 lag: (min: 0.0, avg: 44.6, max: 92.0) [2024-03-21 05:13:40,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 05:13:45,019][04017] Updated weights for policy 0, policy_version 33272 (0.0015) [2024-03-21 05:13:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 1090289664. Throughput: 0: 47431.1. Samples: 1091456000. Policy #0 lag: (min: 0.0, avg: 53.3, max: 114.0) [2024-03-21 05:13:45,522][03784] Avg episode reward: [(0, '1.535')] [2024-03-21 05:13:48,848][04017] Updated weights for policy 0, policy_version 33282 (0.0012) [2024-03-21 05:13:50,521][03784] Fps is (10 sec: 52429.6, 60 sec: 50244.3, 300 sec: 47430.3). Total num frames: 1090584576. Throughput: 0: 47500.1. Samples: 1091735200. Policy #0 lag: (min: 0.0, avg: 53.3, max: 114.0) [2024-03-21 05:13:50,522][03784] Avg episode reward: [(0, '0.771')] [2024-03-21 05:13:55,521][03784] Fps is (10 sec: 39321.9, 60 sec: 49698.1, 300 sec: 46652.8). Total num frames: 1090682880. Throughput: 0: 47924.5. Samples: 1092048700. Policy #0 lag: (min: 0.0, avg: 53.3, max: 114.0) [2024-03-21 05:13:55,522][03784] Avg episode reward: [(0, '0.676')] [2024-03-21 05:13:58,722][04017] Updated weights for policy 0, policy_version 33292 (0.0017) [2024-03-21 05:14:00,521][03784] Fps is (10 sec: 32767.9, 60 sec: 49152.0, 300 sec: 46541.7). Total num frames: 1090912256. Throughput: 0: 48097.7. Samples: 1092197600. Policy #0 lag: (min: 0.0, avg: 53.3, max: 114.0) [2024-03-21 05:14:00,522][03784] Avg episode reward: [(0, '0.626')] [2024-03-21 05:14:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033292_1090912256.pth... [2024-03-21 05:14:00,701][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000032951_1079738368.pth [2024-03-21 05:14:01,254][03995] Signal inference workers to stop experience collection... 
(21950 times) [2024-03-21 05:14:01,321][04017] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-03-21 05:14:01,516][03995] Signal inference workers to resume experience collection... (21950 times) [2024-03-21 05:14:01,517][04017] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-03-21 05:14:05,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 1091207168. Throughput: 0: 47759.9. Samples: 1092472700. Policy #0 lag: (min: 0.0, avg: 53.3, max: 114.0) [2024-03-21 05:14:05,522][03784] Avg episode reward: [(0, '1.574')] [2024-03-21 05:14:06,437][04017] Updated weights for policy 0, policy_version 33302 (0.0016) [2024-03-21 05:14:10,521][03784] Fps is (10 sec: 58982.6, 60 sec: 49152.1, 300 sec: 46986.0). Total num frames: 1091502080. Throughput: 0: 47017.8. Samples: 1092738900. Policy #0 lag: (min: 0.0, avg: 43.1, max: 95.0) [2024-03-21 05:14:10,522][03784] Avg episode reward: [(0, '1.645')] [2024-03-21 05:14:11,726][04017] Updated weights for policy 0, policy_version 33312 (0.0012) [2024-03-21 05:14:15,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.4, 300 sec: 46986.0). Total num frames: 1091665920. Throughput: 0: 47166.7. Samples: 1092897700. Policy #0 lag: (min: 0.0, avg: 43.1, max: 95.0) [2024-03-21 05:14:15,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 05:14:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45875.1, 300 sec: 47208.1). Total num frames: 1091862528. Throughput: 0: 47780.0. Samples: 1093198800. Policy #0 lag: (min: 0.0, avg: 43.1, max: 95.0) [2024-03-21 05:14:20,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 05:14:20,913][04017] Updated weights for policy 0, policy_version 33322 (0.0010) [2024-03-21 05:14:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.4, 300 sec: 47652.4). Total num frames: 1092124672. Throughput: 0: 47917.9. Samples: 1093489000. 
Policy #0 lag: (min: 0.0, avg: 43.1, max: 95.0) [2024-03-21 05:14:25,522][03784] Avg episode reward: [(0, '1.721')] [2024-03-21 05:14:26,605][04017] Updated weights for policy 0, policy_version 33332 (0.0010) [2024-03-21 05:14:30,521][03784] Fps is (10 sec: 58982.2, 60 sec: 46967.5, 300 sec: 47985.7). Total num frames: 1092452352. Throughput: 0: 48062.2. Samples: 1093618800. Policy #0 lag: (min: 1.0, avg: 33.6, max: 115.0) [2024-03-21 05:14:30,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 05:14:34,462][04017] Updated weights for policy 0, policy_version 33342 (0.0011) [2024-03-21 05:14:35,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 47541.4). Total num frames: 1092550656. Throughput: 0: 48322.1. Samples: 1093909700. Policy #0 lag: (min: 1.0, avg: 33.6, max: 115.0) [2024-03-21 05:14:35,522][03784] Avg episode reward: [(0, '1.406')] [2024-03-21 05:14:40,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 1092714496. Throughput: 0: 48039.9. Samples: 1094210500. Policy #0 lag: (min: 1.0, avg: 33.6, max: 115.0) [2024-03-21 05:14:40,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 05:14:42,078][04017] Updated weights for policy 0, policy_version 33352 (0.0017) [2024-03-21 05:14:45,348][04017] Updated weights for policy 0, policy_version 33362 (0.0018) [2024-03-21 05:14:45,521][03784] Fps is (10 sec: 65535.9, 60 sec: 48605.8, 300 sec: 47763.5). Total num frames: 1093206016. Throughput: 0: 47171.0. Samples: 1094320300. Policy #0 lag: (min: 1.0, avg: 33.6, max: 115.0) [2024-03-21 05:14:45,522][03784] Avg episode reward: [(0, '1.107')] [2024-03-21 05:14:46,153][03995] Signal inference workers to stop experience collection... (22000 times) [2024-03-21 05:14:46,205][04017] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-03-21 05:14:46,223][03995] Signal inference workers to resume experience collection... 
(22000 times) [2024-03-21 05:14:46,259][04017] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-03-21 05:14:49,136][04017] Updated weights for policy 0, policy_version 33372 (0.0016) [2024-03-21 05:14:50,521][03784] Fps is (10 sec: 91749.5, 60 sec: 50790.2, 300 sec: 48541.0). Total num frames: 1093632000. Throughput: 0: 47170.9. Samples: 1094595400. Policy #0 lag: (min: 2.0, avg: 56.6, max: 114.0) [2024-03-21 05:14:50,522][03784] Avg episode reward: [(0, '1.257')] [2024-03-21 05:14:55,521][03784] Fps is (10 sec: 45876.1, 60 sec: 49698.2, 300 sec: 47208.1). Total num frames: 1093664768. Throughput: 0: 47742.3. Samples: 1094887300. Policy #0 lag: (min: 2.0, avg: 56.6, max: 114.0) [2024-03-21 05:14:55,522][03784] Avg episode reward: [(0, '1.447')] [2024-03-21 05:14:59,822][04017] Updated weights for policy 0, policy_version 33382 (0.0010) [2024-03-21 05:15:00,521][03784] Fps is (10 sec: 22938.0, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 1093861376. Throughput: 0: 47477.8. Samples: 1095034200. Policy #0 lag: (min: 2.0, avg: 56.6, max: 114.0) [2024-03-21 05:15:00,522][03784] Avg episode reward: [(0, '0.609')] [2024-03-21 05:15:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 47208.2). Total num frames: 1094123520. Throughput: 0: 46868.9. Samples: 1095307900. Policy #0 lag: (min: 2.0, avg: 56.6, max: 114.0) [2024-03-21 05:15:05,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 05:15:06,472][04017] Updated weights for policy 0, policy_version 33392 (0.0020) [2024-03-21 05:15:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 1094221824. Throughput: 0: 46442.2. Samples: 1095578900. Policy #0 lag: (min: 2.0, avg: 56.6, max: 114.0) [2024-03-21 05:15:10,522][03784] Avg episode reward: [(0, '0.902')] [2024-03-21 05:15:15,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 1094385664. Throughput: 0: 46137.8. Samples: 1095695000. 
Policy #0 lag: (min: 1.0, avg: 42.4, max: 88.0) [2024-03-21 05:15:15,522][03784] Avg episode reward: [(0, '1.015')] [2024-03-21 05:15:17,447][04017] Updated weights for policy 0, policy_version 33402 (0.0015) [2024-03-21 05:15:20,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 1094647808. Throughput: 0: 46209.0. Samples: 1095989100. Policy #0 lag: (min: 1.0, avg: 42.4, max: 88.0) [2024-03-21 05:15:20,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 05:15:23,217][04017] Updated weights for policy 0, policy_version 33412 (0.0015) [2024-03-21 05:15:25,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.4, 300 sec: 47430.3). Total num frames: 1094909952. Throughput: 0: 46077.9. Samples: 1096284000. Policy #0 lag: (min: 1.0, avg: 42.4, max: 88.0) [2024-03-21 05:15:25,522][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 05:15:29,433][04017] Updated weights for policy 0, policy_version 33422 (0.0022) [2024-03-21 05:15:30,521][03784] Fps is (10 sec: 62259.1, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1095270400. Throughput: 0: 47057.9. Samples: 1096437900. Policy #0 lag: (min: 1.0, avg: 42.4, max: 88.0) [2024-03-21 05:15:30,522][03784] Avg episode reward: [(0, '0.818')] [2024-03-21 05:15:35,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48606.0, 300 sec: 47097.1). Total num frames: 1095467008. Throughput: 0: 46966.8. Samples: 1096708900. Policy #0 lag: (min: 1.0, avg: 42.4, max: 88.0) [2024-03-21 05:15:35,522][03784] Avg episode reward: [(0, '0.642')] [2024-03-21 05:15:35,888][04017] Updated weights for policy 0, policy_version 33432 (0.0025) [2024-03-21 05:15:38,692][03995] Signal inference workers to stop experience collection... (22050 times) [2024-03-21 05:15:38,750][03995] Signal inference workers to resume experience collection... 
(22050 times) [2024-03-21 05:15:38,777][04017] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-03-21 05:15:38,839][04017] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-03-21 05:15:40,521][03784] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 47319.2). Total num frames: 1095761920. Throughput: 0: 46677.7. Samples: 1096987800. Policy #0 lag: (min: 0.0, avg: 40.7, max: 106.0) [2024-03-21 05:15:40,522][03784] Avg episode reward: [(0, '0.785')] [2024-03-21 05:15:45,269][04017] Updated weights for policy 0, policy_version 33442 (0.0015) [2024-03-21 05:15:45,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.8, 300 sec: 46874.9). Total num frames: 1095827456. Throughput: 0: 46811.1. Samples: 1097140700. Policy #0 lag: (min: 0.0, avg: 40.7, max: 106.0) [2024-03-21 05:15:45,522][03784] Avg episode reward: [(0, '0.527')] [2024-03-21 05:15:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 41506.3, 300 sec: 47541.4). Total num frames: 1096122368. Throughput: 0: 46426.7. Samples: 1097397100. Policy #0 lag: (min: 0.0, avg: 40.7, max: 106.0) [2024-03-21 05:15:50,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 05:15:50,570][04017] Updated weights for policy 0, policy_version 33452 (0.0016) [2024-03-21 05:15:55,521][03784] Fps is (10 sec: 58982.1, 60 sec: 45875.1, 300 sec: 47541.4). Total num frames: 1096417280. Throughput: 0: 46153.3. Samples: 1097655800. Policy #0 lag: (min: 0.0, avg: 40.7, max: 106.0) [2024-03-21 05:15:55,522][03784] Avg episode reward: [(0, '0.625')] [2024-03-21 05:15:56,991][04017] Updated weights for policy 0, policy_version 33462 (0.0013) [2024-03-21 05:16:00,521][03784] Fps is (10 sec: 55705.0, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 1096679424. Throughput: 0: 46788.8. Samples: 1097800500. 
Policy #0 lag: (min: 1.0, avg: 40.7, max: 72.0) [2024-03-21 05:16:00,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 05:16:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033468_1096679424.pth... [2024-03-21 05:16:00,652][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033119_1085243392.pth [2024-03-21 05:16:04,443][04017] Updated weights for policy 0, policy_version 33472 (0.0015) [2024-03-21 05:16:05,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 47430.3). Total num frames: 1096843264. Throughput: 0: 46580.0. Samples: 1098085200. Policy #0 lag: (min: 1.0, avg: 40.7, max: 72.0) [2024-03-21 05:16:05,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 05:16:10,521][03784] Fps is (10 sec: 42598.8, 60 sec: 48059.8, 300 sec: 47763.5). Total num frames: 1097105408. Throughput: 0: 46197.8. Samples: 1098362900. Policy #0 lag: (min: 1.0, avg: 40.7, max: 72.0) [2024-03-21 05:16:10,522][03784] Avg episode reward: [(0, '0.795')] [2024-03-21 05:16:10,591][04017] Updated weights for policy 0, policy_version 33482 (0.0011) [2024-03-21 05:16:15,521][03784] Fps is (10 sec: 55705.5, 60 sec: 50244.3, 300 sec: 47430.3). Total num frames: 1097400320. Throughput: 0: 45724.5. Samples: 1098495500. Policy #0 lag: (min: 1.0, avg: 40.7, max: 72.0) [2024-03-21 05:16:15,522][03784] Avg episode reward: [(0, '0.657')] [2024-03-21 05:16:18,197][04017] Updated weights for policy 0, policy_version 33492 (0.0016) [2024-03-21 05:16:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 1097564160. Throughput: 0: 46824.4. Samples: 1098816000. Policy #0 lag: (min: 0.0, avg: 34.9, max: 76.0) [2024-03-21 05:16:20,522][03784] Avg episode reward: [(0, '0.657')] [2024-03-21 05:16:24,369][04017] Updated weights for policy 0, policy_version 33502 (0.0025) [2024-03-21 05:16:25,521][03784] Fps is (10 sec: 42598.8, 60 sec: 48605.9, 300 sec: 46763.8). 
Total num frames: 1097826304. Throughput: 0: 47013.4. Samples: 1099103400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 76.0) [2024-03-21 05:16:25,522][03784] Avg episode reward: [(0, '0.657')] [2024-03-21 05:16:29,070][03995] Signal inference workers to stop experience collection... (22100 times) [2024-03-21 05:16:29,070][03995] Signal inference workers to resume experience collection... (22100 times) [2024-03-21 05:16:29,114][04017] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-03-21 05:16:29,114][04017] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-03-21 05:16:30,023][04017] Updated weights for policy 0, policy_version 33512 (0.0019) [2024-03-21 05:16:30,521][03784] Fps is (10 sec: 55705.9, 60 sec: 47513.7, 300 sec: 46986.0). Total num frames: 1098121216. Throughput: 0: 46877.8. Samples: 1099250200. Policy #0 lag: (min: 0.0, avg: 34.9, max: 76.0) [2024-03-21 05:16:30,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 05:16:35,521][03784] Fps is (10 sec: 39321.1, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1098219520. Throughput: 0: 47473.3. Samples: 1099533400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 76.0) [2024-03-21 05:16:35,522][03784] Avg episode reward: [(0, '1.388')] [2024-03-21 05:16:40,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43690.7, 300 sec: 46874.9). Total num frames: 1098383360. Throughput: 0: 48206.7. Samples: 1099825100. Policy #0 lag: (min: 0.0, avg: 34.9, max: 76.0) [2024-03-21 05:16:40,522][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 05:16:43,384][04017] Updated weights for policy 0, policy_version 33522 (0.0011) [2024-03-21 05:16:45,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 1098579968. Throughput: 0: 48213.5. Samples: 1099970100. 
Policy #0 lag: (min: 0.0, avg: 21.6, max: 59.0) [2024-03-21 05:16:45,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 05:16:47,556][04017] Updated weights for policy 0, policy_version 33532 (0.0026) [2024-03-21 05:16:50,521][03784] Fps is (10 sec: 55705.2, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 1098940416. Throughput: 0: 47564.4. Samples: 1100225600. Policy #0 lag: (min: 0.0, avg: 21.6, max: 59.0) [2024-03-21 05:16:50,522][03784] Avg episode reward: [(0, '0.700')] [2024-03-21 05:16:53,647][04017] Updated weights for policy 0, policy_version 33542 (0.0018) [2024-03-21 05:16:55,521][03784] Fps is (10 sec: 68811.1, 60 sec: 47513.5, 300 sec: 47652.4). Total num frames: 1099268096. Throughput: 0: 47202.0. Samples: 1100487000. Policy #0 lag: (min: 0.0, avg: 21.6, max: 59.0) [2024-03-21 05:16:55,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 05:16:57,682][04017] Updated weights for policy 0, policy_version 33552 (0.0010) [2024-03-21 05:17:00,521][03784] Fps is (10 sec: 65536.8, 60 sec: 48606.0, 300 sec: 47985.7). Total num frames: 1099595776. Throughput: 0: 47386.7. Samples: 1100627900. Policy #0 lag: (min: 0.0, avg: 21.6, max: 59.0) [2024-03-21 05:17:00,522][03784] Avg episode reward: [(0, '1.296')] [2024-03-21 05:17:04,399][04017] Updated weights for policy 0, policy_version 33562 (0.0010) [2024-03-21 05:17:05,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48605.8, 300 sec: 47541.4). Total num frames: 1099759616. Throughput: 0: 47033.3. Samples: 1100932500. Policy #0 lag: (min: 0.0, avg: 21.6, max: 59.0) [2024-03-21 05:17:05,522][03784] Avg episode reward: [(0, '1.721')] [2024-03-21 05:17:09,611][04017] Updated weights for policy 0, policy_version 33572 (0.0025) [2024-03-21 05:17:10,521][03784] Fps is (10 sec: 55705.1, 60 sec: 50790.4, 300 sec: 48096.8). Total num frames: 1100152832. Throughput: 0: 47139.9. Samples: 1101224700. 
Policy #0 lag: (min: 0.0, avg: 35.6, max: 73.0) [2024-03-21 05:17:10,522][03784] Avg episode reward: [(0, '1.721')] [2024-03-21 05:17:15,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48605.8, 300 sec: 47652.4). Total num frames: 1100316672. Throughput: 0: 47140.0. Samples: 1101371500. Policy #0 lag: (min: 0.0, avg: 35.6, max: 73.0) [2024-03-21 05:17:15,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 05:17:17,796][04017] Updated weights for policy 0, policy_version 33582 (0.0010) [2024-03-21 05:17:20,521][03784] Fps is (10 sec: 32768.0, 60 sec: 48605.9, 300 sec: 46652.7). Total num frames: 1100480512. Throughput: 0: 47535.5. Samples: 1101672500. Policy #0 lag: (min: 0.0, avg: 35.6, max: 73.0) [2024-03-21 05:17:20,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 05:17:23,688][03995] Signal inference workers to stop experience collection... (22150 times) [2024-03-21 05:17:23,751][03995] Signal inference workers to resume experience collection... (22150 times) [2024-03-21 05:17:23,755][04017] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-03-21 05:17:23,805][04017] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-03-21 05:17:25,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1100578816. Throughput: 0: 47717.8. Samples: 1101972400. Policy #0 lag: (min: 0.0, avg: 35.6, max: 73.0) [2024-03-21 05:17:25,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 05:17:27,753][04017] Updated weights for policy 0, policy_version 33592 (0.0022) [2024-03-21 05:17:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45875.1, 300 sec: 46874.9). Total num frames: 1100873728. Throughput: 0: 47128.8. Samples: 1102090900. Policy #0 lag: (min: 3.0, avg: 35.9, max: 77.0) [2024-03-21 05:17:30,522][03784] Avg episode reward: [(0, '0.906')] [2024-03-21 05:17:35,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 1101037568. 
Throughput: 0: 47711.2. Samples: 1102372600. Policy #0 lag: (min: 3.0, avg: 35.9, max: 77.0) [2024-03-21 05:17:35,522][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 05:17:37,112][04017] Updated weights for policy 0, policy_version 33602 (0.0015) [2024-03-21 05:17:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 1101234176. Throughput: 0: 48409.1. Samples: 1102665400. Policy #0 lag: (min: 3.0, avg: 35.9, max: 77.0) [2024-03-21 05:17:40,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 05:17:42,425][04017] Updated weights for policy 0, policy_version 33612 (0.0011) [2024-03-21 05:17:45,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 1101496320. Throughput: 0: 48328.8. Samples: 1102802700. Policy #0 lag: (min: 3.0, avg: 35.9, max: 77.0) [2024-03-21 05:17:45,522][03784] Avg episode reward: [(0, '1.029')] [2024-03-21 05:17:49,767][04017] Updated weights for policy 0, policy_version 33622 (0.0014) [2024-03-21 05:17:50,521][03784] Fps is (10 sec: 55705.8, 60 sec: 47513.7, 300 sec: 47763.5). Total num frames: 1101791232. Throughput: 0: 48189.0. Samples: 1103101000. Policy #0 lag: (min: 2.0, avg: 43.3, max: 113.0) [2024-03-21 05:17:50,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 05:17:55,521][03784] Fps is (10 sec: 52429.1, 60 sec: 45875.4, 300 sec: 47652.5). Total num frames: 1102020608. Throughput: 0: 47566.8. Samples: 1103365200. Policy #0 lag: (min: 2.0, avg: 43.3, max: 113.0) [2024-03-21 05:17:55,522][03784] Avg episode reward: [(0, '0.535')] [2024-03-21 05:17:55,574][04017] Updated weights for policy 0, policy_version 33632 (0.0012) [2024-03-21 05:18:00,521][03784] Fps is (10 sec: 49151.4, 60 sec: 44782.8, 300 sec: 47430.3). Total num frames: 1102282752. Throughput: 0: 47533.3. Samples: 1103510500. 
Policy #0 lag: (min: 2.0, avg: 43.3, max: 113.0) [2024-03-21 05:18:00,523][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 05:18:00,536][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033640_1102315520.pth... [2024-03-21 05:18:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033292_1090912256.pth [2024-03-21 05:18:02,419][04017] Updated weights for policy 0, policy_version 33642 (0.0018) [2024-03-21 05:18:05,521][03784] Fps is (10 sec: 58981.7, 60 sec: 47513.6, 300 sec: 47652.4). Total num frames: 1102610432. Throughput: 0: 46926.6. Samples: 1103784200. Policy #0 lag: (min: 2.0, avg: 43.3, max: 113.0) [2024-03-21 05:18:05,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 05:18:06,725][04017] Updated weights for policy 0, policy_version 33652 (0.0013) [2024-03-21 05:18:10,521][03784] Fps is (10 sec: 55706.2, 60 sec: 44783.0, 300 sec: 47430.3). Total num frames: 1102839808. Throughput: 0: 46715.5. Samples: 1104074600. Policy #0 lag: (min: 2.0, avg: 43.3, max: 113.0) [2024-03-21 05:18:10,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 05:18:14,865][04017] Updated weights for policy 0, policy_version 33662 (0.0010) [2024-03-21 05:18:14,912][03995] Signal inference workers to stop experience collection... (22200 times) [2024-03-21 05:18:14,923][03995] Signal inference workers to resume experience collection... (22200 times) [2024-03-21 05:18:14,977][04017] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-03-21 05:18:14,978][04017] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-03-21 05:18:15,521][03784] Fps is (10 sec: 45875.8, 60 sec: 45875.3, 300 sec: 47319.2). Total num frames: 1103069184. Throughput: 0: 47691.2. Samples: 1104237000. 
Policy #0 lag: (min: 0.0, avg: 36.0, max: 114.0) [2024-03-21 05:18:15,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 05:18:18,823][04017] Updated weights for policy 0, policy_version 33672 (0.0011) [2024-03-21 05:18:20,521][03784] Fps is (10 sec: 62259.0, 60 sec: 49698.1, 300 sec: 47985.7). Total num frames: 1103462400. Throughput: 0: 46995.6. Samples: 1104487400. Policy #0 lag: (min: 0.0, avg: 36.0, max: 114.0) [2024-03-21 05:18:20,522][03784] Avg episode reward: [(0, '1.084')] [2024-03-21 05:18:25,521][03784] Fps is (10 sec: 58982.1, 60 sec: 51336.5, 300 sec: 47541.4). Total num frames: 1103659008. Throughput: 0: 46180.0. Samples: 1104743500. Policy #0 lag: (min: 0.0, avg: 36.0, max: 114.0) [2024-03-21 05:18:25,522][03784] Avg episode reward: [(0, '0.636')] [2024-03-21 05:18:27,731][04017] Updated weights for policy 0, policy_version 33682 (0.0013) [2024-03-21 05:18:30,521][03784] Fps is (10 sec: 32767.7, 60 sec: 48605.8, 300 sec: 47319.2). Total num frames: 1103790080. Throughput: 0: 46377.7. Samples: 1104889700. Policy #0 lag: (min: 0.0, avg: 36.0, max: 114.0) [2024-03-21 05:18:30,522][03784] Avg episode reward: [(0, '1.543')] [2024-03-21 05:18:35,521][03784] Fps is (10 sec: 26214.5, 60 sec: 48059.8, 300 sec: 46986.0). Total num frames: 1103921152. Throughput: 0: 46351.1. Samples: 1105186800. Policy #0 lag: (min: 1.0, avg: 46.0, max: 88.0) [2024-03-21 05:18:35,522][03784] Avg episode reward: [(0, '1.024')] [2024-03-21 05:18:38,510][04017] Updated weights for policy 0, policy_version 33692 (0.0010) [2024-03-21 05:18:40,521][03784] Fps is (10 sec: 39321.9, 60 sec: 49152.0, 300 sec: 47097.1). Total num frames: 1104183296. Throughput: 0: 46531.0. Samples: 1105459100. Policy #0 lag: (min: 1.0, avg: 46.0, max: 88.0) [2024-03-21 05:18:40,522][03784] Avg episode reward: [(0, '0.864')] [2024-03-21 05:18:45,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 1104281600. Throughput: 0: 46542.4. Samples: 1105604900. 
Policy #0 lag: (min: 1.0, avg: 46.0, max: 88.0) [2024-03-21 05:18:45,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 05:18:46,851][04017] Updated weights for policy 0, policy_version 33702 (0.0015) [2024-03-21 05:18:50,421][04017] Updated weights for policy 0, policy_version 33712 (0.0014) [2024-03-21 05:18:50,521][03784] Fps is (10 sec: 49152.3, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 1104674816. Throughput: 0: 46706.8. Samples: 1105886000. Policy #0 lag: (min: 1.0, avg: 46.0, max: 88.0) [2024-03-21 05:18:50,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 05:18:55,521][03784] Fps is (10 sec: 58982.5, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 1104871424. Throughput: 0: 46871.1. Samples: 1106183800. Policy #0 lag: (min: 1.0, avg: 46.0, max: 88.0) [2024-03-21 05:18:55,522][03784] Avg episode reward: [(0, '1.415')] [2024-03-21 05:18:56,937][04017] Updated weights for policy 0, policy_version 33722 (0.0019) [2024-03-21 05:19:00,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48606.0, 300 sec: 47430.3). Total num frames: 1105199104. Throughput: 0: 46084.4. Samples: 1106310800. Policy #0 lag: (min: 3.0, avg: 40.5, max: 93.0) [2024-03-21 05:19:00,522][03784] Avg episode reward: [(0, '0.802')] [2024-03-21 05:19:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 46763.8). Total num frames: 1105297408. Throughput: 0: 46933.4. Samples: 1106599400. Policy #0 lag: (min: 3.0, avg: 40.5, max: 93.0) [2024-03-21 05:19:05,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 05:19:06,035][04017] Updated weights for policy 0, policy_version 33732 (0.0016) [2024-03-21 05:19:06,805][03995] Signal inference workers to stop experience collection... (22250 times) [2024-03-21 05:19:06,805][03995] Signal inference workers to resume experience collection... 
(22250 times) [2024-03-21 05:19:06,849][04017] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-03-21 05:19:06,850][04017] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-03-21 05:19:10,496][04017] Updated weights for policy 0, policy_version 33742 (0.0016) [2024-03-21 05:19:10,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1105657856. Throughput: 0: 47346.7. Samples: 1106874100. Policy #0 lag: (min: 3.0, avg: 40.5, max: 93.0) [2024-03-21 05:19:10,522][03784] Avg episode reward: [(0, '1.408')] [2024-03-21 05:19:15,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 46874.9). Total num frames: 1105690624. Throughput: 0: 47384.6. Samples: 1107022000. Policy #0 lag: (min: 3.0, avg: 40.5, max: 93.0) [2024-03-21 05:19:15,522][03784] Avg episode reward: [(0, '0.546')] [2024-03-21 05:19:20,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42052.3, 300 sec: 46986.0). Total num frames: 1105985536. Throughput: 0: 47020.0. Samples: 1107302700. Policy #0 lag: (min: 2.0, avg: 31.8, max: 67.0) [2024-03-21 05:19:20,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 05:19:20,536][04017] Updated weights for policy 0, policy_version 33752 (0.0011) [2024-03-21 05:19:25,521][03784] Fps is (10 sec: 58982.1, 60 sec: 43690.6, 300 sec: 46874.9). Total num frames: 1106280448. Throughput: 0: 47535.5. Samples: 1107598200. Policy #0 lag: (min: 2.0, avg: 31.8, max: 67.0) [2024-03-21 05:19:25,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 05:19:25,797][04017] Updated weights for policy 0, policy_version 33762 (0.0013) [2024-03-21 05:19:30,050][04017] Updated weights for policy 0, policy_version 33772 (0.0010) [2024-03-21 05:19:30,521][03784] Fps is (10 sec: 68812.9, 60 sec: 48059.9, 300 sec: 47874.6). Total num frames: 1106673664. Throughput: 0: 47495.6. Samples: 1107742200. 
Policy #0 lag: (min: 2.0, avg: 31.8, max: 67.0) [2024-03-21 05:19:30,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 05:19:35,431][04017] Updated weights for policy 0, policy_version 33782 (0.0017) [2024-03-21 05:19:35,521][03784] Fps is (10 sec: 68811.9, 60 sec: 50790.2, 300 sec: 48318.9). Total num frames: 1106968576. Throughput: 0: 47975.3. Samples: 1108044900. Policy #0 lag: (min: 2.0, avg: 31.8, max: 67.0) [2024-03-21 05:19:35,523][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 05:19:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48059.8, 300 sec: 46986.0). Total num frames: 1107066880. Throughput: 0: 48031.1. Samples: 1108345200. Policy #0 lag: (min: 2.0, avg: 31.8, max: 67.0) [2024-03-21 05:19:40,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 05:19:45,399][04017] Updated weights for policy 0, policy_version 33792 (0.0021) [2024-03-21 05:19:45,521][03784] Fps is (10 sec: 32768.3, 60 sec: 50244.2, 300 sec: 46319.5). Total num frames: 1107296256. Throughput: 0: 47991.0. Samples: 1108470400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 05:19:45,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 05:19:49,800][04017] Updated weights for policy 0, policy_version 33802 (0.0015) [2024-03-21 05:19:50,521][03784] Fps is (10 sec: 58982.5, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 1107656704. Throughput: 0: 47528.9. Samples: 1108738200. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 05:19:50,522][03784] Avg episode reward: [(0, '1.435')] [2024-03-21 05:19:55,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 1107787776. Throughput: 0: 47884.5. Samples: 1109028900. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 05:19:55,522][03784] Avg episode reward: [(0, '1.048')] [2024-03-21 05:19:56,238][03995] Signal inference workers to stop experience collection... 
(22300 times) [2024-03-21 05:19:56,298][04017] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-03-21 05:19:56,572][03995] Signal inference workers to resume experience collection... (22300 times) [2024-03-21 05:19:56,572][04017] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-03-21 05:19:59,824][04017] Updated weights for policy 0, policy_version 33812 (0.0015) [2024-03-21 05:20:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46967.4, 300 sec: 47097.0). Total num frames: 1108017152. Throughput: 0: 47819.9. Samples: 1109173900. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 05:20:00,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 05:20:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033814_1108017152.pth... [2024-03-21 05:20:00,647][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033468_1096679424.pth [2024-03-21 05:20:05,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 1108115456. Throughput: 0: 48077.8. Samples: 1109466200. Policy #0 lag: (min: 0.0, avg: 30.8, max: 72.0) [2024-03-21 05:20:05,522][03784] Avg episode reward: [(0, '0.826')] [2024-03-21 05:20:09,640][04017] Updated weights for policy 0, policy_version 33822 (0.0015) [2024-03-21 05:20:10,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44782.9, 300 sec: 47319.2). Total num frames: 1108344832. Throughput: 0: 47753.3. Samples: 1109747100. Policy #0 lag: (min: 0.0, avg: 30.8, max: 72.0) [2024-03-21 05:20:10,522][03784] Avg episode reward: [(0, '0.458')] [2024-03-21 05:20:13,646][04017] Updated weights for policy 0, policy_version 33832 (0.0016) [2024-03-21 05:20:15,521][03784] Fps is (10 sec: 55705.2, 60 sec: 49698.1, 300 sec: 47541.4). Total num frames: 1108672512. Throughput: 0: 47173.3. Samples: 1109865000. 
Policy #0 lag: (min: 0.0, avg: 30.8, max: 72.0) [2024-03-21 05:20:15,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 05:20:19,782][04017] Updated weights for policy 0, policy_version 33842 (0.0019) [2024-03-21 05:20:20,521][03784] Fps is (10 sec: 62259.7, 60 sec: 49698.1, 300 sec: 47652.4). Total num frames: 1108967424. Throughput: 0: 47360.2. Samples: 1110176100. Policy #0 lag: (min: 0.0, avg: 30.8, max: 72.0) [2024-03-21 05:20:20,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 05:20:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 1109098496. Throughput: 0: 47351.1. Samples: 1110476000. Policy #0 lag: (min: 0.0, avg: 30.8, max: 72.0) [2024-03-21 05:20:25,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 05:20:28,594][04017] Updated weights for policy 0, policy_version 33852 (0.0015) [2024-03-21 05:20:30,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 47097.1). Total num frames: 1109360640. Throughput: 0: 47760.1. Samples: 1110619600. Policy #0 lag: (min: 1.0, avg: 33.0, max: 82.0) [2024-03-21 05:20:30,522][03784] Avg episode reward: [(0, '1.038')] [2024-03-21 05:20:33,028][04017] Updated weights for policy 0, policy_version 33862 (0.0016) [2024-03-21 05:20:35,521][03784] Fps is (10 sec: 65535.8, 60 sec: 46421.5, 300 sec: 47430.3). Total num frames: 1109753856. Throughput: 0: 47724.4. Samples: 1110885800. Policy #0 lag: (min: 1.0, avg: 33.0, max: 82.0) [2024-03-21 05:20:35,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 05:20:40,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46421.3, 300 sec: 47541.4). Total num frames: 1109852160. Throughput: 0: 47717.7. Samples: 1111176200. 
Policy #0 lag: (min: 1.0, avg: 33.0, max: 82.0) [2024-03-21 05:20:40,522][03784] Avg episode reward: [(0, '1.418')] [2024-03-21 05:20:41,833][04017] Updated weights for policy 0, policy_version 33872 (0.0018) [2024-03-21 05:20:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 1110147072. Throughput: 0: 47068.9. Samples: 1111292000. Policy #0 lag: (min: 1.0, avg: 33.0, max: 82.0) [2024-03-21 05:20:45,522][03784] Avg episode reward: [(0, '0.667')] [2024-03-21 05:20:45,989][03995] Signal inference workers to stop experience collection... (22350 times) [2024-03-21 05:20:46,063][03995] Signal inference workers to resume experience collection... (22350 times) [2024-03-21 05:20:46,088][04017] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-03-21 05:20:46,148][04017] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-03-21 05:20:46,430][04017] Updated weights for policy 0, policy_version 33882 (0.0011) [2024-03-21 05:20:50,521][03784] Fps is (10 sec: 62259.5, 60 sec: 46967.5, 300 sec: 47652.5). Total num frames: 1110474752. Throughput: 0: 47086.7. Samples: 1111585100. Policy #0 lag: (min: 0.0, avg: 44.1, max: 79.0) [2024-03-21 05:20:50,522][03784] Avg episode reward: [(0, '0.584')] [2024-03-21 05:20:52,562][04017] Updated weights for policy 0, policy_version 33892 (0.0010) [2024-03-21 05:20:55,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 1110671360. Throughput: 0: 47480.1. Samples: 1111883700. Policy #0 lag: (min: 0.0, avg: 44.1, max: 79.0) [2024-03-21 05:20:55,522][03784] Avg episode reward: [(0, '0.584')] [2024-03-21 05:21:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46421.4, 300 sec: 47319.2). Total num frames: 1110802432. Throughput: 0: 47893.4. Samples: 1112020200. 
Policy #0 lag: (min: 0.0, avg: 44.1, max: 79.0) [2024-03-21 05:21:00,522][03784] Avg episode reward: [(0, '1.262')] [2024-03-21 05:21:02,482][04017] Updated weights for policy 0, policy_version 33902 (0.0011) [2024-03-21 05:21:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 47513.5, 300 sec: 46986.0). Total num frames: 1110966272. Throughput: 0: 47155.5. Samples: 1112298100. Policy #0 lag: (min: 0.0, avg: 44.1, max: 79.0) [2024-03-21 05:21:05,522][03784] Avg episode reward: [(0, '1.137')] [2024-03-21 05:21:09,530][04017] Updated weights for policy 0, policy_version 33912 (0.0020) [2024-03-21 05:21:10,521][03784] Fps is (10 sec: 49151.9, 60 sec: 49152.1, 300 sec: 47097.1). Total num frames: 1111293952. Throughput: 0: 46606.7. Samples: 1112573300. Policy #0 lag: (min: 0.0, avg: 44.1, max: 79.0) [2024-03-21 05:21:10,522][03784] Avg episode reward: [(0, '0.837')] [2024-03-21 05:21:15,521][03784] Fps is (10 sec: 55705.9, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 1111523328. Throughput: 0: 46602.3. Samples: 1112716700. Policy #0 lag: (min: 0.0, avg: 30.7, max: 93.0) [2024-03-21 05:21:15,522][03784] Avg episode reward: [(0, '1.521')] [2024-03-21 05:21:15,944][04017] Updated weights for policy 0, policy_version 33922 (0.0016) [2024-03-21 05:21:20,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45329.0, 300 sec: 46986.0). Total num frames: 1111687168. Throughput: 0: 46755.5. Samples: 1112989800. Policy #0 lag: (min: 0.0, avg: 30.7, max: 93.0) [2024-03-21 05:21:20,522][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 05:21:25,493][04017] Updated weights for policy 0, policy_version 33932 (0.0015) [2024-03-21 05:21:25,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.3, 300 sec: 46652.7). Total num frames: 1111883776. Throughput: 0: 46673.3. Samples: 1113276500. 
Policy #0 lag: (min: 0.0, avg: 30.7, max: 93.0) [2024-03-21 05:21:25,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 05:21:29,474][04017] Updated weights for policy 0, policy_version 33942 (0.0024) [2024-03-21 05:21:30,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 1112244224. Throughput: 0: 47128.8. Samples: 1113412800. Policy #0 lag: (min: 0.0, avg: 30.7, max: 93.0) [2024-03-21 05:21:30,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 05:21:33,919][04017] Updated weights for policy 0, policy_version 33952 (0.0011) [2024-03-21 05:21:34,687][03995] Signal inference workers to stop experience collection... (22400 times) [2024-03-21 05:21:34,730][04017] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-03-21 05:21:34,979][03995] Signal inference workers to resume experience collection... (22400 times) [2024-03-21 05:21:34,979][04017] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-03-21 05:21:35,521][03784] Fps is (10 sec: 78643.7, 60 sec: 48605.9, 300 sec: 48430.0). Total num frames: 1112670208. Throughput: 0: 46562.2. Samples: 1113680400. Policy #0 lag: (min: 3.0, avg: 52.1, max: 112.0) [2024-03-21 05:21:35,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 05:21:39,911][04017] Updated weights for policy 0, policy_version 33962 (0.0012) [2024-03-21 05:21:40,521][03784] Fps is (10 sec: 62259.3, 60 sec: 50244.2, 300 sec: 48430.0). Total num frames: 1112866816. Throughput: 0: 46079.9. Samples: 1113957300. Policy #0 lag: (min: 3.0, avg: 52.1, max: 112.0) [2024-03-21 05:21:40,523][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 05:21:45,521][03784] Fps is (10 sec: 36044.4, 60 sec: 48059.7, 300 sec: 47763.5). Total num frames: 1113030656. Throughput: 0: 46617.7. Samples: 1114118000. 
Policy #0 lag: (min: 3.0, avg: 52.1, max: 112.0) [2024-03-21 05:21:45,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 05:21:50,521][03784] Fps is (10 sec: 29491.4, 60 sec: 44782.9, 300 sec: 47097.1). Total num frames: 1113161728. Throughput: 0: 46848.9. Samples: 1114406300. Policy #0 lag: (min: 3.0, avg: 52.1, max: 112.0) [2024-03-21 05:21:50,522][03784] Avg episode reward: [(0, '0.727')] [2024-03-21 05:21:52,661][04017] Updated weights for policy 0, policy_version 33972 (0.0010) [2024-03-21 05:21:55,521][03784] Fps is (10 sec: 29491.5, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 1113325568. Throughput: 0: 46973.4. Samples: 1114687100. Policy #0 lag: (min: 3.0, avg: 52.1, max: 112.0) [2024-03-21 05:21:55,522][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 05:21:58,412][04017] Updated weights for policy 0, policy_version 33982 (0.0011) [2024-03-21 05:22:00,521][03784] Fps is (10 sec: 36044.4, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 1113522176. Throughput: 0: 46826.6. Samples: 1114823900. Policy #0 lag: (min: 0.0, avg: 38.1, max: 71.0) [2024-03-21 05:22:00,522][03784] Avg episode reward: [(0, '0.750')] [2024-03-21 05:22:00,864][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033983_1113554944.pth... [2024-03-21 05:22:00,988][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033640_1102315520.pth [2024-03-21 05:22:05,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 1113784320. Throughput: 0: 46980.0. Samples: 1115103900. Policy #0 lag: (min: 0.0, avg: 38.1, max: 71.0) [2024-03-21 05:22:05,522][03784] Avg episode reward: [(0, '1.200')] [2024-03-21 05:22:05,833][04017] Updated weights for policy 0, policy_version 33992 (0.0022) [2024-03-21 05:22:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 46319.5). Total num frames: 1113980928. Throughput: 0: 47026.7. Samples: 1115392700. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 71.0) [2024-03-21 05:22:10,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 05:22:12,791][04017] Updated weights for policy 0, policy_version 34002 (0.0011) [2024-03-21 05:22:15,521][03784] Fps is (10 sec: 55706.2, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 1114341376. Throughput: 0: 47135.7. Samples: 1115533900. Policy #0 lag: (min: 0.0, avg: 38.1, max: 71.0) [2024-03-21 05:22:15,522][03784] Avg episode reward: [(0, '1.455')] [2024-03-21 05:22:17,649][04017] Updated weights for policy 0, policy_version 34012 (0.0015) [2024-03-21 05:22:20,521][03784] Fps is (10 sec: 62259.2, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 1114603520. Throughput: 0: 47357.7. Samples: 1115811500. Policy #0 lag: (min: 0.0, avg: 39.3, max: 81.0) [2024-03-21 05:22:20,522][03784] Avg episode reward: [(0, '0.974')] [2024-03-21 05:22:25,521][03784] Fps is (10 sec: 45874.6, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 1114800128. Throughput: 0: 48128.9. Samples: 1116123100. Policy #0 lag: (min: 0.0, avg: 39.3, max: 81.0) [2024-03-21 05:22:25,522][03784] Avg episode reward: [(0, '0.974')] [2024-03-21 05:22:25,730][04017] Updated weights for policy 0, policy_version 34022 (0.0018) [2024-03-21 05:22:28,140][03995] Signal inference workers to stop experience collection... (22450 times) [2024-03-21 05:22:28,198][03995] Signal inference workers to resume experience collection... (22450 times) [2024-03-21 05:22:28,200][04017] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-03-21 05:22:28,255][04017] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-03-21 05:22:29,739][04017] Updated weights for policy 0, policy_version 34032 (0.0018) [2024-03-21 05:22:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 48605.9, 300 sec: 47874.6). Total num frames: 1115160576. Throughput: 0: 47700.1. Samples: 1116264500. 
Policy #0 lag: (min: 0.0, avg: 39.3, max: 81.0) [2024-03-21 05:22:30,522][03784] Avg episode reward: [(0, '0.974')] [2024-03-21 05:22:35,521][03784] Fps is (10 sec: 62259.7, 60 sec: 45875.2, 300 sec: 48096.8). Total num frames: 1115422720. Throughput: 0: 47640.0. Samples: 1116550100. Policy #0 lag: (min: 0.0, avg: 39.3, max: 81.0) [2024-03-21 05:22:35,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 05:22:39,051][04017] Updated weights for policy 0, policy_version 34042 (0.0017) [2024-03-21 05:22:40,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 47763.5). Total num frames: 1115586560. Throughput: 0: 48171.1. Samples: 1116854800. Policy #0 lag: (min: 0.0, avg: 39.3, max: 81.0) [2024-03-21 05:22:40,522][03784] Avg episode reward: [(0, '1.669')] [2024-03-21 05:22:44,665][04017] Updated weights for policy 0, policy_version 34052 (0.0011) [2024-03-21 05:22:45,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.5, 300 sec: 47652.4). Total num frames: 1115848704. Throughput: 0: 48391.2. Samples: 1117001500. Policy #0 lag: (min: 0.0, avg: 41.0, max: 83.0) [2024-03-21 05:22:45,522][03784] Avg episode reward: [(0, '1.669')] [2024-03-21 05:22:50,521][03784] Fps is (10 sec: 42597.8, 60 sec: 47513.5, 300 sec: 47430.3). Total num frames: 1116012544. Throughput: 0: 49057.7. Samples: 1117311500. Policy #0 lag: (min: 0.0, avg: 41.0, max: 83.0) [2024-03-21 05:22:50,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 05:22:53,388][04017] Updated weights for policy 0, policy_version 34062 (0.0015) [2024-03-21 05:22:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 47541.4). Total num frames: 1116307456. Throughput: 0: 48588.9. Samples: 1117579200. Policy #0 lag: (min: 0.0, avg: 41.0, max: 83.0) [2024-03-21 05:22:55,522][03784] Avg episode reward: [(0, '1.108')] [2024-03-21 05:23:00,521][03784] Fps is (10 sec: 42598.9, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 1116438528. Throughput: 0: 48166.6. Samples: 1117701400. 
Policy #0 lag: (min: 0.0, avg: 41.0, max: 83.0) [2024-03-21 05:23:00,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 05:23:01,025][04017] Updated weights for policy 0, policy_version 34072 (0.0018) [2024-03-21 05:23:04,982][04017] Updated weights for policy 0, policy_version 34082 (0.0015) [2024-03-21 05:23:05,521][03784] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 47319.2). Total num frames: 1116798976. Throughput: 0: 48326.7. Samples: 1117986200. Policy #0 lag: (min: 1.0, avg: 32.7, max: 83.0) [2024-03-21 05:23:05,522][03784] Avg episode reward: [(0, '1.526')] [2024-03-21 05:23:10,324][04017] Updated weights for policy 0, policy_version 34092 (0.0013) [2024-03-21 05:23:10,521][03784] Fps is (10 sec: 68812.8, 60 sec: 52428.8, 300 sec: 47652.4). Total num frames: 1117126656. Throughput: 0: 47009.0. Samples: 1118238500. Policy #0 lag: (min: 1.0, avg: 32.7, max: 83.0) [2024-03-21 05:23:10,522][03784] Avg episode reward: [(0, '0.913')] [2024-03-21 05:23:15,521][03784] Fps is (10 sec: 62259.0, 60 sec: 51336.5, 300 sec: 47319.2). Total num frames: 1117421568. Throughput: 0: 47251.1. Samples: 1118390800. Policy #0 lag: (min: 1.0, avg: 32.7, max: 83.0) [2024-03-21 05:23:15,530][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 05:23:17,807][04017] Updated weights for policy 0, policy_version 34102 (0.0009) [2024-03-21 05:23:19,533][03995] Signal inference workers to stop experience collection... (22500 times) [2024-03-21 05:23:19,603][03995] Signal inference workers to resume experience collection... (22500 times) [2024-03-21 05:23:19,608][04017] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-03-21 05:23:19,667][04017] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-03-21 05:23:20,521][03784] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 47319.2). Total num frames: 1117618176. Throughput: 0: 47868.8. Samples: 1118704200. 
Policy #0 lag: (min: 1.0, avg: 32.7, max: 83.0) [2024-03-21 05:23:20,522][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 05:23:25,521][03784] Fps is (10 sec: 26214.3, 60 sec: 48059.8, 300 sec: 47097.1). Total num frames: 1117683712. Throughput: 0: 47797.7. Samples: 1119005700. Policy #0 lag: (min: 1.0, avg: 41.8, max: 88.0) [2024-03-21 05:23:25,522][03784] Avg episode reward: [(0, '0.605')] [2024-03-21 05:23:26,885][04017] Updated weights for policy 0, policy_version 34112 (0.0011) [2024-03-21 05:23:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45329.0, 300 sec: 47319.2). Total num frames: 1117880320. Throughput: 0: 47573.3. Samples: 1119142300. Policy #0 lag: (min: 1.0, avg: 41.8, max: 88.0) [2024-03-21 05:23:30,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 05:23:34,359][04017] Updated weights for policy 0, policy_version 34122 (0.0015) [2024-03-21 05:23:35,521][03784] Fps is (10 sec: 52429.2, 60 sec: 46421.4, 300 sec: 47541.4). Total num frames: 1118208000. Throughput: 0: 47409.1. Samples: 1119444900. Policy #0 lag: (min: 1.0, avg: 41.8, max: 88.0) [2024-03-21 05:23:35,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 05:23:40,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46421.3, 300 sec: 47763.5). Total num frames: 1118371840. Throughput: 0: 47651.1. Samples: 1119723500. Policy #0 lag: (min: 1.0, avg: 41.8, max: 88.0) [2024-03-21 05:23:40,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 05:23:40,866][04017] Updated weights for policy 0, policy_version 34132 (0.0010) [2024-03-21 05:23:45,521][03784] Fps is (10 sec: 49151.9, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 1118699520. Throughput: 0: 48133.3. Samples: 1119867400. 
Policy #0 lag: (min: 1.0, avg: 41.8, max: 88.0) [2024-03-21 05:23:45,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 05:23:48,474][04017] Updated weights for policy 0, policy_version 34142 (0.0010) [2024-03-21 05:23:50,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46421.4, 300 sec: 47208.1). Total num frames: 1118797824. Throughput: 0: 48113.3. Samples: 1120151300. Policy #0 lag: (min: 0.0, avg: 32.2, max: 64.0) [2024-03-21 05:23:50,522][03784] Avg episode reward: [(0, '0.653')] [2024-03-21 05:23:55,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1119059968. Throughput: 0: 48700.0. Samples: 1120430000. Policy #0 lag: (min: 0.0, avg: 32.2, max: 64.0) [2024-03-21 05:23:55,522][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 05:23:55,588][04017] Updated weights for policy 0, policy_version 34152 (0.0010) [2024-03-21 05:24:00,521][03784] Fps is (10 sec: 58981.8, 60 sec: 49151.9, 300 sec: 47763.5). Total num frames: 1119387648. Throughput: 0: 48037.7. Samples: 1120552500. Policy #0 lag: (min: 0.0, avg: 32.2, max: 64.0) [2024-03-21 05:24:00,522][03784] Avg episode reward: [(0, '0.633')] [2024-03-21 05:24:00,648][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034162_1119420416.pth... [2024-03-21 05:24:00,656][04017] Updated weights for policy 0, policy_version 34162 (0.0022) [2024-03-21 05:24:00,787][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033814_1108017152.pth [2024-03-21 05:24:05,521][03784] Fps is (10 sec: 55704.9, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 1119617024. Throughput: 0: 47744.4. Samples: 1120852700. Policy #0 lag: (min: 0.0, avg: 32.2, max: 64.0) [2024-03-21 05:24:05,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-21 05:24:07,441][04017] Updated weights for policy 0, policy_version 34172 (0.0016) [2024-03-21 05:24:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.2, 300 sec: 48207.8). 
Total num frames: 1119911936. Throughput: 0: 47271.1. Samples: 1121132900. Policy #0 lag: (min: 1.0, avg: 41.7, max: 87.0) [2024-03-21 05:24:10,522][03784] Avg episode reward: [(0, '1.431')] [2024-03-21 05:24:13,166][03995] Signal inference workers to stop experience collection... (22550 times) [2024-03-21 05:24:13,214][04017] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-03-21 05:24:13,240][03995] Signal inference workers to resume experience collection... (22550 times) [2024-03-21 05:24:13,255][04017] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-03-21 05:24:13,579][04017] Updated weights for policy 0, policy_version 34182 (0.0019) [2024-03-21 05:24:15,521][03784] Fps is (10 sec: 62259.4, 60 sec: 46967.4, 300 sec: 48318.9). Total num frames: 1120239616. Throughput: 0: 47571.1. Samples: 1121283000. Policy #0 lag: (min: 1.0, avg: 41.7, max: 87.0) [2024-03-21 05:24:15,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 05:24:20,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 47652.4). Total num frames: 1120337920. Throughput: 0: 46713.2. Samples: 1121547000. Policy #0 lag: (min: 1.0, avg: 41.7, max: 87.0) [2024-03-21 05:24:20,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 05:24:21,109][04017] Updated weights for policy 0, policy_version 34192 (0.0022) [2024-03-21 05:24:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 1120534528. Throughput: 0: 46644.4. Samples: 1121822500. Policy #0 lag: (min: 1.0, avg: 41.7, max: 87.0) [2024-03-21 05:24:25,522][03784] Avg episode reward: [(0, '0.889')] [2024-03-21 05:24:30,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46967.4, 300 sec: 46541.7). Total num frames: 1120698368. Throughput: 0: 46564.3. Samples: 1121962800. 
Policy #0 lag: (min: 1.0, avg: 41.7, max: 87.0) [2024-03-21 05:24:30,522][03784] Avg episode reward: [(0, '1.488')] [2024-03-21 05:24:31,709][04017] Updated weights for policy 0, policy_version 34202 (0.0015) [2024-03-21 05:24:35,521][03784] Fps is (10 sec: 36045.1, 60 sec: 44782.9, 300 sec: 46874.9). Total num frames: 1120894976. Throughput: 0: 46686.7. Samples: 1122252200. Policy #0 lag: (min: 0.0, avg: 34.3, max: 89.0) [2024-03-21 05:24:35,522][03784] Avg episode reward: [(0, '0.889')] [2024-03-21 05:24:37,728][04017] Updated weights for policy 0, policy_version 34212 (0.0013) [2024-03-21 05:24:40,521][03784] Fps is (10 sec: 62259.8, 60 sec: 49152.0, 300 sec: 47541.4). Total num frames: 1121320960. Throughput: 0: 45795.5. Samples: 1122490800. Policy #0 lag: (min: 0.0, avg: 34.3, max: 89.0) [2024-03-21 05:24:40,522][03784] Avg episode reward: [(0, '1.349')] [2024-03-21 05:24:42,621][04017] Updated weights for policy 0, policy_version 34222 (0.0018) [2024-03-21 05:24:45,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46421.3, 300 sec: 46874.9). Total num frames: 1121484800. Throughput: 0: 46360.1. Samples: 1122638700. Policy #0 lag: (min: 0.0, avg: 34.3, max: 89.0) [2024-03-21 05:24:45,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 05:24:48,571][04017] Updated weights for policy 0, policy_version 34232 (0.0017) [2024-03-21 05:24:50,521][03784] Fps is (10 sec: 42598.1, 60 sec: 49151.9, 300 sec: 47319.2). Total num frames: 1121746944. Throughput: 0: 46100.0. Samples: 1122927200. Policy #0 lag: (min: 0.0, avg: 34.3, max: 89.0) [2024-03-21 05:24:50,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 05:24:55,427][04017] Updated weights for policy 0, policy_version 34242 (0.0016) [2024-03-21 05:24:55,521][03784] Fps is (10 sec: 55705.1, 60 sec: 49698.0, 300 sec: 47541.4). Total num frames: 1122041856. Throughput: 0: 46208.9. Samples: 1123212300. 
Policy #0 lag: (min: 1.0, avg: 42.5, max: 80.0) [2024-03-21 05:24:55,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 05:25:00,521][03784] Fps is (10 sec: 49152.5, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 1122238464. Throughput: 0: 46086.8. Samples: 1123356900. Policy #0 lag: (min: 1.0, avg: 42.5, max: 80.0) [2024-03-21 05:25:00,522][03784] Avg episode reward: [(0, '0.738')] [2024-03-21 05:25:05,115][04017] Updated weights for policy 0, policy_version 34252 (0.0015) [2024-03-21 05:25:05,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.3, 300 sec: 47652.4). Total num frames: 1122402304. Throughput: 0: 46331.0. Samples: 1123631900. Policy #0 lag: (min: 1.0, avg: 42.5, max: 80.0) [2024-03-21 05:25:05,522][03784] Avg episode reward: [(0, '1.435')] [2024-03-21 05:25:09,404][03995] Signal inference workers to stop experience collection... (22600 times) [2024-03-21 05:25:09,405][03995] Signal inference workers to resume experience collection... (22600 times) [2024-03-21 05:25:09,476][04017] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-03-21 05:25:09,477][04017] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-03-21 05:25:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44783.0, 300 sec: 47208.1). Total num frames: 1122598912. Throughput: 0: 45866.7. Samples: 1123886500. Policy #0 lag: (min: 1.0, avg: 42.5, max: 80.0) [2024-03-21 05:25:10,522][03784] Avg episode reward: [(0, '1.071')] [2024-03-21 05:25:12,631][04017] Updated weights for policy 0, policy_version 34262 (0.0020) [2024-03-21 05:25:15,521][03784] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 47097.1). Total num frames: 1122861056. Throughput: 0: 45869.0. Samples: 1124026900. Policy #0 lag: (min: 1.0, avg: 42.5, max: 80.0) [2024-03-21 05:25:15,522][03784] Avg episode reward: [(0, '1.148')] [2024-03-21 05:25:20,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 47097.0). Total num frames: 1122992128. 
Throughput: 0: 45768.8. Samples: 1124311800. Policy #0 lag: (min: 0.0, avg: 42.9, max: 103.0) [2024-03-21 05:25:20,522][03784] Avg episode reward: [(0, '1.506')] [2024-03-21 05:25:23,179][04017] Updated weights for policy 0, policy_version 34272 (0.0010) [2024-03-21 05:25:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 47097.1). Total num frames: 1123254272. Throughput: 0: 47486.7. Samples: 1124627700. Policy #0 lag: (min: 0.0, avg: 42.9, max: 103.0) [2024-03-21 05:25:25,522][03784] Avg episode reward: [(0, '1.506')] [2024-03-21 05:25:26,395][04017] Updated weights for policy 0, policy_version 34282 (0.0019) [2024-03-21 05:25:30,521][03784] Fps is (10 sec: 58982.8, 60 sec: 48059.8, 300 sec: 46874.9). Total num frames: 1123581952. Throughput: 0: 46900.0. Samples: 1124749200. Policy #0 lag: (min: 0.0, avg: 42.9, max: 103.0) [2024-03-21 05:25:30,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 05:25:33,881][04017] Updated weights for policy 0, policy_version 34292 (0.0011) [2024-03-21 05:25:35,521][03784] Fps is (10 sec: 58982.3, 60 sec: 49152.0, 300 sec: 47430.3). Total num frames: 1123844096. Throughput: 0: 47311.2. Samples: 1125056200. Policy #0 lag: (min: 0.0, avg: 42.9, max: 103.0) [2024-03-21 05:25:35,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 05:25:38,150][04017] Updated weights for policy 0, policy_version 34302 (0.0008) [2024-03-21 05:25:40,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 1124139008. Throughput: 0: 47215.6. Samples: 1125337000. Policy #0 lag: (min: 0.0, avg: 42.9, max: 103.0) [2024-03-21 05:25:40,522][03784] Avg episode reward: [(0, '0.701')] [2024-03-21 05:25:44,880][04017] Updated weights for policy 0, policy_version 34312 (0.0011) [2024-03-21 05:25:45,521][03784] Fps is (10 sec: 55705.2, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 1124401152. Throughput: 0: 47277.7. Samples: 1125484400. 
Policy #0 lag: (min: 0.0, avg: 48.8, max: 105.0) [2024-03-21 05:25:45,522][03784] Avg episode reward: [(0, '1.392')] [2024-03-21 05:25:50,521][03784] Fps is (10 sec: 45875.6, 60 sec: 47513.7, 300 sec: 47208.1). Total num frames: 1124597760. Throughput: 0: 47258.0. Samples: 1125758500. Policy #0 lag: (min: 0.0, avg: 48.8, max: 105.0) [2024-03-21 05:25:50,522][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 05:25:52,416][04017] Updated weights for policy 0, policy_version 34322 (0.0011) [2024-03-21 05:25:55,090][03995] Signal inference workers to stop experience collection... (22650 times) [2024-03-21 05:25:55,156][04017] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-03-21 05:25:55,344][03995] Signal inference workers to resume experience collection... (22650 times) [2024-03-21 05:25:55,344][04017] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-03-21 05:25:55,521][03784] Fps is (10 sec: 42598.8, 60 sec: 46421.4, 300 sec: 47541.4). Total num frames: 1124827136. Throughput: 0: 47602.3. Samples: 1126028600. Policy #0 lag: (min: 0.0, avg: 48.8, max: 105.0) [2024-03-21 05:25:55,522][03784] Avg episode reward: [(0, '1.514')] [2024-03-21 05:25:57,531][04017] Updated weights for policy 0, policy_version 34332 (0.0015) [2024-03-21 05:26:00,521][03784] Fps is (10 sec: 45874.5, 60 sec: 46967.3, 300 sec: 47763.5). Total num frames: 1125056512. Throughput: 0: 47377.7. Samples: 1126158900. Policy #0 lag: (min: 0.0, avg: 48.8, max: 105.0) [2024-03-21 05:26:00,522][03784] Avg episode reward: [(0, '1.051')] [2024-03-21 05:26:00,648][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034335_1125089280.pth... [2024-03-21 05:26:00,761][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000033983_1113554944.pth [2024-03-21 05:26:05,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1125154816. Throughput: 0: 48160.0. 
Samples: 1126479000. Policy #0 lag: (min: 0.0, avg: 48.8, max: 105.0) [2024-03-21 05:26:05,522][03784] Avg episode reward: [(0, '1.663')] [2024-03-21 05:26:10,392][04017] Updated weights for policy 0, policy_version 34342 (0.0011) [2024-03-21 05:26:10,521][03784] Fps is (10 sec: 26214.8, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 1125318656. Throughput: 0: 47426.6. Samples: 1126761900. Policy #0 lag: (min: 0.0, avg: 34.6, max: 84.0) [2024-03-21 05:26:10,522][03784] Avg episode reward: [(0, '0.656')] [2024-03-21 05:26:15,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 1125548032. Throughput: 0: 47942.1. Samples: 1126906600. Policy #0 lag: (min: 0.0, avg: 34.6, max: 84.0) [2024-03-21 05:26:15,522][03784] Avg episode reward: [(0, '1.010')] [2024-03-21 05:26:16,487][04017] Updated weights for policy 0, policy_version 34352 (0.0016) [2024-03-21 05:26:20,521][03784] Fps is (10 sec: 62259.2, 60 sec: 49152.1, 300 sec: 47652.5). Total num frames: 1125941248. Throughput: 0: 47051.1. Samples: 1127173500. Policy #0 lag: (min: 0.0, avg: 34.6, max: 84.0) [2024-03-21 05:26:20,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 05:26:20,644][04017] Updated weights for policy 0, policy_version 34362 (0.0014) [2024-03-21 05:26:24,754][04017] Updated weights for policy 0, policy_version 34372 (0.0015) [2024-03-21 05:26:25,521][03784] Fps is (10 sec: 78643.5, 60 sec: 51336.4, 300 sec: 47763.5). Total num frames: 1126334464. Throughput: 0: 47222.2. Samples: 1127462000. Policy #0 lag: (min: 0.0, avg: 34.6, max: 84.0) [2024-03-21 05:26:25,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 05:26:30,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.4, 300 sec: 46541.7). Total num frames: 1126400000. Throughput: 0: 47326.6. Samples: 1127614100. 
Policy #0 lag: (min: 1.0, avg: 48.8, max: 103.0) [2024-03-21 05:26:30,522][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 05:26:34,059][04017] Updated weights for policy 0, policy_version 34382 (0.0025) [2024-03-21 05:26:35,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 1126694912. Throughput: 0: 47991.0. Samples: 1127918100. Policy #0 lag: (min: 1.0, avg: 48.8, max: 103.0) [2024-03-21 05:26:35,531][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 05:26:39,482][04017] Updated weights for policy 0, policy_version 34392 (0.0015) [2024-03-21 05:26:40,521][03784] Fps is (10 sec: 65536.4, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 1127055360. Throughput: 0: 47900.0. Samples: 1128184100. Policy #0 lag: (min: 1.0, avg: 48.8, max: 103.0) [2024-03-21 05:26:40,522][03784] Avg episode reward: [(0, '1.658')] [2024-03-21 05:26:44,285][03995] Signal inference workers to stop experience collection... (22700 times) [2024-03-21 05:26:44,316][04017] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-03-21 05:26:44,570][03995] Signal inference workers to resume experience collection... (22700 times) [2024-03-21 05:26:44,570][04017] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-03-21 05:26:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 1127153664. Throughput: 0: 48531.2. Samples: 1128342800. Policy #0 lag: (min: 1.0, avg: 48.8, max: 103.0) [2024-03-21 05:26:45,522][03784] Avg episode reward: [(0, '1.348')] [2024-03-21 05:26:49,898][04017] Updated weights for policy 0, policy_version 34402 (0.0010) [2024-03-21 05:26:50,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1127350272. Throughput: 0: 48440.1. Samples: 1128658800. 
Policy #0 lag: (min: 1.0, avg: 48.8, max: 103.0) [2024-03-21 05:26:50,530][03784] Avg episode reward: [(0, '1.348')] [2024-03-21 05:26:55,080][04017] Updated weights for policy 0, policy_version 34412 (0.0011) [2024-03-21 05:26:55,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.4, 300 sec: 47874.6). Total num frames: 1127645184. Throughput: 0: 48691.1. Samples: 1128953000. Policy #0 lag: (min: 0.0, avg: 41.7, max: 113.0) [2024-03-21 05:26:55,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 05:27:00,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 1127907328. Throughput: 0: 48455.6. Samples: 1129087100. Policy #0 lag: (min: 0.0, avg: 41.7, max: 113.0) [2024-03-21 05:27:00,522][03784] Avg episode reward: [(0, '1.365')] [2024-03-21 05:27:00,895][04017] Updated weights for policy 0, policy_version 34422 (0.0016) [2024-03-21 05:27:05,521][03784] Fps is (10 sec: 52428.6, 60 sec: 50244.3, 300 sec: 48096.8). Total num frames: 1128169472. Throughput: 0: 48842.1. Samples: 1129371400. Policy #0 lag: (min: 0.0, avg: 41.7, max: 113.0) [2024-03-21 05:27:05,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 05:27:09,576][04017] Updated weights for policy 0, policy_version 34432 (0.0019) [2024-03-21 05:27:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 1128267776. Throughput: 0: 49062.2. Samples: 1129669800. Policy #0 lag: (min: 0.0, avg: 41.7, max: 113.0) [2024-03-21 05:27:10,522][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 05:27:15,521][03784] Fps is (10 sec: 32768.3, 60 sec: 49152.1, 300 sec: 47097.1). Total num frames: 1128497152. Throughput: 0: 48915.6. Samples: 1129815300. 
Policy #0 lag: (min: 0.0, avg: 43.1, max: 108.0) [2024-03-21 05:27:15,522][03784] Avg episode reward: [(0, '1.422')] [2024-03-21 05:27:16,843][04017] Updated weights for policy 0, policy_version 34442 (0.0011) [2024-03-21 05:27:20,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46967.4, 300 sec: 47319.2). Total num frames: 1128759296. Throughput: 0: 48166.6. Samples: 1130085600. Policy #0 lag: (min: 0.0, avg: 43.1, max: 108.0) [2024-03-21 05:27:20,522][03784] Avg episode reward: [(0, '0.885')] [2024-03-21 05:27:22,254][04017] Updated weights for policy 0, policy_version 34452 (0.0012) [2024-03-21 05:27:25,521][03784] Fps is (10 sec: 68812.7, 60 sec: 47513.7, 300 sec: 47541.4). Total num frames: 1129185280. Throughput: 0: 48000.0. Samples: 1130344100. Policy #0 lag: (min: 0.0, avg: 43.1, max: 108.0) [2024-03-21 05:27:25,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 05:27:25,935][04017] Updated weights for policy 0, policy_version 34462 (0.0012) [2024-03-21 05:27:30,521][03784] Fps is (10 sec: 65535.5, 60 sec: 50244.2, 300 sec: 47430.3). Total num frames: 1129414656. Throughput: 0: 47542.1. Samples: 1130482200. Policy #0 lag: (min: 0.0, avg: 43.1, max: 108.0) [2024-03-21 05:27:30,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 05:27:33,669][04017] Updated weights for policy 0, policy_version 34472 (0.0015) [2024-03-21 05:27:35,521][03784] Fps is (10 sec: 49151.2, 60 sec: 49698.0, 300 sec: 47763.5). Total num frames: 1129676800. Throughput: 0: 46995.4. Samples: 1130773600. Policy #0 lag: (min: 0.0, avg: 43.1, max: 108.0) [2024-03-21 05:27:35,522][03784] Avg episode reward: [(0, '1.408')] [2024-03-21 05:27:36,007][03995] Signal inference workers to stop experience collection... (22750 times) [2024-03-21 05:27:36,081][03995] Signal inference workers to resume experience collection... 
(22750 times) [2024-03-21 05:27:36,120][04017] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-03-21 05:27:36,184][04017] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-03-21 05:27:39,746][04017] Updated weights for policy 0, policy_version 34482 (0.0011) [2024-03-21 05:27:40,521][03784] Fps is (10 sec: 55706.4, 60 sec: 48605.9, 300 sec: 47874.6). Total num frames: 1129971712. Throughput: 0: 46815.6. Samples: 1131059700. Policy #0 lag: (min: 2.0, avg: 33.5, max: 69.0) [2024-03-21 05:27:40,522][03784] Avg episode reward: [(0, '1.486')] [2024-03-21 05:27:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 49698.1, 300 sec: 47874.6). Total num frames: 1130135552. Throughput: 0: 47017.7. Samples: 1131202900. Policy #0 lag: (min: 2.0, avg: 33.5, max: 69.0) [2024-03-21 05:27:45,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 05:27:49,851][04017] Updated weights for policy 0, policy_version 34492 (0.0014) [2024-03-21 05:27:50,521][03784] Fps is (10 sec: 26214.5, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 1130233856. Throughput: 0: 47595.7. Samples: 1131513200. Policy #0 lag: (min: 2.0, avg: 33.5, max: 69.0) [2024-03-21 05:27:50,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 05:27:55,521][03784] Fps is (10 sec: 39322.1, 60 sec: 48059.8, 300 sec: 47763.5). Total num frames: 1130528768. Throughput: 0: 47160.1. Samples: 1131792000. Policy #0 lag: (min: 2.0, avg: 33.5, max: 69.0) [2024-03-21 05:27:55,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 05:27:56,831][04017] Updated weights for policy 0, policy_version 34502 (0.0010) [2024-03-21 05:28:00,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 1130790912. Throughput: 0: 47111.1. Samples: 1131935300. 
Policy #0 lag: (min: 3.0, avg: 48.4, max: 99.0) [2024-03-21 05:28:00,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 05:28:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034509_1130790912.pth... [2024-03-21 05:28:00,661][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034162_1119420416.pth [2024-03-21 05:28:05,301][04017] Updated weights for policy 0, policy_version 34512 (0.0011) [2024-03-21 05:28:05,521][03784] Fps is (10 sec: 36044.1, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 1130889216. Throughput: 0: 47655.5. Samples: 1132230100. Policy #0 lag: (min: 3.0, avg: 48.4, max: 99.0) [2024-03-21 05:28:05,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 05:28:10,521][03784] Fps is (10 sec: 32767.6, 60 sec: 47513.5, 300 sec: 46430.6). Total num frames: 1131118592. Throughput: 0: 48146.5. Samples: 1132510700. Policy #0 lag: (min: 3.0, avg: 48.4, max: 99.0) [2024-03-21 05:28:10,523][03784] Avg episode reward: [(0, '1.085')] [2024-03-21 05:28:14,013][04017] Updated weights for policy 0, policy_version 34522 (0.0010) [2024-03-21 05:28:15,521][03784] Fps is (10 sec: 45875.9, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 1131347968. Throughput: 0: 48049.0. Samples: 1132644400. Policy #0 lag: (min: 3.0, avg: 48.4, max: 99.0) [2024-03-21 05:28:15,522][03784] Avg episode reward: [(0, '0.576')] [2024-03-21 05:28:18,821][04017] Updated weights for policy 0, policy_version 34532 (0.0011) [2024-03-21 05:28:20,521][03784] Fps is (10 sec: 58983.1, 60 sec: 49152.0, 300 sec: 47541.4). Total num frames: 1131708416. Throughput: 0: 47702.4. Samples: 1132920200. Policy #0 lag: (min: 3.0, avg: 48.4, max: 99.0) [2024-03-21 05:28:20,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 05:28:22,674][04017] Updated weights for policy 0, policy_version 34542 (0.0013) [2024-03-21 05:28:25,521][03784] Fps is (10 sec: 65536.2, 60 sec: 46967.5, 300 sec: 47874.6). 
Total num frames: 1132003328. Throughput: 0: 47557.8. Samples: 1133199800. Policy #0 lag: (min: 4.0, avg: 40.1, max: 74.0) [2024-03-21 05:28:25,522][03784] Avg episode reward: [(0, '1.096')] [2024-03-21 05:28:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.3, 300 sec: 47319.2). Total num frames: 1132167168. Throughput: 0: 47886.7. Samples: 1133357800. Policy #0 lag: (min: 4.0, avg: 40.1, max: 74.0) [2024-03-21 05:28:30,522][03784] Avg episode reward: [(0, '1.096')] [2024-03-21 05:28:31,552][04017] Updated weights for policy 0, policy_version 34552 (0.0011) [2024-03-21 05:28:33,668][03995] Signal inference workers to stop experience collection... (22800 times) [2024-03-21 05:28:33,739][03995] Signal inference workers to resume experience collection... (22800 times) [2024-03-21 05:28:33,745][04017] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-03-21 05:28:33,789][04017] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-03-21 05:28:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.4, 300 sec: 47652.5). Total num frames: 1132429312. Throughput: 0: 47717.8. Samples: 1133660500. Policy #0 lag: (min: 4.0, avg: 40.1, max: 74.0) [2024-03-21 05:28:35,522][03784] Avg episode reward: [(0, '0.900')] [2024-03-21 05:28:36,981][04017] Updated weights for policy 0, policy_version 34562 (0.0011) [2024-03-21 05:28:40,521][03784] Fps is (10 sec: 55705.8, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1132724224. Throughput: 0: 47782.2. Samples: 1133942200. Policy #0 lag: (min: 4.0, avg: 40.1, max: 74.0) [2024-03-21 05:28:40,522][03784] Avg episode reward: [(0, '1.483')] [2024-03-21 05:28:42,722][04017] Updated weights for policy 0, policy_version 34572 (0.0017) [2024-03-21 05:28:45,521][03784] Fps is (10 sec: 52428.6, 60 sec: 46967.6, 300 sec: 47985.7). Total num frames: 1132953600. Throughput: 0: 47517.8. Samples: 1134073600. 
Policy #0 lag: (min: 4.0, avg: 40.2, max: 80.0) [2024-03-21 05:28:45,522][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 05:28:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 47513.5, 300 sec: 47541.4). Total num frames: 1133084672. Throughput: 0: 47426.8. Samples: 1134364300. Policy #0 lag: (min: 4.0, avg: 40.2, max: 80.0) [2024-03-21 05:28:50,522][03784] Avg episode reward: [(0, '1.191')] [2024-03-21 05:28:51,661][04017] Updated weights for policy 0, policy_version 34582 (0.0019) [2024-03-21 05:28:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48059.7, 300 sec: 47541.4). Total num frames: 1133412352. Throughput: 0: 47429.0. Samples: 1134645000. Policy #0 lag: (min: 4.0, avg: 40.2, max: 80.0) [2024-03-21 05:28:55,522][03784] Avg episode reward: [(0, '0.796')] [2024-03-21 05:28:57,209][04017] Updated weights for policy 0, policy_version 34592 (0.0011) [2024-03-21 05:29:00,521][03784] Fps is (10 sec: 58982.9, 60 sec: 48059.8, 300 sec: 47652.5). Total num frames: 1133674496. Throughput: 0: 47593.3. Samples: 1134786100. Policy #0 lag: (min: 4.0, avg: 40.2, max: 80.0) [2024-03-21 05:29:00,522][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 05:29:04,078][04017] Updated weights for policy 0, policy_version 34602 (0.0012) [2024-03-21 05:29:05,521][03784] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 47430.3). Total num frames: 1133903872. Throughput: 0: 47353.3. Samples: 1135051100. Policy #0 lag: (min: 4.0, avg: 40.2, max: 80.0) [2024-03-21 05:29:05,522][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 05:29:10,521][03784] Fps is (10 sec: 42598.6, 60 sec: 49698.3, 300 sec: 46986.0). Total num frames: 1134100480. Throughput: 0: 47604.5. Samples: 1135342000. 
Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 05:29:10,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 05:29:12,234][04017] Updated weights for policy 0, policy_version 34612 (0.0010) [2024-03-21 05:29:15,521][03784] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 47541.4). Total num frames: 1134362624. Throughput: 0: 46746.8. Samples: 1135461400. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 05:29:15,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 05:29:20,521][03784] Fps is (10 sec: 36044.4, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 1134460928. Throughput: 0: 46742.1. Samples: 1135763900. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 05:29:20,522][03784] Avg episode reward: [(0, '1.353')] [2024-03-21 05:29:20,780][04017] Updated weights for policy 0, policy_version 34622 (0.0019) [2024-03-21 05:29:25,521][03784] Fps is (10 sec: 42597.7, 60 sec: 46421.2, 300 sec: 47763.5). Total num frames: 1134788608. Throughput: 0: 47055.5. Samples: 1136059700. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 05:29:25,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 05:29:25,802][04017] Updated weights for policy 0, policy_version 34632 (0.0017) [2024-03-21 05:29:30,411][03995] Signal inference workers to stop experience collection... (22850 times) [2024-03-21 05:29:30,457][04017] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-03-21 05:29:30,471][03995] Signal inference workers to resume experience collection... (22850 times) [2024-03-21 05:29:30,508][04017] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-03-21 05:29:30,521][03784] Fps is (10 sec: 49151.8, 60 sec: 46421.3, 300 sec: 47652.4). Total num frames: 1134952448. Throughput: 0: 47657.7. Samples: 1136218200. 
Policy #0 lag: (min: 0.0, avg: 34.2, max: 70.0) [2024-03-21 05:29:30,522][03784] Avg episode reward: [(0, '1.178')] [2024-03-21 05:29:33,672][04017] Updated weights for policy 0, policy_version 34642 (0.0019) [2024-03-21 05:29:35,521][03784] Fps is (10 sec: 49152.5, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 1135280128. Throughput: 0: 47584.5. Samples: 1136505600. Policy #0 lag: (min: 0.0, avg: 34.2, max: 70.0) [2024-03-21 05:29:35,522][03784] Avg episode reward: [(0, '1.433')] [2024-03-21 05:29:38,632][04017] Updated weights for policy 0, policy_version 34652 (0.0011) [2024-03-21 05:29:40,521][03784] Fps is (10 sec: 65536.3, 60 sec: 48059.7, 300 sec: 47874.6). Total num frames: 1135607808. Throughput: 0: 47242.2. Samples: 1136770900. Policy #0 lag: (min: 0.0, avg: 34.2, max: 70.0) [2024-03-21 05:29:40,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 05:29:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46421.4, 300 sec: 47430.3). Total num frames: 1135738880. Throughput: 0: 47435.6. Samples: 1136920700. Policy #0 lag: (min: 0.0, avg: 34.2, max: 70.0) [2024-03-21 05:29:45,522][03784] Avg episode reward: [(0, '1.372')] [2024-03-21 05:29:45,860][04017] Updated weights for policy 0, policy_version 34662 (0.0015) [2024-03-21 05:29:50,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 1136001024. Throughput: 0: 47513.4. Samples: 1137189200. Policy #0 lag: (min: 0.0, avg: 34.2, max: 70.0) [2024-03-21 05:29:50,522][03784] Avg episode reward: [(0, '0.831')] [2024-03-21 05:29:52,411][04017] Updated weights for policy 0, policy_version 34672 (0.0015) [2024-03-21 05:29:55,521][03784] Fps is (10 sec: 42597.9, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 1136164864. Throughput: 0: 47677.6. Samples: 1137487500. 
Policy #0 lag: (min: 0.0, avg: 33.3, max: 76.0) [2024-03-21 05:29:55,522][03784] Avg episode reward: [(0, '1.388')] [2024-03-21 05:30:00,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1136427008. Throughput: 0: 48071.1. Samples: 1137624600. Policy #0 lag: (min: 0.0, avg: 33.3, max: 76.0) [2024-03-21 05:30:00,522][03784] Avg episode reward: [(0, '1.132')] [2024-03-21 05:30:00,532][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034681_1136427008.pth... [2024-03-21 05:30:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034335_1125089280.pth [2024-03-21 05:30:01,692][04017] Updated weights for policy 0, policy_version 34682 (0.0011) [2024-03-21 05:30:05,521][03784] Fps is (10 sec: 55706.1, 60 sec: 46967.5, 300 sec: 47874.6). Total num frames: 1136721920. Throughput: 0: 48020.1. Samples: 1137924800. Policy #0 lag: (min: 0.0, avg: 33.3, max: 76.0) [2024-03-21 05:30:05,522][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 05:30:06,464][04017] Updated weights for policy 0, policy_version 34692 (0.0011) [2024-03-21 05:30:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.4, 300 sec: 47652.5). Total num frames: 1136918528. Throughput: 0: 48324.6. Samples: 1138234300. Policy #0 lag: (min: 0.0, avg: 33.3, max: 76.0) [2024-03-21 05:30:10,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 05:30:14,529][04017] Updated weights for policy 0, policy_version 34702 (0.0020) [2024-03-21 05:30:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 47985.7). Total num frames: 1137147904. Throughput: 0: 48035.7. Samples: 1138379800. Policy #0 lag: (min: 0.0, avg: 40.7, max: 116.0) [2024-03-21 05:30:15,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 05:30:20,387][03995] Signal inference workers to stop experience collection... 
(22900 times) [2024-03-21 05:30:20,387][03995] Signal inference workers to resume experience collection... (22900 times) [2024-03-21 05:30:20,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48605.9, 300 sec: 47874.6). Total num frames: 1137377280. Throughput: 0: 47657.7. Samples: 1138650200. Policy #0 lag: (min: 0.0, avg: 40.7, max: 116.0) [2024-03-21 05:30:20,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 05:30:20,531][04017] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-03-21 05:30:20,531][04017] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-03-21 05:30:21,056][04017] Updated weights for policy 0, policy_version 34712 (0.0012) [2024-03-21 05:30:25,521][03784] Fps is (10 sec: 49151.8, 60 sec: 47513.7, 300 sec: 47652.4). Total num frames: 1137639424. Throughput: 0: 47633.4. Samples: 1138914400. Policy #0 lag: (min: 0.0, avg: 40.7, max: 116.0) [2024-03-21 05:30:25,522][03784] Avg episode reward: [(0, '0.590')] [2024-03-21 05:30:27,839][04017] Updated weights for policy 0, policy_version 34722 (0.0016) [2024-03-21 05:30:30,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 47652.4). Total num frames: 1137901568. Throughput: 0: 46759.9. Samples: 1139024900. Policy #0 lag: (min: 0.0, avg: 40.7, max: 116.0) [2024-03-21 05:30:30,522][03784] Avg episode reward: [(0, '0.821')] [2024-03-21 05:30:35,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 1138065408. Throughput: 0: 46957.7. Samples: 1139302300. Policy #0 lag: (min: 0.0, avg: 40.7, max: 116.0) [2024-03-21 05:30:35,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 05:30:36,378][04017] Updated weights for policy 0, policy_version 34732 (0.0013) [2024-03-21 05:30:40,516][04017] Updated weights for policy 0, policy_version 34742 (0.0012) [2024-03-21 05:30:40,521][03784] Fps is (10 sec: 52429.2, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 1138425856. Throughput: 0: 46215.6. 
Samples: 1139567200. Policy #0 lag: (min: 0.0, avg: 41.6, max: 106.0) [2024-03-21 05:30:40,522][03784] Avg episode reward: [(0, '0.721')] [2024-03-21 05:30:45,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 1138524160. Throughput: 0: 46622.2. Samples: 1139722600. Policy #0 lag: (min: 0.0, avg: 41.6, max: 106.0) [2024-03-21 05:30:45,522][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 05:30:49,135][04017] Updated weights for policy 0, policy_version 34752 (0.0015) [2024-03-21 05:30:50,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1138819072. Throughput: 0: 46800.0. Samples: 1140030800. Policy #0 lag: (min: 0.0, avg: 41.6, max: 106.0) [2024-03-21 05:30:50,522][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 05:30:55,521][03784] Fps is (10 sec: 52429.0, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 1139048448. Throughput: 0: 46557.8. Samples: 1140329400. Policy #0 lag: (min: 0.0, avg: 41.6, max: 106.0) [2024-03-21 05:30:55,522][03784] Avg episode reward: [(0, '0.933')] [2024-03-21 05:30:55,561][04017] Updated weights for policy 0, policy_version 34762 (0.0015) [2024-03-21 05:30:59,523][04017] Updated weights for policy 0, policy_version 34772 (0.0011) [2024-03-21 05:31:00,521][03784] Fps is (10 sec: 65536.3, 60 sec: 50790.5, 300 sec: 48541.1). Total num frames: 1139474432. Throughput: 0: 46413.4. Samples: 1140468400. Policy #0 lag: (min: 2.0, avg: 50.5, max: 115.0) [2024-03-21 05:31:00,522][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 05:31:05,521][03784] Fps is (10 sec: 65536.0, 60 sec: 49698.1, 300 sec: 48763.2). Total num frames: 1139703808. Throughput: 0: 47106.7. Samples: 1140770000. 
Policy #0 lag: (min: 2.0, avg: 50.5, max: 115.0) [2024-03-21 05:31:05,522][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 05:31:06,679][04017] Updated weights for policy 0, policy_version 34782 (0.0011) [2024-03-21 05:31:07,097][03995] Signal inference workers to stop experience collection... (22950 times) [2024-03-21 05:31:07,098][03995] Signal inference workers to resume experience collection... (22950 times) [2024-03-21 05:31:07,171][04017] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-03-21 05:31:07,171][04017] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-03-21 05:31:10,521][03784] Fps is (10 sec: 36044.4, 60 sec: 48605.8, 300 sec: 48430.0). Total num frames: 1139834880. Throughput: 0: 47704.4. Samples: 1141061100. Policy #0 lag: (min: 2.0, avg: 50.5, max: 115.0) [2024-03-21 05:31:10,522][03784] Avg episode reward: [(0, '0.786')] [2024-03-21 05:31:15,521][03784] Fps is (10 sec: 26214.5, 60 sec: 46967.5, 300 sec: 47541.4). Total num frames: 1139965952. Throughput: 0: 48446.8. Samples: 1141205000. Policy #0 lag: (min: 2.0, avg: 50.5, max: 115.0) [2024-03-21 05:31:15,522][03784] Avg episode reward: [(0, '0.886')] [2024-03-21 05:31:19,167][04017] Updated weights for policy 0, policy_version 34792 (0.0014) [2024-03-21 05:31:20,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 1140129792. Throughput: 0: 48420.0. Samples: 1141481200. Policy #0 lag: (min: 2.0, avg: 50.5, max: 115.0) [2024-03-21 05:31:20,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 05:31:23,724][04017] Updated weights for policy 0, policy_version 34802 (0.0019) [2024-03-21 05:31:25,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 1140523008. Throughput: 0: 48922.3. Samples: 1141768700. 
Policy #0 lag: (min: 3.0, avg: 40.0, max: 76.0) [2024-03-21 05:31:25,522][03784] Avg episode reward: [(0, '0.949')] [2024-03-21 05:31:30,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 1140621312. Throughput: 0: 48937.7. Samples: 1141924800. Policy #0 lag: (min: 3.0, avg: 40.0, max: 76.0) [2024-03-21 05:31:30,522][03784] Avg episode reward: [(0, '1.227')] [2024-03-21 05:31:33,155][04017] Updated weights for policy 0, policy_version 34812 (0.0013) [2024-03-21 05:31:35,521][03784] Fps is (10 sec: 26214.1, 60 sec: 45329.0, 300 sec: 46541.7). Total num frames: 1140785152. Throughput: 0: 48922.1. Samples: 1142232300. Policy #0 lag: (min: 3.0, avg: 40.0, max: 76.0) [2024-03-21 05:31:35,522][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 05:31:39,218][04017] Updated weights for policy 0, policy_version 34822 (0.0011) [2024-03-21 05:31:40,521][03784] Fps is (10 sec: 52429.4, 60 sec: 45329.1, 300 sec: 47430.3). Total num frames: 1141145600. Throughput: 0: 48615.6. Samples: 1142517100. Policy #0 lag: (min: 3.0, avg: 40.0, max: 76.0) [2024-03-21 05:31:40,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 05:31:45,521][03784] Fps is (10 sec: 55705.9, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1141342208. Throughput: 0: 48695.5. Samples: 1142659700. Policy #0 lag: (min: 3.0, avg: 40.0, max: 76.0) [2024-03-21 05:31:45,522][03784] Avg episode reward: [(0, '1.046')] [2024-03-21 05:31:46,329][04017] Updated weights for policy 0, policy_version 34832 (0.0010) [2024-03-21 05:31:50,521][03784] Fps is (10 sec: 55704.8, 60 sec: 48059.6, 300 sec: 47652.4). Total num frames: 1141702656. Throughput: 0: 48313.2. Samples: 1142944100. 
Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 05:31:50,522][03784] Avg episode reward: [(0, '1.261')] [2024-03-21 05:31:50,543][04017] Updated weights for policy 0, policy_version 34842 (0.0019) [2024-03-21 05:31:55,521][03784] Fps is (10 sec: 65537.1, 60 sec: 49152.1, 300 sec: 47763.6). Total num frames: 1141997568. Throughput: 0: 48162.4. Samples: 1143228400. Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 05:31:55,521][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 05:31:55,822][04017] Updated weights for policy 0, policy_version 34852 (0.0016) [2024-03-21 05:32:00,521][03784] Fps is (10 sec: 58983.3, 60 sec: 46967.4, 300 sec: 47874.6). Total num frames: 1142292480. Throughput: 0: 47566.7. Samples: 1143345500. Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 05:32:00,522][03784] Avg episode reward: [(0, '1.267')] [2024-03-21 05:32:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034860_1142292480.pth... [2024-03-21 05:32:00,647][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034509_1130790912.pth [2024-03-21 05:32:03,719][03995] Signal inference workers to stop experience collection... (23000 times) [2024-03-21 05:32:03,720][03995] Signal inference workers to resume experience collection... (23000 times) [2024-03-21 05:32:03,799][04017] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-03-21 05:32:03,799][04017] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-03-21 05:32:04,047][04017] Updated weights for policy 0, policy_version 34862 (0.0016) [2024-03-21 05:32:05,521][03784] Fps is (10 sec: 42597.9, 60 sec: 45329.1, 300 sec: 47985.7). Total num frames: 1142423552. Throughput: 0: 48309.0. Samples: 1143655100. 
Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 05:32:05,522][03784] Avg episode reward: [(0, '0.649')] [2024-03-21 05:32:09,518][04017] Updated weights for policy 0, policy_version 34872 (0.0011) [2024-03-21 05:32:10,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48606.0, 300 sec: 48318.9). Total num frames: 1142751232. Throughput: 0: 48433.4. Samples: 1143948200. Policy #0 lag: (min: 2.0, avg: 43.3, max: 87.0) [2024-03-21 05:32:10,522][03784] Avg episode reward: [(0, '1.120')] [2024-03-21 05:32:15,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48605.9, 300 sec: 47874.6). Total num frames: 1142882304. Throughput: 0: 48142.3. Samples: 1144091200. Policy #0 lag: (min: 2.0, avg: 43.3, max: 87.0) [2024-03-21 05:32:15,522][03784] Avg episode reward: [(0, '1.032')] [2024-03-21 05:32:17,345][04017] Updated weights for policy 0, policy_version 34882 (0.0012) [2024-03-21 05:32:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 49698.2, 300 sec: 47208.1). Total num frames: 1143111680. Throughput: 0: 47829.0. Samples: 1144384600. Policy #0 lag: (min: 2.0, avg: 43.3, max: 87.0) [2024-03-21 05:32:20,522][03784] Avg episode reward: [(0, '0.637')] [2024-03-21 05:32:25,521][03784] Fps is (10 sec: 39320.9, 60 sec: 45875.0, 300 sec: 46986.0). Total num frames: 1143275520. Throughput: 0: 48226.4. Samples: 1144687300. Policy #0 lag: (min: 2.0, avg: 43.3, max: 87.0) [2024-03-21 05:32:25,523][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 05:32:26,135][04017] Updated weights for policy 0, policy_version 34892 (0.0023) [2024-03-21 05:32:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 1143472128. Throughput: 0: 48404.4. Samples: 1144837900. 
Policy #0 lag: (min: 2.0, avg: 43.3, max: 87.0) [2024-03-21 05:32:30,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 05:32:33,725][04017] Updated weights for policy 0, policy_version 34902 (0.0020) [2024-03-21 05:32:35,521][03784] Fps is (10 sec: 52429.4, 60 sec: 50244.3, 300 sec: 46874.9). Total num frames: 1143799808. Throughput: 0: 48395.6. Samples: 1145121900. Policy #0 lag: (min: 1.0, avg: 27.0, max: 66.0) [2024-03-21 05:32:35,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 05:32:38,900][04017] Updated weights for policy 0, policy_version 34912 (0.0011) [2024-03-21 05:32:40,521][03784] Fps is (10 sec: 62259.3, 60 sec: 49151.9, 300 sec: 47319.2). Total num frames: 1144094720. Throughput: 0: 48384.3. Samples: 1145405700. Policy #0 lag: (min: 1.0, avg: 27.0, max: 66.0) [2024-03-21 05:32:40,523][03784] Avg episode reward: [(0, '1.224')] [2024-03-21 05:32:45,496][04017] Updated weights for policy 0, policy_version 34922 (0.0011) [2024-03-21 05:32:45,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 47763.5). Total num frames: 1144324096. Throughput: 0: 49313.2. Samples: 1145564600. Policy #0 lag: (min: 1.0, avg: 27.0, max: 66.0) [2024-03-21 05:32:45,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 05:32:50,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48605.9, 300 sec: 47763.5). Total num frames: 1144619008. Throughput: 0: 48775.5. Samples: 1145850000. Policy #0 lag: (min: 1.0, avg: 27.0, max: 66.0) [2024-03-21 05:32:50,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 05:32:52,167][04017] Updated weights for policy 0, policy_version 34932 (0.0012) [2024-03-21 05:32:55,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.1, 300 sec: 47319.2). Total num frames: 1144750080. Throughput: 0: 49224.3. Samples: 1146163300. 
Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 05:32:55,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 05:32:58,824][03995] Signal inference workers to stop experience collection... (23050 times) [2024-03-21 05:32:58,893][04017] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-03-21 05:32:59,052][03995] Signal inference workers to resume experience collection... (23050 times) [2024-03-21 05:32:59,053][04017] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-03-21 05:32:59,353][04017] Updated weights for policy 0, policy_version 34942 (0.0011) [2024-03-21 05:33:00,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46421.3, 300 sec: 48096.8). Total num frames: 1145077760. Throughput: 0: 49148.9. Samples: 1146302900. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 05:33:00,522][03784] Avg episode reward: [(0, '0.745')] [2024-03-21 05:33:05,521][03784] Fps is (10 sec: 52428.9, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 1145274368. Throughput: 0: 48653.3. Samples: 1146574000. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 05:33:05,522][03784] Avg episode reward: [(0, '1.334')] [2024-03-21 05:33:05,598][04017] Updated weights for policy 0, policy_version 34952 (0.0017) [2024-03-21 05:33:10,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45329.0, 300 sec: 47874.6). Total num frames: 1145470976. Throughput: 0: 48204.6. Samples: 1146856500. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 05:33:10,522][03784] Avg episode reward: [(0, '0.742')] [2024-03-21 05:33:12,937][04017] Updated weights for policy 0, policy_version 34962 (0.0012) [2024-03-21 05:33:15,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 47652.4). Total num frames: 1145765888. Throughput: 0: 47944.5. Samples: 1146995400. 
Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 05:33:15,522][03784] Avg episode reward: [(0, '1.414')] [2024-03-21 05:33:19,678][04017] Updated weights for policy 0, policy_version 34972 (0.0011) [2024-03-21 05:33:20,521][03784] Fps is (10 sec: 52429.2, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 1145995264. Throughput: 0: 48193.4. Samples: 1147290600. Policy #0 lag: (min: 0.0, avg: 41.9, max: 90.0) [2024-03-21 05:33:20,522][03784] Avg episode reward: [(0, '0.946')] [2024-03-21 05:33:25,521][03784] Fps is (10 sec: 36045.3, 60 sec: 47513.8, 300 sec: 47319.2). Total num frames: 1146126336. Throughput: 0: 48137.9. Samples: 1147571900. Policy #0 lag: (min: 0.0, avg: 41.9, max: 90.0) [2024-03-21 05:33:25,521][03784] Avg episode reward: [(0, '1.298')] [2024-03-21 05:33:27,274][04017] Updated weights for policy 0, policy_version 34982 (0.0019) [2024-03-21 05:33:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 49698.2, 300 sec: 47541.4). Total num frames: 1146454016. Throughput: 0: 47397.8. Samples: 1147697500. Policy #0 lag: (min: 0.0, avg: 41.9, max: 90.0) [2024-03-21 05:33:30,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 05:33:33,303][04017] Updated weights for policy 0, policy_version 34992 (0.0015) [2024-03-21 05:33:35,521][03784] Fps is (10 sec: 58981.3, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 1146716160. Throughput: 0: 46951.1. Samples: 1147962800. Policy #0 lag: (min: 0.0, avg: 41.9, max: 90.0) [2024-03-21 05:33:35,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 05:33:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 1146912768. Throughput: 0: 46511.2. Samples: 1148256300. 
Policy #0 lag: (min: 0.0, avg: 41.9, max: 90.0) [2024-03-21 05:33:40,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 05:33:42,314][04017] Updated weights for policy 0, policy_version 35003 (0.0015) [2024-03-21 05:33:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 1147076608. Throughput: 0: 46484.3. Samples: 1148394700. Policy #0 lag: (min: 0.0, avg: 47.6, max: 109.0) [2024-03-21 05:33:45,522][03784] Avg episode reward: [(0, '0.498')] [2024-03-21 05:33:47,224][03995] Signal inference workers to stop experience collection... (23100 times) [2024-03-21 05:33:47,225][03995] Signal inference workers to resume experience collection... (23100 times) [2024-03-21 05:33:47,275][04017] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-03-21 05:33:47,275][04017] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-03-21 05:33:49,461][04017] Updated weights for policy 0, policy_version 35013 (0.0015) [2024-03-21 05:33:50,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 1147338752. Throughput: 0: 46846.6. Samples: 1148682100. Policy #0 lag: (min: 0.0, avg: 47.6, max: 109.0) [2024-03-21 05:33:50,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 05:33:53,953][04017] Updated weights for policy 0, policy_version 35023 (0.0010) [2024-03-21 05:33:55,521][03784] Fps is (10 sec: 55706.1, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 1147633664. Throughput: 0: 46933.4. Samples: 1148968500. Policy #0 lag: (min: 0.0, avg: 47.6, max: 109.0) [2024-03-21 05:33:55,522][03784] Avg episode reward: [(0, '1.348')] [2024-03-21 05:34:00,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 1147895808. Throughput: 0: 46933.3. Samples: 1149107400. 
Policy #0 lag: (min: 0.0, avg: 47.6, max: 109.0) [2024-03-21 05:34:00,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 05:34:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035031_1147895808.pth... [2024-03-21 05:34:00,663][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034681_1136427008.pth [2024-03-21 05:34:03,009][04017] Updated weights for policy 0, policy_version 35033 (0.0015) [2024-03-21 05:34:05,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1148092416. Throughput: 0: 46922.2. Samples: 1149402100. Policy #0 lag: (min: 0.0, avg: 47.6, max: 109.0) [2024-03-21 05:34:05,522][03784] Avg episode reward: [(0, '0.732')] [2024-03-21 05:34:10,038][04017] Updated weights for policy 0, policy_version 35043 (0.0010) [2024-03-21 05:34:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 1148321792. Throughput: 0: 47268.8. Samples: 1149699000. Policy #0 lag: (min: 0.0, avg: 51.5, max: 115.0) [2024-03-21 05:34:10,522][03784] Avg episode reward: [(0, '1.206')] [2024-03-21 05:34:15,521][03784] Fps is (10 sec: 49151.8, 60 sec: 46967.5, 300 sec: 47874.6). Total num frames: 1148583936. Throughput: 0: 47286.6. Samples: 1149825400. Policy #0 lag: (min: 0.0, avg: 51.5, max: 115.0) [2024-03-21 05:34:15,522][03784] Avg episode reward: [(0, '1.206')] [2024-03-21 05:34:15,860][04017] Updated weights for policy 0, policy_version 35053 (0.0015) [2024-03-21 05:34:20,521][03784] Fps is (10 sec: 52428.6, 60 sec: 47513.6, 300 sec: 47652.5). Total num frames: 1148846080. Throughput: 0: 47713.4. Samples: 1150109900. Policy #0 lag: (min: 0.0, avg: 51.5, max: 115.0) [2024-03-21 05:34:20,522][03784] Avg episode reward: [(0, '0.495')] [2024-03-21 05:34:24,274][04017] Updated weights for policy 0, policy_version 35063 (0.0011) [2024-03-21 05:34:25,521][03784] Fps is (10 sec: 45874.6, 60 sec: 48605.7, 300 sec: 47763.5). 
Total num frames: 1149042688. Throughput: 0: 47079.8. Samples: 1150374900. Policy #0 lag: (min: 0.0, avg: 51.5, max: 115.0) [2024-03-21 05:34:25,522][03784] Avg episode reward: [(0, '1.157')] [2024-03-21 05:34:30,260][04017] Updated weights for policy 0, policy_version 35073 (0.0011) [2024-03-21 05:34:30,521][03784] Fps is (10 sec: 42598.0, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 1149272064. Throughput: 0: 47351.0. Samples: 1150525500. Policy #0 lag: (min: 1.0, avg: 41.8, max: 114.0) [2024-03-21 05:34:30,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 05:34:34,774][04017] Updated weights for policy 0, policy_version 35083 (0.0012) [2024-03-21 05:34:35,019][03995] Signal inference workers to stop experience collection... (23150 times) [2024-03-21 05:34:35,103][03995] Signal inference workers to resume experience collection... (23150 times) [2024-03-21 05:34:35,117][04017] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-03-21 05:34:35,173][04017] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-03-21 05:34:35,521][03784] Fps is (10 sec: 62260.3, 60 sec: 49152.1, 300 sec: 47652.5). Total num frames: 1149665280. Throughput: 0: 47289.0. Samples: 1150810100. Policy #0 lag: (min: 1.0, avg: 41.8, max: 114.0) [2024-03-21 05:34:35,522][03784] Avg episode reward: [(0, '0.932')] [2024-03-21 05:34:40,521][03784] Fps is (10 sec: 49152.0, 60 sec: 47513.5, 300 sec: 47541.3). Total num frames: 1149763584. Throughput: 0: 47810.9. Samples: 1151120000. Policy #0 lag: (min: 1.0, avg: 41.8, max: 114.0) [2024-03-21 05:34:40,522][03784] Avg episode reward: [(0, '0.598')] [2024-03-21 05:34:43,144][04017] Updated weights for policy 0, policy_version 35093 (0.0010) [2024-03-21 05:34:45,521][03784] Fps is (10 sec: 29490.7, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 1149960192. Throughput: 0: 47873.3. Samples: 1151261700. 
Policy #0 lag: (min: 1.0, avg: 41.8, max: 114.0) [2024-03-21 05:34:45,522][03784] Avg episode reward: [(0, '0.598')] [2024-03-21 05:34:50,521][03784] Fps is (10 sec: 42599.1, 60 sec: 47513.7, 300 sec: 47541.4). Total num frames: 1150189568. Throughput: 0: 47135.6. Samples: 1151523200. Policy #0 lag: (min: 1.0, avg: 41.8, max: 114.0) [2024-03-21 05:34:50,522][03784] Avg episode reward: [(0, '1.349')] [2024-03-21 05:34:52,692][04017] Updated weights for policy 0, policy_version 35103 (0.0013) [2024-03-21 05:34:55,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 1150451712. Throughput: 0: 47022.2. Samples: 1151815000. Policy #0 lag: (min: 1.0, avg: 28.5, max: 62.0) [2024-03-21 05:34:55,522][03784] Avg episode reward: [(0, '1.382')] [2024-03-21 05:35:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 43690.7, 300 sec: 46763.8). Total num frames: 1150517248. Throughput: 0: 47846.7. Samples: 1151978500. Policy #0 lag: (min: 1.0, avg: 28.5, max: 62.0) [2024-03-21 05:35:00,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 05:35:00,909][04017] Updated weights for policy 0, policy_version 35113 (0.0011) [2024-03-21 05:35:05,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.2, 300 sec: 47208.1). Total num frames: 1150844928. Throughput: 0: 47831.1. Samples: 1152262300. Policy #0 lag: (min: 1.0, avg: 28.5, max: 62.0) [2024-03-21 05:35:05,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 05:35:06,676][04017] Updated weights for policy 0, policy_version 35123 (0.0012) [2024-03-21 05:35:10,521][03784] Fps is (10 sec: 62259.4, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1151139840. Throughput: 0: 47771.4. Samples: 1152524600. 
Policy #0 lag: (min: 1.0, avg: 28.5, max: 62.0) [2024-03-21 05:35:10,521][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 05:35:12,788][04017] Updated weights for policy 0, policy_version 35133 (0.0016) [2024-03-21 05:35:15,521][03784] Fps is (10 sec: 52428.6, 60 sec: 46421.3, 300 sec: 47430.3). Total num frames: 1151369216. Throughput: 0: 47464.5. Samples: 1152661400. Policy #0 lag: (min: 1.0, avg: 28.5, max: 62.0) [2024-03-21 05:35:15,522][03784] Avg episode reward: [(0, '1.170')] [2024-03-21 05:35:18,381][04017] Updated weights for policy 0, policy_version 35143 (0.0011) [2024-03-21 05:35:20,522][03784] Fps is (10 sec: 55702.6, 60 sec: 47513.3, 300 sec: 47652.4). Total num frames: 1151696896. Throughput: 0: 47381.7. Samples: 1152942300. Policy #0 lag: (min: 0.0, avg: 49.4, max: 108.0) [2024-03-21 05:35:20,522][03784] Avg episode reward: [(0, '0.758')] [2024-03-21 05:35:24,811][04017] Updated weights for policy 0, policy_version 35153 (0.0023) [2024-03-21 05:35:25,521][03784] Fps is (10 sec: 58982.5, 60 sec: 48605.9, 300 sec: 47652.4). Total num frames: 1151959040. Throughput: 0: 46953.4. Samples: 1153232900. Policy #0 lag: (min: 0.0, avg: 49.4, max: 108.0) [2024-03-21 05:35:25,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 05:35:29,256][03995] Signal inference workers to stop experience collection... (23200 times) [2024-03-21 05:35:29,257][03995] Signal inference workers to resume experience collection... (23200 times) [2024-03-21 05:35:29,323][04017] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-03-21 05:35:29,323][04017] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-03-21 05:35:29,861][04017] Updated weights for policy 0, policy_version 35163 (0.0015) [2024-03-21 05:35:30,521][03784] Fps is (10 sec: 55708.0, 60 sec: 49698.2, 300 sec: 48096.8). Total num frames: 1152253952. Throughput: 0: 46989.0. Samples: 1153376200. 
Policy #0 lag: (min: 0.0, avg: 49.4, max: 108.0) [2024-03-21 05:35:30,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 05:35:35,521][03784] Fps is (10 sec: 52429.2, 60 sec: 46967.5, 300 sec: 47652.4). Total num frames: 1152483328. Throughput: 0: 47675.5. Samples: 1153668600. Policy #0 lag: (min: 0.0, avg: 49.4, max: 108.0) [2024-03-21 05:35:35,522][03784] Avg episode reward: [(0, '0.955')] [2024-03-21 05:35:38,613][04017] Updated weights for policy 0, policy_version 35173 (0.0011) [2024-03-21 05:35:40,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46421.4, 300 sec: 47541.4). Total num frames: 1152548864. Throughput: 0: 48009.0. Samples: 1153975400. Policy #0 lag: (min: 1.0, avg: 30.4, max: 65.0) [2024-03-21 05:35:40,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 05:35:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 48059.9, 300 sec: 47541.4). Total num frames: 1152843776. Throughput: 0: 47235.6. Samples: 1154104100. Policy #0 lag: (min: 1.0, avg: 30.4, max: 65.0) [2024-03-21 05:35:45,522][03784] Avg episode reward: [(0, '0.739')] [2024-03-21 05:35:45,679][04017] Updated weights for policy 0, policy_version 35183 (0.0012) [2024-03-21 05:35:50,521][03784] Fps is (10 sec: 62258.4, 60 sec: 49698.0, 300 sec: 47874.6). Total num frames: 1153171456. Throughput: 0: 47586.6. Samples: 1154403700. Policy #0 lag: (min: 1.0, avg: 30.4, max: 65.0) [2024-03-21 05:35:50,522][03784] Avg episode reward: [(0, '0.739')] [2024-03-21 05:35:50,816][04017] Updated weights for policy 0, policy_version 35193 (0.0011) [2024-03-21 05:35:55,521][03784] Fps is (10 sec: 45875.0, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 1153302528. Throughput: 0: 48537.7. Samples: 1154708800. 
Policy #0 lag: (min: 1.0, avg: 30.4, max: 65.0) [2024-03-21 05:35:55,522][03784] Avg episode reward: [(0, '1.424')] [2024-03-21 05:35:58,198][04017] Updated weights for policy 0, policy_version 35203 (0.0020) [2024-03-21 05:36:00,521][03784] Fps is (10 sec: 49152.5, 60 sec: 52428.7, 300 sec: 47319.2). Total num frames: 1153662976. Throughput: 0: 48620.1. Samples: 1154849300. Policy #0 lag: (min: 1.0, avg: 30.4, max: 65.0) [2024-03-21 05:36:00,522][03784] Avg episode reward: [(0, '1.424')] [2024-03-21 05:36:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035207_1153662976.pth... [2024-03-21 05:36:00,658][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000034860_1142292480.pth [2024-03-21 05:36:05,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 1153826816. Throughput: 0: 48947.2. Samples: 1155144900. Policy #0 lag: (min: 0.0, avg: 47.1, max: 98.0) [2024-03-21 05:36:05,522][03784] Avg episode reward: [(0, '1.487')] [2024-03-21 05:36:07,432][04017] Updated weights for policy 0, policy_version 35213 (0.0011) [2024-03-21 05:36:10,521][03784] Fps is (10 sec: 32767.7, 60 sec: 47513.5, 300 sec: 47541.3). Total num frames: 1153990656. Throughput: 0: 48735.5. Samples: 1155426000. Policy #0 lag: (min: 0.0, avg: 47.1, max: 98.0) [2024-03-21 05:36:10,522][03784] Avg episode reward: [(0, '1.636')] [2024-03-21 05:36:13,655][04017] Updated weights for policy 0, policy_version 35223 (0.0012) [2024-03-21 05:36:15,521][03784] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 48207.8). Total num frames: 1154351104. Throughput: 0: 48671.1. Samples: 1155566400. Policy #0 lag: (min: 0.0, avg: 47.1, max: 98.0) [2024-03-21 05:36:15,522][03784] Avg episode reward: [(0, '1.018')] [2024-03-21 05:36:20,262][04017] Updated weights for policy 0, policy_version 35233 (0.0014) [2024-03-21 05:36:20,521][03784] Fps is (10 sec: 52429.3, 60 sec: 46967.8, 300 sec: 47430.3). 
Total num frames: 1154514944. Throughput: 0: 48537.8. Samples: 1155852800. Policy #0 lag: (min: 0.0, avg: 47.1, max: 98.0) [2024-03-21 05:36:20,522][03784] Avg episode reward: [(0, '1.459')] [2024-03-21 05:36:25,521][03784] Fps is (10 sec: 39321.9, 60 sec: 46421.4, 300 sec: 47874.6). Total num frames: 1154744320. Throughput: 0: 47422.2. Samples: 1156109400. Policy #0 lag: (min: 0.0, avg: 47.1, max: 98.0) [2024-03-21 05:36:25,522][03784] Avg episode reward: [(0, '0.757')] [2024-03-21 05:36:28,733][03995] Signal inference workers to stop experience collection... (23250 times) [2024-03-21 05:36:28,749][04017] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-03-21 05:36:28,967][03995] Signal inference workers to resume experience collection... (23250 times) [2024-03-21 05:36:28,967][04017] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-03-21 05:36:29,237][04017] Updated weights for policy 0, policy_version 35243 (0.0014) [2024-03-21 05:36:30,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 47874.6). Total num frames: 1154908160. Throughput: 0: 48102.2. Samples: 1156268700. Policy #0 lag: (min: 0.0, avg: 38.0, max: 89.0) [2024-03-21 05:36:30,522][03784] Avg episode reward: [(0, '1.046')] [2024-03-21 05:36:34,967][04017] Updated weights for policy 0, policy_version 35253 (0.0033) [2024-03-21 05:36:35,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 47652.5). Total num frames: 1155203072. Throughput: 0: 47891.3. Samples: 1156558800. Policy #0 lag: (min: 0.0, avg: 38.0, max: 89.0) [2024-03-21 05:36:35,522][03784] Avg episode reward: [(0, '1.010')] [2024-03-21 05:36:39,723][04017] Updated weights for policy 0, policy_version 35263 (0.0009) [2024-03-21 05:36:40,521][03784] Fps is (10 sec: 62258.9, 60 sec: 49698.1, 300 sec: 48096.8). Total num frames: 1155530752. Throughput: 0: 47173.3. Samples: 1156831600. 
Policy #0 lag: (min: 0.0, avg: 38.0, max: 89.0) [2024-03-21 05:36:40,522][03784] Avg episode reward: [(0, '1.497')] [2024-03-21 05:36:45,190][04017] Updated weights for policy 0, policy_version 35273 (0.0016) [2024-03-21 05:36:45,521][03784] Fps is (10 sec: 62258.7, 60 sec: 49698.1, 300 sec: 47874.6). Total num frames: 1155825664. Throughput: 0: 47320.0. Samples: 1156978700. Policy #0 lag: (min: 0.0, avg: 38.0, max: 89.0) [2024-03-21 05:36:45,522][03784] Avg episode reward: [(0, '1.436')] [2024-03-21 05:36:50,521][03784] Fps is (10 sec: 49152.3, 60 sec: 47513.7, 300 sec: 47541.4). Total num frames: 1156022272. Throughput: 0: 47120.1. Samples: 1157265300. Policy #0 lag: (min: 0.0, avg: 38.0, max: 89.0) [2024-03-21 05:36:50,521][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 05:36:53,710][04017] Updated weights for policy 0, policy_version 35283 (0.0017) [2024-03-21 05:36:55,521][03784] Fps is (10 sec: 39321.4, 60 sec: 48605.8, 300 sec: 47208.1). Total num frames: 1156218880. Throughput: 0: 47053.4. Samples: 1157543400. Policy #0 lag: (min: 0.0, avg: 38.1, max: 72.0) [2024-03-21 05:36:55,522][03784] Avg episode reward: [(0, '1.198')] [2024-03-21 05:37:00,521][03784] Fps is (10 sec: 42597.4, 60 sec: 46421.2, 300 sec: 47541.3). Total num frames: 1156448256. Throughput: 0: 46993.2. Samples: 1157681100. Policy #0 lag: (min: 0.0, avg: 38.1, max: 72.0) [2024-03-21 05:37:00,522][03784] Avg episode reward: [(0, '0.867')] [2024-03-21 05:37:00,862][04017] Updated weights for policy 0, policy_version 35293 (0.0011) [2024-03-21 05:37:05,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 47319.2). Total num frames: 1156710400. Throughput: 0: 47622.2. Samples: 1157995800. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 72.0) [2024-03-21 05:37:05,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 05:37:06,251][04017] Updated weights for policy 0, policy_version 35303 (0.0010) [2024-03-21 05:37:09,380][03995] Signal inference workers to stop experience collection... (23300 times) [2024-03-21 05:37:09,434][04017] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-03-21 05:37:09,604][03995] Signal inference workers to resume experience collection... (23300 times) [2024-03-21 05:37:09,604][04017] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-03-21 05:37:10,521][03784] Fps is (10 sec: 58983.0, 60 sec: 50790.4, 300 sec: 47985.7). Total num frames: 1157038080. Throughput: 0: 47806.6. Samples: 1158260700. Policy #0 lag: (min: 0.0, avg: 38.1, max: 72.0) [2024-03-21 05:37:10,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 05:37:11,278][04017] Updated weights for policy 0, policy_version 35313 (0.0012) [2024-03-21 05:37:15,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48605.9, 300 sec: 47985.7). Total num frames: 1157267456. Throughput: 0: 47691.0. Samples: 1158414800. Policy #0 lag: (min: 0.0, avg: 38.1, max: 72.0) [2024-03-21 05:37:15,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 05:37:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48059.6, 300 sec: 47874.6). Total num frames: 1157398528. Throughput: 0: 48008.7. Samples: 1158719200. Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 05:37:20,522][03784] Avg episode reward: [(0, '0.990')] [2024-03-21 05:37:21,090][04017] Updated weights for policy 0, policy_version 35323 (0.0011) [2024-03-21 05:37:25,521][03784] Fps is (10 sec: 39321.9, 60 sec: 48605.9, 300 sec: 48096.8). Total num frames: 1157660672. Throughput: 0: 48806.7. Samples: 1159027900. 
Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 05:37:25,522][03784] Avg episode reward: [(0, '1.585')] [2024-03-21 05:37:29,587][04017] Updated weights for policy 0, policy_version 35333 (0.0014) [2024-03-21 05:37:30,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49151.9, 300 sec: 47652.4). Total num frames: 1157857280. Throughput: 0: 48739.9. Samples: 1159172000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 05:37:30,522][03784] Avg episode reward: [(0, '1.585')] [2024-03-21 05:37:35,521][03784] Fps is (10 sec: 42598.1, 60 sec: 48059.6, 300 sec: 47430.3). Total num frames: 1158086656. Throughput: 0: 48688.8. Samples: 1159456300. Policy #0 lag: (min: 0.0, avg: 40.6, max: 78.0) [2024-03-21 05:37:35,522][03784] Avg episode reward: [(0, '1.478')] [2024-03-21 05:37:36,590][04017] Updated weights for policy 0, policy_version 35343 (0.0011) [2024-03-21 05:37:40,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 1158348800. Throughput: 0: 48277.7. Samples: 1159715900. Policy #0 lag: (min: 0.0, avg: 42.8, max: 96.0) [2024-03-21 05:37:40,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 05:37:42,615][04017] Updated weights for policy 0, policy_version 35353 (0.0023) [2024-03-21 05:37:45,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 1158479872. Throughput: 0: 48271.3. Samples: 1159853300. Policy #0 lag: (min: 0.0, avg: 42.8, max: 96.0) [2024-03-21 05:37:45,522][03784] Avg episode reward: [(0, '1.082')] [2024-03-21 05:37:50,002][04017] Updated weights for policy 0, policy_version 35363 (0.0011) [2024-03-21 05:37:50,521][03784] Fps is (10 sec: 45875.9, 60 sec: 46421.3, 300 sec: 47652.5). Total num frames: 1158807552. Throughput: 0: 48140.1. Samples: 1160162100. 
Policy #0 lag: (min: 0.0, avg: 42.8, max: 96.0) [2024-03-21 05:37:50,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 05:37:55,521][03784] Fps is (10 sec: 58981.8, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1159069696. Throughput: 0: 48586.7. Samples: 1160447100. Policy #0 lag: (min: 0.0, avg: 42.8, max: 96.0) [2024-03-21 05:37:55,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 05:37:55,681][04017] Updated weights for policy 0, policy_version 35373 (0.0011) [2024-03-21 05:38:00,521][03784] Fps is (10 sec: 58982.1, 60 sec: 49152.1, 300 sec: 47874.6). Total num frames: 1159397376. Throughput: 0: 47728.9. Samples: 1160562600. Policy #0 lag: (min: 0.0, avg: 42.8, max: 96.0) [2024-03-21 05:38:00,522][03784] Avg episode reward: [(0, '0.861')] [2024-03-21 05:38:00,540][04017] Updated weights for policy 0, policy_version 35383 (0.0017) [2024-03-21 05:38:00,876][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035384_1159462912.pth... [2024-03-21 05:38:01,002][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035031_1147895808.pth [2024-03-21 05:38:02,111][03995] Signal inference workers to stop experience collection... (23350 times) [2024-03-21 05:38:02,111][03995] Signal inference workers to resume experience collection... (23350 times) [2024-03-21 05:38:02,169][04017] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-03-21 05:38:02,169][04017] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-03-21 05:38:05,521][03784] Fps is (10 sec: 65535.6, 60 sec: 50244.2, 300 sec: 48318.9). Total num frames: 1159725056. Throughput: 0: 47211.1. Samples: 1160843700. 
Policy #0 lag: (min: 1.0, avg: 38.6, max: 66.0) [2024-03-21 05:38:05,522][03784] Avg episode reward: [(0, '0.703')] [2024-03-21 05:38:05,746][04017] Updated weights for policy 0, policy_version 35393 (0.0014) [2024-03-21 05:38:10,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 1159856128. Throughput: 0: 47199.9. Samples: 1161151900. Policy #0 lag: (min: 1.0, avg: 38.6, max: 66.0) [2024-03-21 05:38:10,522][03784] Avg episode reward: [(0, '0.796')] [2024-03-21 05:38:13,187][04017] Updated weights for policy 0, policy_version 35403 (0.0015) [2024-03-21 05:38:15,521][03784] Fps is (10 sec: 39322.3, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 1160118272. Throughput: 0: 47004.6. Samples: 1161287200. Policy #0 lag: (min: 1.0, avg: 38.6, max: 66.0) [2024-03-21 05:38:15,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 05:38:20,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48605.8, 300 sec: 48096.7). Total num frames: 1160314880. Throughput: 0: 47228.8. Samples: 1161581600. Policy #0 lag: (min: 1.0, avg: 38.6, max: 66.0) [2024-03-21 05:38:20,522][03784] Avg episode reward: [(0, '0.932')] [2024-03-21 05:38:21,456][04017] Updated weights for policy 0, policy_version 35413 (0.0016) [2024-03-21 05:38:25,521][03784] Fps is (10 sec: 45874.6, 60 sec: 48605.8, 300 sec: 47874.6). Total num frames: 1160577024. Throughput: 0: 47551.2. Samples: 1161855700. Policy #0 lag: (min: 1.0, avg: 38.6, max: 66.0) [2024-03-21 05:38:25,522][03784] Avg episode reward: [(0, '0.940')] [2024-03-21 05:38:30,521][03784] Fps is (10 sec: 32768.6, 60 sec: 46421.4, 300 sec: 47208.2). Total num frames: 1160642560. Throughput: 0: 47593.3. Samples: 1161995000. 
Policy #0 lag: (min: 0.0, avg: 35.2, max: 76.0) [2024-03-21 05:38:30,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 05:38:32,315][04017] Updated weights for policy 0, policy_version 35423 (0.0012) [2024-03-21 05:38:35,521][03784] Fps is (10 sec: 26214.7, 60 sec: 45875.3, 300 sec: 47208.1). Total num frames: 1160839168. Throughput: 0: 47193.4. Samples: 1162285800. Policy #0 lag: (min: 0.0, avg: 35.2, max: 76.0) [2024-03-21 05:38:35,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 05:38:40,050][04017] Updated weights for policy 0, policy_version 35433 (0.0012) [2024-03-21 05:38:40,521][03784] Fps is (10 sec: 45874.5, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1161101312. Throughput: 0: 47197.7. Samples: 1162571000. Policy #0 lag: (min: 0.0, avg: 35.2, max: 76.0) [2024-03-21 05:38:40,522][03784] Avg episode reward: [(0, '0.638')] [2024-03-21 05:38:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1161330688. Throughput: 0: 47457.9. Samples: 1162698200. Policy #0 lag: (min: 0.0, avg: 35.2, max: 76.0) [2024-03-21 05:38:45,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 05:38:46,218][04017] Updated weights for policy 0, policy_version 35443 (0.0019) [2024-03-21 05:38:50,521][03784] Fps is (10 sec: 52429.5, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 1161625600. Throughput: 0: 47355.7. Samples: 1162974700. Policy #0 lag: (min: 0.0, avg: 35.2, max: 76.0) [2024-03-21 05:38:50,522][03784] Avg episode reward: [(0, '1.061')] [2024-03-21 05:38:51,647][04017] Updated weights for policy 0, policy_version 35453 (0.0011) [2024-03-21 05:38:53,611][03995] Signal inference workers to stop experience collection... (23400 times) [2024-03-21 05:38:53,679][03995] Signal inference workers to resume experience collection... 
(23400 times) [2024-03-21 05:38:53,712][04017] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-03-21 05:38:53,757][04017] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-03-21 05:38:55,521][03784] Fps is (10 sec: 62259.2, 60 sec: 48059.8, 300 sec: 47652.5). Total num frames: 1161953280. Throughput: 0: 46602.4. Samples: 1163249000. Policy #0 lag: (min: 0.0, avg: 25.2, max: 46.0) [2024-03-21 05:38:55,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 05:38:56,313][04017] Updated weights for policy 0, policy_version 35463 (0.0013) [2024-03-21 05:39:00,521][03784] Fps is (10 sec: 65534.9, 60 sec: 48059.6, 300 sec: 48096.7). Total num frames: 1162280960. Throughput: 0: 46644.2. Samples: 1163386200. Policy #0 lag: (min: 0.0, avg: 25.2, max: 46.0) [2024-03-21 05:39:00,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 05:39:01,394][04017] Updated weights for policy 0, policy_version 35473 (0.0017) [2024-03-21 05:39:05,521][03784] Fps is (10 sec: 45874.6, 60 sec: 44783.0, 300 sec: 47763.5). Total num frames: 1162412032. Throughput: 0: 46769.0. Samples: 1163686200. Policy #0 lag: (min: 0.0, avg: 25.2, max: 46.0) [2024-03-21 05:39:05,522][03784] Avg episode reward: [(0, '1.359')] [2024-03-21 05:39:10,521][03784] Fps is (10 sec: 39322.3, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 1162674176. Throughput: 0: 46615.7. Samples: 1163953400. Policy #0 lag: (min: 0.0, avg: 25.2, max: 46.0) [2024-03-21 05:39:10,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 05:39:11,600][04017] Updated weights for policy 0, policy_version 35483 (0.0020) [2024-03-21 05:39:15,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.2, 300 sec: 47652.4). Total num frames: 1162903552. Throughput: 0: 46499.9. Samples: 1164087500. 
Policy #0 lag: (min: 0.0, avg: 41.3, max: 72.0) [2024-03-21 05:39:15,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 05:39:17,727][04017] Updated weights for policy 0, policy_version 35493 (0.0024) [2024-03-21 05:39:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.6, 300 sec: 47763.5). Total num frames: 1163132928. Throughput: 0: 46142.2. Samples: 1164362200. Policy #0 lag: (min: 0.0, avg: 41.3, max: 72.0) [2024-03-21 05:39:20,522][03784] Avg episode reward: [(0, '1.541')] [2024-03-21 05:39:25,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 47652.5). Total num frames: 1163329536. Throughput: 0: 45868.9. Samples: 1164635100. Policy #0 lag: (min: 0.0, avg: 41.3, max: 72.0) [2024-03-21 05:39:25,522][03784] Avg episode reward: [(0, '1.418')] [2024-03-21 05:39:25,630][04017] Updated weights for policy 0, policy_version 35503 (0.0011) [2024-03-21 05:39:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 48605.8, 300 sec: 47097.0). Total num frames: 1163558912. Throughput: 0: 46386.5. Samples: 1164785600. Policy #0 lag: (min: 0.0, avg: 41.3, max: 72.0) [2024-03-21 05:39:30,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 05:39:34,521][04017] Updated weights for policy 0, policy_version 35513 (0.0011) [2024-03-21 05:39:35,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 1163755520. Throughput: 0: 47008.9. Samples: 1165090100. Policy #0 lag: (min: 0.0, avg: 41.3, max: 72.0) [2024-03-21 05:39:35,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 05:39:40,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.7, 300 sec: 47430.3). Total num frames: 1163952128. Throughput: 0: 47148.8. Samples: 1165370700. 
Policy #0 lag: (min: 0.0, avg: 33.3, max: 95.0) [2024-03-21 05:39:40,522][03784] Avg episode reward: [(0, '1.395')] [2024-03-21 05:39:41,163][04017] Updated weights for policy 0, policy_version 35523 (0.0021) [2024-03-21 05:39:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.5, 300 sec: 47430.3). Total num frames: 1164181504. Throughput: 0: 47006.8. Samples: 1165501500. Policy #0 lag: (min: 0.0, avg: 33.3, max: 95.0) [2024-03-21 05:39:45,522][03784] Avg episode reward: [(0, '1.464')] [2024-03-21 05:39:47,531][04017] Updated weights for policy 0, policy_version 35533 (0.0022) [2024-03-21 05:39:50,398][03995] Signal inference workers to stop experience collection... (23450 times) [2024-03-21 05:39:50,471][03995] Signal inference workers to resume experience collection... (23450 times) [2024-03-21 05:39:50,491][04017] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-03-21 05:39:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 47513.5, 300 sec: 47541.4). Total num frames: 1164476416. Throughput: 0: 46122.2. Samples: 1165761700. Policy #0 lag: (min: 0.0, avg: 33.3, max: 95.0) [2024-03-21 05:39:50,522][03784] Avg episode reward: [(0, '1.037')] [2024-03-21 05:39:50,537][04017] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-03-21 05:39:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 44782.8, 300 sec: 47874.6). Total num frames: 1164640256. Throughput: 0: 46859.9. Samples: 1166062100. Policy #0 lag: (min: 0.0, avg: 33.3, max: 95.0) [2024-03-21 05:39:55,522][03784] Avg episode reward: [(0, '0.583')] [2024-03-21 05:39:55,755][04017] Updated weights for policy 0, policy_version 35543 (0.0015) [2024-03-21 05:40:00,512][04017] Updated weights for policy 0, policy_version 35553 (0.0024) [2024-03-21 05:40:00,521][03784] Fps is (10 sec: 52429.5, 60 sec: 45329.2, 300 sec: 47985.7). Total num frames: 1165000704. Throughput: 0: 46920.1. Samples: 1166198900. 
Policy #0 lag: (min: 0.0, avg: 33.3, max: 95.0) [2024-03-21 05:40:00,522][03784] Avg episode reward: [(0, '0.924')] [2024-03-21 05:40:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035553_1165000704.pth... [2024-03-21 05:40:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035207_1153662976.pth [2024-03-21 05:40:05,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46967.4, 300 sec: 47763.5). Total num frames: 1165230080. Throughput: 0: 47486.6. Samples: 1166499100. Policy #0 lag: (min: 0.0, avg: 58.3, max: 123.0) [2024-03-21 05:40:05,522][03784] Avg episode reward: [(0, '0.941')] [2024-03-21 05:40:06,398][04017] Updated weights for policy 0, policy_version 35563 (0.0012) [2024-03-21 05:40:10,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 1165557760. Throughput: 0: 47464.6. Samples: 1166771000. Policy #0 lag: (min: 0.0, avg: 58.3, max: 123.0) [2024-03-21 05:40:10,522][03784] Avg episode reward: [(0, '1.521')] [2024-03-21 05:40:12,431][04017] Updated weights for policy 0, policy_version 35573 (0.0010) [2024-03-21 05:40:15,521][03784] Fps is (10 sec: 58982.7, 60 sec: 48605.9, 300 sec: 47874.7). Total num frames: 1165819904. Throughput: 0: 47444.5. Samples: 1166920600. Policy #0 lag: (min: 0.0, avg: 58.3, max: 123.0) [2024-03-21 05:40:15,522][03784] Avg episode reward: [(0, '1.804')] [2024-03-21 05:40:15,523][03995] Saving new best policy, reward=1.804! [2024-03-21 05:40:20,339][04017] Updated weights for policy 0, policy_version 35583 (0.0011) [2024-03-21 05:40:20,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 1165983744. Throughput: 0: 47588.9. Samples: 1167231600. Policy #0 lag: (min: 0.0, avg: 58.3, max: 123.0) [2024-03-21 05:40:20,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 05:40:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 48059.8, 300 sec: 47319.2). 
Total num frames: 1166213120. Throughput: 0: 48300.0. Samples: 1167544200. Policy #0 lag: (min: 0.0, avg: 37.1, max: 77.0) [2024-03-21 05:40:25,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 05:40:28,123][04017] Updated weights for policy 0, policy_version 35593 (0.0015) [2024-03-21 05:40:30,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 1166376960. Throughput: 0: 48846.7. Samples: 1167699600. Policy #0 lag: (min: 0.0, avg: 37.1, max: 77.0) [2024-03-21 05:40:30,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 05:40:33,101][04017] Updated weights for policy 0, policy_version 35603 (0.0014) [2024-03-21 05:40:35,521][03784] Fps is (10 sec: 58982.3, 60 sec: 50790.4, 300 sec: 48318.9). Total num frames: 1166802944. Throughput: 0: 48988.9. Samples: 1167966200. Policy #0 lag: (min: 0.0, avg: 37.1, max: 77.0) [2024-03-21 05:40:35,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 05:40:37,203][04017] Updated weights for policy 0, policy_version 35613 (0.0013) [2024-03-21 05:40:39,749][03995] Signal inference workers to stop experience collection... (23500 times) [2024-03-21 05:40:39,817][04017] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-03-21 05:40:39,819][03995] Signal inference workers to resume experience collection... (23500 times) [2024-03-21 05:40:39,856][04017] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-03-21 05:40:40,521][03784] Fps is (10 sec: 72089.2, 60 sec: 52428.8, 300 sec: 48318.9). Total num frames: 1167097856. Throughput: 0: 48575.6. Samples: 1168248000. Policy #0 lag: (min: 0.0, avg: 37.1, max: 77.0) [2024-03-21 05:40:40,522][03784] Avg episode reward: [(0, '0.923')] [2024-03-21 05:40:45,521][03784] Fps is (10 sec: 42598.7, 60 sec: 50790.4, 300 sec: 47652.5). Total num frames: 1167228928. Throughput: 0: 48975.5. Samples: 1168402800. 
Policy #0 lag: (min: 0.0, avg: 37.1, max: 77.0) [2024-03-21 05:40:45,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 05:40:50,521][03784] Fps is (10 sec: 13107.4, 60 sec: 45875.3, 300 sec: 47208.1). Total num frames: 1167228928. Throughput: 0: 49235.7. Samples: 1168714700. Policy #0 lag: (min: 0.0, avg: 37.1, max: 77.0) [2024-03-21 05:40:50,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 05:40:53,004][04017] Updated weights for policy 0, policy_version 35623 (0.0012) [2024-03-21 05:40:55,521][03784] Fps is (10 sec: 26214.0, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 1167491072. Throughput: 0: 49395.4. Samples: 1168993800. Policy #0 lag: (min: 0.0, avg: 33.9, max: 86.0) [2024-03-21 05:40:55,522][03784] Avg episode reward: [(0, '1.502')] [2024-03-21 05:40:56,732][04017] Updated weights for policy 0, policy_version 35633 (0.0022) [2024-03-21 05:41:00,521][03784] Fps is (10 sec: 62258.3, 60 sec: 47513.5, 300 sec: 47541.4). Total num frames: 1167851520. Throughput: 0: 48928.8. Samples: 1169122400. Policy #0 lag: (min: 0.0, avg: 33.9, max: 86.0) [2024-03-21 05:41:00,522][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 05:41:03,747][04017] Updated weights for policy 0, policy_version 35643 (0.0028) [2024-03-21 05:41:05,521][03784] Fps is (10 sec: 52429.7, 60 sec: 46421.5, 300 sec: 47541.4). Total num frames: 1168015360. Throughput: 0: 48406.7. Samples: 1169409900. Policy #0 lag: (min: 0.0, avg: 33.9, max: 86.0) [2024-03-21 05:41:05,522][03784] Avg episode reward: [(0, '0.468')] [2024-03-21 05:41:09,642][04017] Updated weights for policy 0, policy_version 35653 (0.0014) [2024-03-21 05:41:10,521][03784] Fps is (10 sec: 45875.7, 60 sec: 45875.2, 300 sec: 47319.2). Total num frames: 1168310272. Throughput: 0: 47440.0. Samples: 1169679000. 
Policy #0 lag: (min: 0.0, avg: 33.9, max: 86.0) [2024-03-21 05:41:10,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 05:41:15,521][03784] Fps is (10 sec: 45874.8, 60 sec: 44236.8, 300 sec: 47319.2). Total num frames: 1168474112. Throughput: 0: 47680.0. Samples: 1169845200. Policy #0 lag: (min: 0.0, avg: 33.9, max: 86.0) [2024-03-21 05:41:15,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 05:41:17,185][04017] Updated weights for policy 0, policy_version 35663 (0.0020) [2024-03-21 05:41:20,438][04017] Updated weights for policy 0, policy_version 35673 (0.0011) [2024-03-21 05:41:20,521][03784] Fps is (10 sec: 62259.6, 60 sec: 49152.0, 300 sec: 48096.8). Total num frames: 1168932864. Throughput: 0: 47566.8. Samples: 1170106700. Policy #0 lag: (min: 0.0, avg: 40.0, max: 117.0) [2024-03-21 05:41:20,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 05:41:24,090][03995] Signal inference workers to stop experience collection... (23550 times) [2024-03-21 05:41:24,090][03995] Signal inference workers to resume experience collection... (23550 times) [2024-03-21 05:41:24,174][04017] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-03-21 05:41:24,175][04017] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-03-21 05:41:24,462][04017] Updated weights for policy 0, policy_version 35683 (0.0011) [2024-03-21 05:41:25,521][03784] Fps is (10 sec: 85196.8, 60 sec: 51882.7, 300 sec: 48874.3). Total num frames: 1169326080. Throughput: 0: 47531.2. Samples: 1170386900. Policy #0 lag: (min: 0.0, avg: 40.0, max: 117.0) [2024-03-21 05:41:25,522][03784] Avg episode reward: [(0, '0.844')] [2024-03-21 05:41:30,521][03784] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 48318.9). Total num frames: 1169457152. Throughput: 0: 47568.8. Samples: 1170543400. 
Policy #0 lag: (min: 0.0, avg: 40.0, max: 117.0) [2024-03-21 05:41:30,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 05:41:34,929][04017] Updated weights for policy 0, policy_version 35693 (0.0011) [2024-03-21 05:41:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 1169653760. Throughput: 0: 47597.7. Samples: 1170856600. Policy #0 lag: (min: 0.0, avg: 40.0, max: 117.0) [2024-03-21 05:41:35,522][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 05:41:39,576][04017] Updated weights for policy 0, policy_version 35703 (0.0015) [2024-03-21 05:41:40,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 1169915904. Throughput: 0: 47431.2. Samples: 1171128200. Policy #0 lag: (min: 1.0, avg: 54.4, max: 106.0) [2024-03-21 05:41:40,522][03784] Avg episode reward: [(0, '0.919')] [2024-03-21 05:41:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48059.7, 300 sec: 47763.5). Total num frames: 1170112512. Throughput: 0: 47866.7. Samples: 1171276400. Policy #0 lag: (min: 1.0, avg: 54.4, max: 106.0) [2024-03-21 05:41:45,522][03784] Avg episode reward: [(0, '1.463')] [2024-03-21 05:41:47,564][04017] Updated weights for policy 0, policy_version 35713 (0.0015) [2024-03-21 05:41:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 52974.9, 300 sec: 48096.8). Total num frames: 1170407424. Throughput: 0: 47717.7. Samples: 1171557200. Policy #0 lag: (min: 1.0, avg: 54.4, max: 106.0) [2024-03-21 05:41:50,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 05:41:54,615][04017] Updated weights for policy 0, policy_version 35723 (0.0012) [2024-03-21 05:41:55,521][03784] Fps is (10 sec: 45875.3, 60 sec: 51336.6, 300 sec: 47874.6). Total num frames: 1170571264. Throughput: 0: 48357.7. Samples: 1171855100. 
Policy #0 lag: (min: 1.0, avg: 54.4, max: 106.0) [2024-03-21 05:41:55,522][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 05:42:00,521][03784] Fps is (10 sec: 32768.1, 60 sec: 48059.8, 300 sec: 47541.4). Total num frames: 1170735104. Throughput: 0: 47786.7. Samples: 1171995600. Policy #0 lag: (min: 1.0, avg: 54.4, max: 106.0) [2024-03-21 05:42:00,522][03784] Avg episode reward: [(0, '1.510')] [2024-03-21 05:42:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035728_1170735104.pth... [2024-03-21 05:42:00,647][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035384_1159462912.pth [2024-03-21 05:42:05,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 1170833408. Throughput: 0: 48168.8. Samples: 1172274300. Policy #0 lag: (min: 0.0, avg: 29.4, max: 66.0) [2024-03-21 05:42:05,522][03784] Avg episode reward: [(0, '1.565')] [2024-03-21 05:42:06,886][04017] Updated weights for policy 0, policy_version 35733 (0.0011) [2024-03-21 05:42:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45329.1, 300 sec: 46652.8). Total num frames: 1171030016. Throughput: 0: 48113.4. Samples: 1172552000. Policy #0 lag: (min: 0.0, avg: 29.4, max: 66.0) [2024-03-21 05:42:10,522][03784] Avg episode reward: [(0, '1.086')] [2024-03-21 05:42:12,367][04017] Updated weights for policy 0, policy_version 35743 (0.0015) [2024-03-21 05:42:15,521][03784] Fps is (10 sec: 49152.4, 60 sec: 47513.6, 300 sec: 47208.2). Total num frames: 1171324928. Throughput: 0: 47655.6. Samples: 1172687900. Policy #0 lag: (min: 0.0, avg: 29.4, max: 66.0) [2024-03-21 05:42:15,522][03784] Avg episode reward: [(0, '1.536')] [2024-03-21 05:42:18,110][04017] Updated weights for policy 0, policy_version 35753 (0.0012) [2024-03-21 05:42:20,521][03784] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 47097.1). Total num frames: 1171554304. Throughput: 0: 46740.0. Samples: 1172959900. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 66.0) [2024-03-21 05:42:20,522][03784] Avg episode reward: [(0, '1.761')] [2024-03-21 05:42:22,078][03995] Signal inference workers to stop experience collection... (23600 times) [2024-03-21 05:42:22,145][04017] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-03-21 05:42:22,305][03995] Signal inference workers to resume experience collection... (23600 times) [2024-03-21 05:42:22,305][04017] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-03-21 05:42:25,521][03784] Fps is (10 sec: 42598.0, 60 sec: 40413.8, 300 sec: 47097.1). Total num frames: 1171750912. Throughput: 0: 47106.6. Samples: 1173248000. Policy #0 lag: (min: 0.0, avg: 29.4, max: 66.0) [2024-03-21 05:42:25,522][03784] Avg episode reward: [(0, '1.435')] [2024-03-21 05:42:27,583][04017] Updated weights for policy 0, policy_version 35763 (0.0016) [2024-03-21 05:42:30,521][03784] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 47430.3). Total num frames: 1172078592. Throughput: 0: 46873.4. Samples: 1173385700. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 05:42:30,522][03784] Avg episode reward: [(0, '0.564')] [2024-03-21 05:42:32,476][04017] Updated weights for policy 0, policy_version 35773 (0.0014) [2024-03-21 05:42:35,521][03784] Fps is (10 sec: 75367.2, 60 sec: 47513.7, 300 sec: 47985.7). Total num frames: 1172504576. Throughput: 0: 45722.3. Samples: 1173614700. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 05:42:35,522][03784] Avg episode reward: [(0, '1.365')] [2024-03-21 05:42:35,600][04017] Updated weights for policy 0, policy_version 35783 (0.0017) [2024-03-21 05:42:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 44782.9, 300 sec: 47874.6). Total num frames: 1172602880. Throughput: 0: 45866.7. Samples: 1173919100. 
Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 05:42:40,530][03784] Avg episode reward: [(0, '1.365')] [2024-03-21 05:42:44,156][04017] Updated weights for policy 0, policy_version 35793 (0.0013) [2024-03-21 05:42:45,521][03784] Fps is (10 sec: 42597.5, 60 sec: 46967.4, 300 sec: 47874.6). Total num frames: 1172930560. Throughput: 0: 45466.5. Samples: 1174041600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 05:42:45,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 05:42:49,424][04017] Updated weights for policy 0, policy_version 35803 (0.0017) [2024-03-21 05:42:50,524][03784] Fps is (10 sec: 65517.4, 60 sec: 47511.3, 300 sec: 48096.3). Total num frames: 1173258240. Throughput: 0: 45481.6. Samples: 1174321100. Policy #0 lag: (min: 1.0, avg: 48.8, max: 109.0) [2024-03-21 05:42:50,525][03784] Avg episode reward: [(0, '1.160')] [2024-03-21 05:42:55,521][03784] Fps is (10 sec: 49152.7, 60 sec: 47513.6, 300 sec: 47541.4). Total num frames: 1173422080. Throughput: 0: 45724.4. Samples: 1174609600. Policy #0 lag: (min: 1.0, avg: 48.8, max: 109.0) [2024-03-21 05:42:55,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 05:42:57,370][04017] Updated weights for policy 0, policy_version 35813 (0.0010) [2024-03-21 05:43:00,521][03784] Fps is (10 sec: 39332.9, 60 sec: 48605.8, 300 sec: 47208.2). Total num frames: 1173651456. Throughput: 0: 46079.9. Samples: 1174761500. Policy #0 lag: (min: 1.0, avg: 48.8, max: 109.0) [2024-03-21 05:43:00,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 05:43:05,521][03784] Fps is (10 sec: 22937.4, 60 sec: 46967.4, 300 sec: 46763.8). Total num frames: 1173651456. Throughput: 0: 46984.3. Samples: 1175074200. 
Policy #0 lag: (min: 1.0, avg: 48.8, max: 109.0) [2024-03-21 05:43:05,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 05:43:08,925][04017] Updated weights for policy 0, policy_version 35823 (0.0014) [2024-03-21 05:43:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 1173946368. Throughput: 0: 46617.8. Samples: 1175345800. Policy #0 lag: (min: 1.0, avg: 48.8, max: 109.0) [2024-03-21 05:43:10,522][03784] Avg episode reward: [(0, '0.697')] [2024-03-21 05:43:14,590][03995] Signal inference workers to stop experience collection... (23650 times) [2024-03-21 05:43:14,592][04017] Updated weights for policy 0, policy_version 35833 (0.0010) [2024-03-21 05:43:14,599][03995] Signal inference workers to resume experience collection... (23650 times) [2024-03-21 05:43:14,640][04017] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-03-21 05:43:14,681][04017] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-03-21 05:43:15,521][03784] Fps is (10 sec: 55706.2, 60 sec: 48059.7, 300 sec: 47097.1). Total num frames: 1174208512. Throughput: 0: 46491.2. Samples: 1175477800. Policy #0 lag: (min: 0.0, avg: 24.4, max: 64.0) [2024-03-21 05:43:15,522][03784] Avg episode reward: [(0, '1.200')] [2024-03-21 05:43:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 47513.6, 300 sec: 46874.9). Total num frames: 1174405120. Throughput: 0: 48037.7. Samples: 1175776400. Policy #0 lag: (min: 0.0, avg: 24.4, max: 64.0) [2024-03-21 05:43:20,522][03784] Avg episode reward: [(0, '1.538')] [2024-03-21 05:43:22,210][04017] Updated weights for policy 0, policy_version 35843 (0.0015) [2024-03-21 05:43:25,521][03784] Fps is (10 sec: 55705.8, 60 sec: 50244.4, 300 sec: 47874.6). Total num frames: 1174765568. Throughput: 0: 47164.5. Samples: 1176041500. 
Policy #0 lag: (min: 0.0, avg: 24.4, max: 64.0) [2024-03-21 05:43:25,522][03784] Avg episode reward: [(0, '1.041')] [2024-03-21 05:43:26,339][04017] Updated weights for policy 0, policy_version 35853 (0.0014) [2024-03-21 05:43:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 48059.7, 300 sec: 47874.6). Total num frames: 1174962176. Throughput: 0: 47795.7. Samples: 1176192400. Policy #0 lag: (min: 0.0, avg: 24.4, max: 64.0) [2024-03-21 05:43:30,522][03784] Avg episode reward: [(0, '1.091')] [2024-03-21 05:43:33,857][04017] Updated weights for policy 0, policy_version 35863 (0.0012) [2024-03-21 05:43:35,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 47763.5). Total num frames: 1175191552. Throughput: 0: 47792.0. Samples: 1176471600. Policy #0 lag: (min: 0.0, avg: 24.4, max: 64.0) [2024-03-21 05:43:35,522][03784] Avg episode reward: [(0, '1.505')] [2024-03-21 05:43:40,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 1175420928. Throughput: 0: 47942.3. Samples: 1176767000. Policy #0 lag: (min: 0.0, avg: 54.9, max: 111.0) [2024-03-21 05:43:40,522][03784] Avg episode reward: [(0, '0.762')] [2024-03-21 05:43:41,152][04017] Updated weights for policy 0, policy_version 35873 (0.0010) [2024-03-21 05:43:45,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45875.3, 300 sec: 47652.4). Total num frames: 1175683072. Throughput: 0: 47282.3. Samples: 1176889200. Policy #0 lag: (min: 0.0, avg: 54.9, max: 111.0) [2024-03-21 05:43:45,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 05:43:49,256][04017] Updated weights for policy 0, policy_version 35883 (0.0010) [2024-03-21 05:43:50,521][03784] Fps is (10 sec: 45875.0, 60 sec: 43692.8, 300 sec: 47208.1). Total num frames: 1175879680. Throughput: 0: 46986.7. Samples: 1177188600. 
Policy #0 lag: (min: 0.0, avg: 54.9, max: 111.0) [2024-03-21 05:43:50,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 05:43:54,831][04017] Updated weights for policy 0, policy_version 35893 (0.0011) [2024-03-21 05:43:55,521][03784] Fps is (10 sec: 49151.6, 60 sec: 45875.2, 300 sec: 47097.1). Total num frames: 1176174592. Throughput: 0: 47644.4. Samples: 1177489800. Policy #0 lag: (min: 0.0, avg: 54.9, max: 111.0) [2024-03-21 05:43:55,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 05:43:59,886][04017] Updated weights for policy 0, policy_version 35903 (0.0015) [2024-03-21 05:43:59,924][03995] Signal inference workers to stop experience collection... (23700 times) [2024-03-21 05:43:59,991][04017] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-03-21 05:44:00,176][03995] Signal inference workers to resume experience collection... (23700 times) [2024-03-21 05:44:00,177][04017] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-03-21 05:44:00,521][03784] Fps is (10 sec: 65535.2, 60 sec: 48059.6, 300 sec: 47874.6). Total num frames: 1176535040. Throughput: 0: 47786.5. Samples: 1177628200. Policy #0 lag: (min: 3.0, avg: 60.5, max: 113.0) [2024-03-21 05:44:00,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 05:44:00,738][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035906_1176567808.pth... [2024-03-21 05:44:00,871][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035553_1165000704.pth [2024-03-21 05:44:04,469][04017] Updated weights for policy 0, policy_version 35913 (0.0018) [2024-03-21 05:44:05,521][03784] Fps is (10 sec: 65536.7, 60 sec: 52975.0, 300 sec: 47985.7). Total num frames: 1176829952. Throughput: 0: 47451.2. Samples: 1177911700. 
Policy #0 lag: (min: 3.0, avg: 60.5, max: 113.0) [2024-03-21 05:44:05,522][03784] Avg episode reward: [(0, '1.147')] [2024-03-21 05:44:10,521][03784] Fps is (10 sec: 49152.9, 60 sec: 51336.6, 300 sec: 47874.6). Total num frames: 1177026560. Throughput: 0: 47702.2. Samples: 1178188100. Policy #0 lag: (min: 3.0, avg: 60.5, max: 113.0) [2024-03-21 05:44:10,522][03784] Avg episode reward: [(0, '1.476')] [2024-03-21 05:44:15,406][04017] Updated weights for policy 0, policy_version 35923 (0.0012) [2024-03-21 05:44:15,521][03784] Fps is (10 sec: 29491.1, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 1177124864. Throughput: 0: 47882.2. Samples: 1178347100. Policy #0 lag: (min: 3.0, avg: 60.5, max: 113.0) [2024-03-21 05:44:15,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 05:44:20,521][03784] Fps is (10 sec: 29491.0, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 1177321472. Throughput: 0: 47684.4. Samples: 1178617400. Policy #0 lag: (min: 3.0, avg: 60.5, max: 113.0) [2024-03-21 05:44:20,522][03784] Avg episode reward: [(0, '0.484')] [2024-03-21 05:44:25,071][04017] Updated weights for policy 0, policy_version 35933 (0.0010) [2024-03-21 05:44:25,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 47208.2). Total num frames: 1177485312. Throughput: 0: 47362.3. Samples: 1178898300. Policy #0 lag: (min: 0.0, avg: 29.1, max: 72.0) [2024-03-21 05:44:25,522][03784] Avg episode reward: [(0, '0.691')] [2024-03-21 05:44:30,521][03784] Fps is (10 sec: 32768.3, 60 sec: 44783.0, 300 sec: 47097.1). Total num frames: 1177649152. Throughput: 0: 48437.8. Samples: 1179068900. Policy #0 lag: (min: 0.0, avg: 29.1, max: 72.0) [2024-03-21 05:44:30,522][03784] Avg episode reward: [(0, '1.489')] [2024-03-21 05:44:31,780][04017] Updated weights for policy 0, policy_version 35943 (0.0010) [2024-03-21 05:44:35,521][03784] Fps is (10 sec: 52428.6, 60 sec: 46967.5, 300 sec: 47652.5). Total num frames: 1178009600. Throughput: 0: 48433.4. Samples: 1179368100. 
Policy #0 lag: (min: 0.0, avg: 29.1, max: 72.0) [2024-03-21 05:44:35,522][03784] Avg episode reward: [(0, '0.711')] [2024-03-21 05:44:36,873][04017] Updated weights for policy 0, policy_version 35953 (0.0014) [2024-03-21 05:44:40,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48059.7, 300 sec: 47874.6). Total num frames: 1178304512. Throughput: 0: 47975.6. Samples: 1179648700. Policy #0 lag: (min: 0.0, avg: 29.1, max: 72.0) [2024-03-21 05:44:40,522][03784] Avg episode reward: [(0, '0.592')] [2024-03-21 05:44:41,901][04017] Updated weights for policy 0, policy_version 35963 (0.0010) [2024-03-21 05:44:45,521][03784] Fps is (10 sec: 68812.8, 60 sec: 50244.3, 300 sec: 48207.9). Total num frames: 1178697728. Throughput: 0: 47997.9. Samples: 1179788100. Policy #0 lag: (min: 0.0, avg: 29.1, max: 72.0) [2024-03-21 05:44:45,522][03784] Avg episode reward: [(0, '1.380')] [2024-03-21 05:44:45,863][04017] Updated weights for policy 0, policy_version 35973 (0.0015) [2024-03-21 05:44:50,521][03784] Fps is (10 sec: 65536.6, 60 sec: 51336.6, 300 sec: 48541.1). Total num frames: 1178959872. Throughput: 0: 48064.5. Samples: 1180074600. Policy #0 lag: (min: 0.0, avg: 38.2, max: 69.0) [2024-03-21 05:44:50,522][03784] Avg episode reward: [(0, '0.659')] [2024-03-21 05:44:54,797][04017] Updated weights for policy 0, policy_version 35983 (0.0012) [2024-03-21 05:44:55,521][03784] Fps is (10 sec: 42598.6, 60 sec: 49152.1, 300 sec: 47874.6). Total num frames: 1179123712. Throughput: 0: 48648.9. Samples: 1180377300. Policy #0 lag: (min: 0.0, avg: 38.2, max: 69.0) [2024-03-21 05:44:55,522][03784] Avg episode reward: [(0, '1.622')] [2024-03-21 05:44:57,769][03995] Signal inference workers to stop experience collection... (23750 times) [2024-03-21 05:44:57,770][03995] Signal inference workers to resume experience collection... 
(23750 times) [2024-03-21 05:44:57,822][04017] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-03-21 05:44:57,869][04017] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-03-21 05:45:00,521][03784] Fps is (10 sec: 42597.7, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 1179385856. Throughput: 0: 48499.9. Samples: 1180529600. Policy #0 lag: (min: 0.0, avg: 38.2, max: 69.0) [2024-03-21 05:45:00,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 05:45:00,777][04017] Updated weights for policy 0, policy_version 35993 (0.0015) [2024-03-21 05:45:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44783.0, 300 sec: 47319.2). Total num frames: 1179516928. Throughput: 0: 49086.8. Samples: 1180826300. Policy #0 lag: (min: 0.0, avg: 38.2, max: 69.0) [2024-03-21 05:45:05,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 05:45:07,714][04017] Updated weights for policy 0, policy_version 36003 (0.0014) [2024-03-21 05:45:10,521][03784] Fps is (10 sec: 49152.5, 60 sec: 47513.6, 300 sec: 47652.5). Total num frames: 1179877376. Throughput: 0: 48753.3. Samples: 1181092200. Policy #0 lag: (min: 0.0, avg: 38.2, max: 69.0) [2024-03-21 05:45:10,522][03784] Avg episode reward: [(0, '1.001')] [2024-03-21 05:45:15,521][03784] Fps is (10 sec: 45875.1, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1179975680. Throughput: 0: 48173.3. Samples: 1181236700. Policy #0 lag: (min: 0.0, avg: 46.7, max: 93.0) [2024-03-21 05:45:15,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 05:45:17,851][04017] Updated weights for policy 0, policy_version 36013 (0.0018) [2024-03-21 05:45:20,521][03784] Fps is (10 sec: 32768.1, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 1180205056. Throughput: 0: 47955.6. Samples: 1181526100. 
Policy #0 lag: (min: 0.0, avg: 46.7, max: 93.0) [2024-03-21 05:45:20,522][03784] Avg episode reward: [(0, '0.908')] [2024-03-21 05:45:23,358][04017] Updated weights for policy 0, policy_version 36023 (0.0013) [2024-03-21 05:45:25,521][03784] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 47874.6). Total num frames: 1180499968. Throughput: 0: 47744.5. Samples: 1181797200. Policy #0 lag: (min: 0.0, avg: 46.7, max: 93.0) [2024-03-21 05:45:25,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-21 05:45:29,004][04017] Updated weights for policy 0, policy_version 36033 (0.0013) [2024-03-21 05:45:30,521][03784] Fps is (10 sec: 58982.2, 60 sec: 52428.7, 300 sec: 47430.3). Total num frames: 1180794880. Throughput: 0: 47753.3. Samples: 1181937000. Policy #0 lag: (min: 0.0, avg: 46.7, max: 93.0) [2024-03-21 05:45:30,522][03784] Avg episode reward: [(0, '0.996')] [2024-03-21 05:45:35,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.6, 300 sec: 46652.8). Total num frames: 1180860416. Throughput: 0: 47822.2. Samples: 1182226600. Policy #0 lag: (min: 0.0, avg: 46.7, max: 93.0) [2024-03-21 05:45:35,522][03784] Avg episode reward: [(0, '1.177')] [2024-03-21 05:45:39,877][04017] Updated weights for policy 0, policy_version 36043 (0.0015) [2024-03-21 05:45:40,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46421.4, 300 sec: 46986.0). Total num frames: 1181089792. Throughput: 0: 47651.1. Samples: 1182521600. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 05:45:40,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 05:45:42,709][03995] Signal inference workers to stop experience collection... (23800 times) [2024-03-21 05:45:42,789][04017] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-03-21 05:45:42,979][03995] Signal inference workers to resume experience collection... 
(23800 times) [2024-03-21 05:45:42,979][04017] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-03-21 05:45:43,226][04017] Updated weights for policy 0, policy_version 36053 (0.0019) [2024-03-21 05:45:45,521][03784] Fps is (10 sec: 65536.3, 60 sec: 46967.5, 300 sec: 48430.0). Total num frames: 1181515776. Throughput: 0: 46753.5. Samples: 1182633500. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 05:45:45,522][03784] Avg episode reward: [(0, '0.817')] [2024-03-21 05:45:50,521][03784] Fps is (10 sec: 49152.1, 60 sec: 43690.6, 300 sec: 47763.5). Total num frames: 1181581312. Throughput: 0: 46948.9. Samples: 1182939000. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 05:45:50,522][03784] Avg episode reward: [(0, '1.459')] [2024-03-21 05:45:52,194][04017] Updated weights for policy 0, policy_version 36063 (0.0010) [2024-03-21 05:45:55,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45329.0, 300 sec: 47430.3). Total num frames: 1181843456. Throughput: 0: 47717.8. Samples: 1183239500. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 05:45:55,522][03784] Avg episode reward: [(0, '0.810')] [2024-03-21 05:45:58,725][04017] Updated weights for policy 0, policy_version 36073 (0.0023) [2024-03-21 05:46:00,521][03784] Fps is (10 sec: 52428.6, 60 sec: 45329.1, 300 sec: 47763.5). Total num frames: 1182105600. Throughput: 0: 47786.6. Samples: 1183387100. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 05:46:00,522][03784] Avg episode reward: [(0, '1.282')] [2024-03-21 05:46:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036075_1182105600.pth... [2024-03-21 05:46:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035728_1170735104.pth [2024-03-21 05:46:04,268][04017] Updated weights for policy 0, policy_version 36083 (0.0016) [2024-03-21 05:46:05,521][03784] Fps is (10 sec: 58982.6, 60 sec: 48605.9, 300 sec: 47874.6). 
Total num frames: 1182433280. Throughput: 0: 47691.1. Samples: 1183672200. Policy #0 lag: (min: 1.0, avg: 42.7, max: 84.0) [2024-03-21 05:46:05,522][03784] Avg episode reward: [(0, '1.380')] [2024-03-21 05:46:09,009][04017] Updated weights for policy 0, policy_version 36093 (0.0016) [2024-03-21 05:46:10,521][03784] Fps is (10 sec: 68812.2, 60 sec: 48605.8, 300 sec: 48541.1). Total num frames: 1182793728. Throughput: 0: 47333.2. Samples: 1183927200. Policy #0 lag: (min: 1.0, avg: 42.7, max: 84.0) [2024-03-21 05:46:10,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 05:46:12,594][04017] Updated weights for policy 0, policy_version 36103 (0.0016) [2024-03-21 05:46:15,521][03784] Fps is (10 sec: 65535.5, 60 sec: 51882.6, 300 sec: 47985.7). Total num frames: 1183088640. Throughput: 0: 47255.6. Samples: 1184063500. Policy #0 lag: (min: 1.0, avg: 42.7, max: 84.0) [2024-03-21 05:46:15,522][03784] Avg episode reward: [(0, '1.322')] [2024-03-21 05:46:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 50790.3, 300 sec: 47208.1). Total num frames: 1183252480. Throughput: 0: 47826.5. Samples: 1184378800. Policy #0 lag: (min: 1.0, avg: 42.7, max: 84.0) [2024-03-21 05:46:20,522][03784] Avg episode reward: [(0, '1.318')] [2024-03-21 05:46:22,164][04017] Updated weights for policy 0, policy_version 36113 (0.0010) [2024-03-21 05:46:25,521][03784] Fps is (10 sec: 29491.1, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 1183383552. Throughput: 0: 48126.6. Samples: 1184687300. Policy #0 lag: (min: 1.0, avg: 42.7, max: 84.0) [2024-03-21 05:46:25,522][03784] Avg episode reward: [(0, '1.498')] [2024-03-21 05:46:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 1183580160. Throughput: 0: 48815.3. Samples: 1184830200. 
Policy #0 lag: (min: 0.0, avg: 44.3, max: 96.0) [2024-03-21 05:46:30,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 05:46:34,285][04017] Updated weights for policy 0, policy_version 36123 (0.0013) [2024-03-21 05:46:35,521][03784] Fps is (10 sec: 29491.4, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 1183678464. Throughput: 0: 48377.8. Samples: 1185116000. Policy #0 lag: (min: 0.0, avg: 44.3, max: 96.0) [2024-03-21 05:46:35,522][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 05:46:40,521][03784] Fps is (10 sec: 29491.4, 60 sec: 46421.3, 300 sec: 46652.8). Total num frames: 1183875072. Throughput: 0: 47842.2. Samples: 1185392400. Policy #0 lag: (min: 0.0, avg: 44.3, max: 96.0) [2024-03-21 05:46:40,522][03784] Avg episode reward: [(0, '1.339')] [2024-03-21 05:46:43,518][04017] Updated weights for policy 0, policy_version 36133 (0.0011) [2024-03-21 05:46:43,519][03995] Signal inference workers to stop experience collection... (23850 times) [2024-03-21 05:46:43,520][03995] Signal inference workers to resume experience collection... (23850 times) [2024-03-21 05:46:43,606][04017] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-03-21 05:46:43,607][04017] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-03-21 05:46:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 46541.7). Total num frames: 1184137216. Throughput: 0: 47795.5. Samples: 1185537900. Policy #0 lag: (min: 0.0, avg: 44.3, max: 96.0) [2024-03-21 05:46:45,522][03784] Avg episode reward: [(0, '1.038')] [2024-03-21 05:46:47,892][04017] Updated weights for policy 0, policy_version 36143 (0.0021) [2024-03-21 05:46:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 1184399360. Throughput: 0: 47344.4. Samples: 1185802700. 
Policy #0 lag: (min: 0.0, avg: 44.3, max: 96.0) [2024-03-21 05:46:50,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 05:46:55,440][04017] Updated weights for policy 0, policy_version 36153 (0.0015) [2024-03-21 05:46:55,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46967.3, 300 sec: 47208.1). Total num frames: 1184661504. Throughput: 0: 47966.6. Samples: 1186085700. Policy #0 lag: (min: 0.0, avg: 30.6, max: 72.0) [2024-03-21 05:46:55,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 05:46:58,738][04017] Updated weights for policy 0, policy_version 36163 (0.0025) [2024-03-21 05:47:00,521][03784] Fps is (10 sec: 72089.7, 60 sec: 50244.3, 300 sec: 48430.0). Total num frames: 1185120256. Throughput: 0: 47804.5. Samples: 1186214700. Policy #0 lag: (min: 0.0, avg: 30.6, max: 72.0) [2024-03-21 05:47:00,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 05:47:04,554][04017] Updated weights for policy 0, policy_version 36173 (0.0023) [2024-03-21 05:47:05,521][03784] Fps is (10 sec: 68814.1, 60 sec: 48605.9, 300 sec: 48541.1). Total num frames: 1185349632. Throughput: 0: 46964.6. Samples: 1186492200. Policy #0 lag: (min: 0.0, avg: 30.6, max: 72.0) [2024-03-21 05:47:05,522][03784] Avg episode reward: [(0, '0.693')] [2024-03-21 05:47:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44783.0, 300 sec: 47985.7). Total num frames: 1185480704. Throughput: 0: 46588.9. Samples: 1186783800. Policy #0 lag: (min: 0.0, avg: 30.6, max: 72.0) [2024-03-21 05:47:10,522][03784] Avg episode reward: [(0, '0.802')] [2024-03-21 05:47:12,103][04017] Updated weights for policy 0, policy_version 36183 (0.0015) [2024-03-21 05:47:15,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 48318.9). Total num frames: 1185808384. Throughput: 0: 46540.1. Samples: 1186924500. 
Policy #0 lag: (min: 0.0, avg: 55.1, max: 119.0) [2024-03-21 05:47:15,522][03784] Avg episode reward: [(0, '1.050')] [2024-03-21 05:47:18,007][04017] Updated weights for policy 0, policy_version 36193 (0.0010) [2024-03-21 05:47:20,521][03784] Fps is (10 sec: 55706.0, 60 sec: 46421.4, 300 sec: 48430.0). Total num frames: 1186037760. Throughput: 0: 47044.5. Samples: 1187233000. Policy #0 lag: (min: 0.0, avg: 55.1, max: 119.0) [2024-03-21 05:47:20,522][03784] Avg episode reward: [(0, '1.050')] [2024-03-21 05:47:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46967.5, 300 sec: 47874.6). Total num frames: 1186201600. Throughput: 0: 47500.1. Samples: 1187529900. Policy #0 lag: (min: 0.0, avg: 55.1, max: 119.0) [2024-03-21 05:47:25,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 05:47:27,761][04017] Updated weights for policy 0, policy_version 36203 (0.0010) [2024-03-21 05:47:29,231][03995] Signal inference workers to stop experience collection... (23900 times) [2024-03-21 05:47:29,232][03995] Signal inference workers to resume experience collection... (23900 times) [2024-03-21 05:47:29,310][04017] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-03-21 05:47:29,311][04017] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-03-21 05:47:30,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48606.0, 300 sec: 47430.3). Total num frames: 1186496512. Throughput: 0: 47724.5. Samples: 1187685500. Policy #0 lag: (min: 0.0, avg: 55.1, max: 119.0) [2024-03-21 05:47:30,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 05:47:32,797][04017] Updated weights for policy 0, policy_version 36213 (0.0023) [2024-03-21 05:47:35,521][03784] Fps is (10 sec: 58982.7, 60 sec: 51882.8, 300 sec: 48096.8). Total num frames: 1186791424. Throughput: 0: 47940.1. Samples: 1187960000. 
Policy #0 lag: (min: 0.0, avg: 55.1, max: 119.0) [2024-03-21 05:47:35,521][03784] Avg episode reward: [(0, '0.806')] [2024-03-21 05:47:40,450][04017] Updated weights for policy 0, policy_version 36223 (0.0011) [2024-03-21 05:47:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 51336.6, 300 sec: 47541.4). Total num frames: 1186955264. Throughput: 0: 47826.8. Samples: 1188237900. Policy #0 lag: (min: 0.0, avg: 32.5, max: 86.0) [2024-03-21 05:47:40,522][03784] Avg episode reward: [(0, '1.350')] [2024-03-21 05:47:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 50244.3, 300 sec: 47097.5). Total num frames: 1187151872. Throughput: 0: 48400.1. Samples: 1188392700. Policy #0 lag: (min: 0.0, avg: 32.5, max: 86.0) [2024-03-21 05:47:45,522][03784] Avg episode reward: [(0, '1.350')] [2024-03-21 05:47:47,194][04017] Updated weights for policy 0, policy_version 36233 (0.0010) [2024-03-21 05:47:50,521][03784] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 47541.4). Total num frames: 1187446784. Throughput: 0: 48622.2. Samples: 1188680200. Policy #0 lag: (min: 0.0, avg: 32.5, max: 86.0) [2024-03-21 05:47:50,522][03784] Avg episode reward: [(0, '1.591')] [2024-03-21 05:47:54,785][04017] Updated weights for policy 0, policy_version 36243 (0.0012) [2024-03-21 05:47:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 49152.1, 300 sec: 47319.2). Total num frames: 1187610624. Throughput: 0: 48431.1. Samples: 1188963200. Policy #0 lag: (min: 0.0, avg: 32.5, max: 86.0) [2024-03-21 05:47:55,522][03784] Avg episode reward: [(0, '1.536')] [2024-03-21 05:48:00,370][04017] Updated weights for policy 0, policy_version 36253 (0.0011) [2024-03-21 05:48:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46967.5, 300 sec: 48430.0). Total num frames: 1187938304. Throughput: 0: 48235.5. Samples: 1189095100. 
Policy #0 lag: (min: 0.0, avg: 32.5, max: 86.0) [2024-03-21 05:48:00,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 05:48:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036253_1187938304.pth... [2024-03-21 05:48:00,677][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000035906_1176567808.pth [2024-03-21 05:48:05,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46421.3, 300 sec: 48096.8). Total num frames: 1188134912. Throughput: 0: 47586.6. Samples: 1189374400. Policy #0 lag: (min: 0.0, avg: 32.5, max: 86.0) [2024-03-21 05:48:05,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 05:48:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.2, 300 sec: 47541.4). Total num frames: 1188233216. Throughput: 0: 47015.5. Samples: 1189645600. Policy #0 lag: (min: 0.0, avg: 38.6, max: 107.0) [2024-03-21 05:48:10,522][03784] Avg episode reward: [(0, '1.080')] [2024-03-21 05:48:11,251][04017] Updated weights for policy 0, policy_version 36263 (0.0035) [2024-03-21 05:48:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 47763.5). Total num frames: 1188495360. Throughput: 0: 46448.8. Samples: 1189775700. Policy #0 lag: (min: 0.0, avg: 38.6, max: 107.0) [2024-03-21 05:48:15,522][03784] Avg episode reward: [(0, '0.662')] [2024-03-21 05:48:17,932][04017] Updated weights for policy 0, policy_version 36273 (0.0015) [2024-03-21 05:48:20,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45329.0, 300 sec: 47430.3). Total num frames: 1188757504. Throughput: 0: 46411.0. Samples: 1190048500. Policy #0 lag: (min: 0.0, avg: 38.6, max: 107.0) [2024-03-21 05:48:20,522][03784] Avg episode reward: [(0, '1.345')] [2024-03-21 05:48:24,846][04017] Updated weights for policy 0, policy_version 36283 (0.0014) [2024-03-21 05:48:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 47430.3). Total num frames: 1188954112. Throughput: 0: 47015.5. Samples: 1190353600. 
Policy #0 lag: (min: 0.0, avg: 38.6, max: 107.0) [2024-03-21 05:48:25,522][03784] Avg episode reward: [(0, '0.766')] [2024-03-21 05:48:27,492][03995] Signal inference workers to stop experience collection... (23950 times) [2024-03-21 05:48:27,558][04017] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-03-21 05:48:27,623][03995] Signal inference workers to resume experience collection... (23950 times) [2024-03-21 05:48:27,624][04017] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-03-21 05:48:30,360][04017] Updated weights for policy 0, policy_version 36293 (0.0013) [2024-03-21 05:48:30,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 47652.4). Total num frames: 1189249024. Throughput: 0: 46897.7. Samples: 1190503100. Policy #0 lag: (min: 0.0, avg: 38.6, max: 107.0) [2024-03-21 05:48:30,522][03784] Avg episode reward: [(0, '1.625')] [2024-03-21 05:48:35,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45329.0, 300 sec: 47763.5). Total num frames: 1189511168. Throughput: 0: 46633.3. Samples: 1190778700. Policy #0 lag: (min: 3.0, avg: 51.9, max: 119.0) [2024-03-21 05:48:35,522][03784] Avg episode reward: [(0, '1.161')] [2024-03-21 05:48:36,257][04017] Updated weights for policy 0, policy_version 36303 (0.0010) [2024-03-21 05:48:40,521][03784] Fps is (10 sec: 55706.3, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 1189806080. Throughput: 0: 46593.4. Samples: 1191059900. Policy #0 lag: (min: 3.0, avg: 51.9, max: 119.0) [2024-03-21 05:48:40,522][03784] Avg episode reward: [(0, '1.262')] [2024-03-21 05:48:41,409][04017] Updated weights for policy 0, policy_version 36313 (0.0011) [2024-03-21 05:48:45,521][03784] Fps is (10 sec: 58982.0, 60 sec: 49151.9, 300 sec: 48207.8). Total num frames: 1190100992. Throughput: 0: 46517.7. Samples: 1191188400. 
Policy #0 lag: (min: 3.0, avg: 51.9, max: 119.0) [2024-03-21 05:48:45,522][03784] Avg episode reward: [(0, '1.262')] [2024-03-21 05:48:49,065][04017] Updated weights for policy 0, policy_version 36323 (0.0014) [2024-03-21 05:48:50,521][03784] Fps is (10 sec: 52427.9, 60 sec: 48059.6, 300 sec: 47985.7). Total num frames: 1190330368. Throughput: 0: 46611.0. Samples: 1191471900. Policy #0 lag: (min: 3.0, avg: 51.9, max: 119.0) [2024-03-21 05:48:50,522][03784] Avg episode reward: [(0, '1.365')] [2024-03-21 05:48:55,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48059.8, 300 sec: 47319.2). Total num frames: 1190494208. Throughput: 0: 46235.6. Samples: 1191726200. Policy #0 lag: (min: 3.0, avg: 51.9, max: 119.0) [2024-03-21 05:48:55,522][03784] Avg episode reward: [(0, '1.450')] [2024-03-21 05:49:00,521][03784] Fps is (10 sec: 19661.1, 60 sec: 43144.6, 300 sec: 46430.6). Total num frames: 1190526976. Throughput: 0: 46884.5. Samples: 1191885500. Policy #0 lag: (min: 0.0, avg: 33.4, max: 69.0) [2024-03-21 05:49:00,522][03784] Avg episode reward: [(0, '1.418')] [2024-03-21 05:49:01,939][04017] Updated weights for policy 0, policy_version 36333 (0.0010) [2024-03-21 05:49:05,521][03784] Fps is (10 sec: 32767.8, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 1190821888. Throughput: 0: 47344.4. Samples: 1192179000. Policy #0 lag: (min: 0.0, avg: 33.4, max: 69.0) [2024-03-21 05:49:05,522][03784] Avg episode reward: [(0, '1.720')] [2024-03-21 05:49:06,177][04017] Updated weights for policy 0, policy_version 36343 (0.0013) [2024-03-21 05:49:10,521][03784] Fps is (10 sec: 58982.1, 60 sec: 48059.7, 300 sec: 47430.3). Total num frames: 1191116800. Throughput: 0: 47168.9. Samples: 1192476200. 
Policy #0 lag: (min: 0.0, avg: 33.4, max: 69.0) [2024-03-21 05:49:10,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 05:49:12,693][04017] Updated weights for policy 0, policy_version 36353 (0.0017) [2024-03-21 05:49:15,521][03784] Fps is (10 sec: 52429.5, 60 sec: 47513.7, 300 sec: 47541.4). Total num frames: 1191346176. Throughput: 0: 47169.0. Samples: 1192625700. Policy #0 lag: (min: 0.0, avg: 33.4, max: 69.0) [2024-03-21 05:49:15,522][03784] Avg episode reward: [(0, '1.511')] [2024-03-21 05:49:18,616][03995] Signal inference workers to stop experience collection... (24000 times) [2024-03-21 05:49:18,719][04017] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-03-21 05:49:18,807][03995] Signal inference workers to resume experience collection... (24000 times) [2024-03-21 05:49:18,808][04017] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-03-21 05:49:18,811][04017] Updated weights for policy 0, policy_version 36363 (0.0014) [2024-03-21 05:49:20,521][03784] Fps is (10 sec: 49152.2, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 1191608320. Throughput: 0: 47540.0. Samples: 1192918000. Policy #0 lag: (min: 0.0, avg: 58.0, max: 111.0) [2024-03-21 05:49:20,522][03784] Avg episode reward: [(0, '0.870')] [2024-03-21 05:49:23,644][04017] Updated weights for policy 0, policy_version 36373 (0.0014) [2024-03-21 05:49:25,521][03784] Fps is (10 sec: 62258.6, 60 sec: 50244.3, 300 sec: 48541.1). Total num frames: 1191968768. Throughput: 0: 47568.8. Samples: 1193200500. Policy #0 lag: (min: 0.0, avg: 58.0, max: 111.0) [2024-03-21 05:49:25,522][03784] Avg episode reward: [(0, '0.985')] [2024-03-21 05:49:30,521][03784] Fps is (10 sec: 52428.5, 60 sec: 48059.8, 300 sec: 47874.6). Total num frames: 1192132608. Throughput: 0: 47935.6. Samples: 1193345500. 
Policy #0 lag: (min: 0.0, avg: 58.0, max: 111.0) [2024-03-21 05:49:30,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-21 05:49:33,326][04017] Updated weights for policy 0, policy_version 36383 (0.0010) [2024-03-21 05:49:35,521][03784] Fps is (10 sec: 42598.1, 60 sec: 48059.6, 300 sec: 47763.5). Total num frames: 1192394752. Throughput: 0: 48060.0. Samples: 1193634600. Policy #0 lag: (min: 0.0, avg: 58.0, max: 111.0) [2024-03-21 05:49:35,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 05:49:38,273][04017] Updated weights for policy 0, policy_version 36393 (0.0011) [2024-03-21 05:49:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46421.2, 300 sec: 47097.0). Total num frames: 1192591360. Throughput: 0: 48697.7. Samples: 1193917600. Policy #0 lag: (min: 0.0, avg: 58.0, max: 111.0) [2024-03-21 05:49:40,531][03784] Avg episode reward: [(0, '1.340')] [2024-03-21 05:49:44,165][04017] Updated weights for policy 0, policy_version 36403 (0.0014) [2024-03-21 05:49:45,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48059.6, 300 sec: 47541.3). Total num frames: 1192984576. Throughput: 0: 48150.9. Samples: 1194052300. Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 05:49:45,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 05:49:49,124][04017] Updated weights for policy 0, policy_version 36413 (0.0016) [2024-03-21 05:49:50,521][03784] Fps is (10 sec: 65536.5, 60 sec: 48606.0, 300 sec: 47874.6). Total num frames: 1193246720. Throughput: 0: 47731.2. Samples: 1194326900. Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 05:49:50,522][03784] Avg episode reward: [(0, '1.202')] [2024-03-21 05:49:55,521][03784] Fps is (10 sec: 42598.8, 60 sec: 48605.8, 300 sec: 47541.4). Total num frames: 1193410560. Throughput: 0: 47631.0. Samples: 1194619600. 
Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 05:49:55,522][03784] Avg episode reward: [(0, '0.873')] [2024-03-21 05:49:57,923][04017] Updated weights for policy 0, policy_version 36423 (0.0014) [2024-03-21 05:50:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 51336.4, 300 sec: 47763.5). Total num frames: 1193607168. Throughput: 0: 47277.6. Samples: 1194753200. Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 05:50:00,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 05:50:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036426_1193607168.pth... [2024-03-21 05:50:00,661][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036075_1182105600.pth [2024-03-21 05:50:05,521][03784] Fps is (10 sec: 39322.0, 60 sec: 49698.2, 300 sec: 47208.1). Total num frames: 1193803776. Throughput: 0: 47360.0. Samples: 1195049200. Policy #0 lag: (min: 0.0, avg: 36.2, max: 79.0) [2024-03-21 05:50:05,522][03784] Avg episode reward: [(0, '1.214')] [2024-03-21 05:50:06,815][04017] Updated weights for policy 0, policy_version 36433 (0.0011) [2024-03-21 05:50:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 47513.5, 300 sec: 47430.3). Total num frames: 1193967616. Throughput: 0: 47128.8. Samples: 1195321300. Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 05:50:10,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 05:50:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46421.3, 300 sec: 47208.1). Total num frames: 1194131456. Throughput: 0: 47160.0. Samples: 1195467700. Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 05:50:15,522][03784] Avg episode reward: [(0, '0.427')] [2024-03-21 05:50:15,734][04017] Updated weights for policy 0, policy_version 36443 (0.0012) [2024-03-21 05:50:20,521][03784] Fps is (10 sec: 26214.6, 60 sec: 43690.6, 300 sec: 46541.7). Total num frames: 1194229760. Throughput: 0: 47269.0. Samples: 1195761700. 
Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 05:50:20,522][03784] Avg episode reward: [(0, '1.127')] [2024-03-21 05:50:23,217][03995] Signal inference workers to stop experience collection... (24050 times) [2024-03-21 05:50:23,218][03995] Signal inference workers to resume experience collection... (24050 times) [2024-03-21 05:50:23,274][04017] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-03-21 05:50:23,275][04017] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-03-21 05:50:24,995][04017] Updated weights for policy 0, policy_version 36453 (0.0013) [2024-03-21 05:50:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 46541.7). Total num frames: 1194524672. Throughput: 0: 47502.3. Samples: 1196055200. Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 05:50:25,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 05:50:28,970][04017] Updated weights for policy 0, policy_version 36463 (0.0015) [2024-03-21 05:50:30,521][03784] Fps is (10 sec: 62258.8, 60 sec: 45329.0, 300 sec: 47430.3). Total num frames: 1194852352. Throughput: 0: 47606.8. Samples: 1196194600. Policy #0 lag: (min: 1.0, avg: 30.7, max: 75.0) [2024-03-21 05:50:30,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 05:50:33,950][04017] Updated weights for policy 0, policy_version 36473 (0.0015) [2024-03-21 05:50:35,521][03784] Fps is (10 sec: 78643.4, 60 sec: 48606.0, 300 sec: 48207.8). Total num frames: 1195311104. Throughput: 0: 47673.4. Samples: 1196472200. Policy #0 lag: (min: 0.0, avg: 29.1, max: 63.0) [2024-03-21 05:50:35,522][03784] Avg episode reward: [(0, '1.577')] [2024-03-21 05:50:39,288][04017] Updated weights for policy 0, policy_version 36483 (0.0015) [2024-03-21 05:50:40,521][03784] Fps is (10 sec: 75366.7, 60 sec: 50244.3, 300 sec: 47763.5). Total num frames: 1195606016. Throughput: 0: 47431.2. Samples: 1196754000. 
Policy #0 lag: (min: 0.0, avg: 29.1, max: 63.0) [2024-03-21 05:50:40,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 05:50:43,010][04017] Updated weights for policy 0, policy_version 36493 (0.0023) [2024-03-21 05:50:45,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48606.0, 300 sec: 48541.1). Total num frames: 1195900928. Throughput: 0: 47602.2. Samples: 1196895300. Policy #0 lag: (min: 0.0, avg: 29.1, max: 63.0) [2024-03-21 05:50:45,522][03784] Avg episode reward: [(0, '0.962')] [2024-03-21 05:50:50,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 48096.8). Total num frames: 1196032000. Throughput: 0: 48053.3. Samples: 1197211600. Policy #0 lag: (min: 0.0, avg: 29.1, max: 63.0) [2024-03-21 05:50:50,522][03784] Avg episode reward: [(0, '0.972')] [2024-03-21 05:50:52,036][04017] Updated weights for policy 0, policy_version 36503 (0.0016) [2024-03-21 05:50:55,521][03784] Fps is (10 sec: 39321.7, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 1196294144. Throughput: 0: 48362.3. Samples: 1197497600. Policy #0 lag: (min: 0.0, avg: 43.6, max: 96.0) [2024-03-21 05:50:55,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 05:50:59,812][04017] Updated weights for policy 0, policy_version 36513 (0.0011) [2024-03-21 05:51:00,521][03784] Fps is (10 sec: 45874.9, 60 sec: 48059.7, 300 sec: 47652.4). Total num frames: 1196490752. Throughput: 0: 48397.7. Samples: 1197645600. Policy #0 lag: (min: 0.0, avg: 43.6, max: 96.0) [2024-03-21 05:51:00,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 05:51:03,660][03995] Signal inference workers to stop experience collection... (24100 times) [2024-03-21 05:51:03,661][03995] Signal inference workers to resume experience collection... 
(24100 times) [2024-03-21 05:51:03,704][04017] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-03-21 05:51:03,704][04017] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-03-21 05:51:05,521][03784] Fps is (10 sec: 45875.6, 60 sec: 49152.0, 300 sec: 47319.2). Total num frames: 1196752896. Throughput: 0: 47984.5. Samples: 1197921000. Policy #0 lag: (min: 0.0, avg: 43.6, max: 96.0) [2024-03-21 05:51:05,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 05:51:05,573][04017] Updated weights for policy 0, policy_version 36523 (0.0011) [2024-03-21 05:51:10,521][03784] Fps is (10 sec: 39322.1, 60 sec: 48606.0, 300 sec: 46763.8). Total num frames: 1196883968. Throughput: 0: 47833.3. Samples: 1198207700. Policy #0 lag: (min: 0.0, avg: 43.6, max: 96.0) [2024-03-21 05:51:10,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 05:51:15,521][03784] Fps is (10 sec: 26214.1, 60 sec: 48059.7, 300 sec: 46652.8). Total num frames: 1197015040. Throughput: 0: 48182.2. Samples: 1198362800. Policy #0 lag: (min: 0.0, avg: 43.6, max: 96.0) [2024-03-21 05:51:15,522][03784] Avg episode reward: [(0, '0.995')] [2024-03-21 05:51:16,386][04017] Updated weights for policy 0, policy_version 36533 (0.0015) [2024-03-21 05:51:19,735][04017] Updated weights for policy 0, policy_version 36543 (0.0016) [2024-03-21 05:51:20,521][03784] Fps is (10 sec: 62259.3, 60 sec: 54613.4, 300 sec: 47874.6). Total num frames: 1197506560. Throughput: 0: 47653.3. Samples: 1198616600. Policy #0 lag: (min: 5.0, avg: 32.9, max: 87.0) [2024-03-21 05:51:20,522][03784] Avg episode reward: [(0, '0.828')] [2024-03-21 05:51:25,521][03784] Fps is (10 sec: 58983.0, 60 sec: 51336.6, 300 sec: 47541.4). Total num frames: 1197604864. Throughput: 0: 47464.5. Samples: 1198889900. 
Policy #0 lag: (min: 5.0, avg: 32.9, max: 87.0) [2024-03-21 05:51:25,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 05:51:29,890][04017] Updated weights for policy 0, policy_version 36553 (0.0014) [2024-03-21 05:51:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 48606.0, 300 sec: 47763.5). Total num frames: 1197768704. Throughput: 0: 47462.3. Samples: 1199031100. Policy #0 lag: (min: 5.0, avg: 32.9, max: 87.0) [2024-03-21 05:51:30,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 05:51:35,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44236.8, 300 sec: 47763.5). Total num frames: 1197965312. Throughput: 0: 46900.0. Samples: 1199322100. Policy #0 lag: (min: 5.0, avg: 32.9, max: 87.0) [2024-03-21 05:51:35,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 05:51:40,222][04017] Updated weights for policy 0, policy_version 36563 (0.0015) [2024-03-21 05:51:40,521][03784] Fps is (10 sec: 36045.3, 60 sec: 42052.4, 300 sec: 47430.3). Total num frames: 1198129152. Throughput: 0: 46891.3. Samples: 1199607700. Policy #0 lag: (min: 5.0, avg: 32.9, max: 87.0) [2024-03-21 05:51:40,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 05:51:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 40413.9, 300 sec: 47208.1). Total num frames: 1198325760. Throughput: 0: 46733.4. Samples: 1199748600. Policy #0 lag: (min: 0.0, avg: 29.1, max: 83.0) [2024-03-21 05:51:45,522][03784] Avg episode reward: [(0, '1.456')] [2024-03-21 05:51:46,509][04017] Updated weights for policy 0, policy_version 36573 (0.0011) [2024-03-21 05:51:50,521][03784] Fps is (10 sec: 55704.7, 60 sec: 44236.8, 300 sec: 47541.4). Total num frames: 1198686208. Throughput: 0: 46835.5. Samples: 1200028600. 
Policy #0 lag: (min: 0.0, avg: 29.1, max: 83.0) [2024-03-21 05:51:50,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 05:51:52,514][04017] Updated weights for policy 0, policy_version 36583 (0.0015) [2024-03-21 05:51:55,521][03784] Fps is (10 sec: 65536.9, 60 sec: 44783.0, 300 sec: 46986.0). Total num frames: 1198981120. Throughput: 0: 46480.1. Samples: 1200299300. Policy #0 lag: (min: 0.0, avg: 29.1, max: 83.0) [2024-03-21 05:51:55,521][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 05:51:57,085][04017] Updated weights for policy 0, policy_version 36593 (0.0014) [2024-03-21 05:51:57,460][03995] Signal inference workers to stop experience collection... (24150 times) [2024-03-21 05:51:57,510][04017] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-03-21 05:51:57,683][03995] Signal inference workers to resume experience collection... (24150 times) [2024-03-21 05:51:57,684][04017] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-03-21 05:52:00,521][03784] Fps is (10 sec: 65535.9, 60 sec: 47513.7, 300 sec: 47430.3). Total num frames: 1199341568. Throughput: 0: 45982.3. Samples: 1200432000. Policy #0 lag: (min: 0.0, avg: 29.1, max: 83.0) [2024-03-21 05:52:00,522][03784] Avg episode reward: [(0, '0.954')] [2024-03-21 05:52:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036601_1199341568.pth... [2024-03-21 05:52:00,662][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036253_1187938304.pth [2024-03-21 05:52:02,057][04017] Updated weights for policy 0, policy_version 36603 (0.0016) [2024-03-21 05:52:05,521][03784] Fps is (10 sec: 65534.8, 60 sec: 48059.6, 300 sec: 47985.7). Total num frames: 1199636480. Throughput: 0: 46624.3. Samples: 1200714700. 
Policy #0 lag: (min: 0.0, avg: 29.1, max: 83.0) [2024-03-21 05:52:05,522][03784] Avg episode reward: [(0, '1.345')] [2024-03-21 05:52:06,461][04017] Updated weights for policy 0, policy_version 36613 (0.0017) [2024-03-21 05:52:10,521][03784] Fps is (10 sec: 58981.9, 60 sec: 50790.3, 300 sec: 47874.6). Total num frames: 1199931392. Throughput: 0: 46835.4. Samples: 1200997500. Policy #0 lag: (min: 2.0, avg: 45.0, max: 85.0) [2024-03-21 05:52:10,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 05:52:12,498][04017] Updated weights for policy 0, policy_version 36623 (0.0011) [2024-03-21 05:52:15,521][03784] Fps is (10 sec: 52429.3, 60 sec: 52428.9, 300 sec: 47874.6). Total num frames: 1200160768. Throughput: 0: 46757.8. Samples: 1201135200. Policy #0 lag: (min: 2.0, avg: 45.0, max: 85.0) [2024-03-21 05:52:15,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 05:52:20,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.1, 300 sec: 47652.4). Total num frames: 1200259072. Throughput: 0: 46855.5. Samples: 1201430600. Policy #0 lag: (min: 2.0, avg: 45.0, max: 85.0) [2024-03-21 05:52:20,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 05:52:25,521][03784] Fps is (10 sec: 16383.9, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 1200324608. Throughput: 0: 47477.6. Samples: 1201744200. Policy #0 lag: (min: 2.0, avg: 45.0, max: 85.0) [2024-03-21 05:52:25,522][03784] Avg episode reward: [(0, '1.526')] [2024-03-21 05:52:27,139][04017] Updated weights for policy 0, policy_version 36633 (0.0012) [2024-03-21 05:52:30,521][03784] Fps is (10 sec: 19660.9, 60 sec: 44782.9, 300 sec: 46319.5). Total num frames: 1200455680. Throughput: 0: 47757.8. Samples: 1201897700. 
Policy #0 lag: (min: 2.0, avg: 45.0, max: 85.0) [2024-03-21 05:52:30,522][03784] Avg episode reward: [(0, '1.526')] [2024-03-21 05:52:34,002][04017] Updated weights for policy 0, policy_version 36643 (0.0010) [2024-03-21 05:52:35,521][03784] Fps is (10 sec: 49152.1, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 1200816128. Throughput: 0: 47946.7. Samples: 1202186200. Policy #0 lag: (min: 0.0, avg: 25.9, max: 69.0) [2024-03-21 05:52:35,522][03784] Avg episode reward: [(0, '1.353')] [2024-03-21 05:52:39,608][04017] Updated weights for policy 0, policy_version 36653 (0.0017) [2024-03-21 05:52:40,521][03784] Fps is (10 sec: 62259.1, 60 sec: 49151.9, 300 sec: 47208.1). Total num frames: 1201078272. Throughput: 0: 47648.8. Samples: 1202443500. Policy #0 lag: (min: 0.0, avg: 25.9, max: 69.0) [2024-03-21 05:52:40,522][03784] Avg episode reward: [(0, '1.555')] [2024-03-21 05:52:45,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48605.9, 300 sec: 46763.8). Total num frames: 1201242112. Throughput: 0: 47822.3. Samples: 1202584000. Policy #0 lag: (min: 0.0, avg: 25.9, max: 69.0) [2024-03-21 05:52:45,522][03784] Avg episode reward: [(0, '1.067')] [2024-03-21 05:52:47,803][04017] Updated weights for policy 0, policy_version 36663 (0.0016) [2024-03-21 05:52:50,153][03995] Signal inference workers to stop experience collection... (24200 times) [2024-03-21 05:52:50,221][04017] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-03-21 05:52:50,405][03995] Signal inference workers to resume experience collection... (24200 times) [2024-03-21 05:52:50,405][04017] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-03-21 05:52:50,521][03784] Fps is (10 sec: 55705.9, 60 sec: 49152.0, 300 sec: 47541.4). Total num frames: 1201635328. Throughput: 0: 47184.6. Samples: 1202838000. 
Policy #0 lag: (min: 0.0, avg: 25.9, max: 69.0) [2024-03-21 05:52:50,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 05:52:50,983][04017] Updated weights for policy 0, policy_version 36673 (0.0024) [2024-03-21 05:52:55,521][03784] Fps is (10 sec: 62258.9, 60 sec: 48059.6, 300 sec: 47208.1). Total num frames: 1201864704. Throughput: 0: 47042.3. Samples: 1203114400. Policy #0 lag: (min: 0.0, avg: 36.1, max: 63.0) [2024-03-21 05:52:55,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 05:52:59,863][04017] Updated weights for policy 0, policy_version 36683 (0.0011) [2024-03-21 05:53:00,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 1202061312. Throughput: 0: 47488.9. Samples: 1203272200. Policy #0 lag: (min: 0.0, avg: 36.1, max: 63.0) [2024-03-21 05:53:00,522][03784] Avg episode reward: [(0, '1.125')] [2024-03-21 05:53:05,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43144.6, 300 sec: 47430.3). Total num frames: 1202225152. Throughput: 0: 47409.0. Samples: 1203564000. Policy #0 lag: (min: 0.0, avg: 36.1, max: 63.0) [2024-03-21 05:53:05,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 05:53:08,129][04017] Updated weights for policy 0, policy_version 36693 (0.0012) [2024-03-21 05:53:10,521][03784] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 47541.4). Total num frames: 1202520064. Throughput: 0: 46740.1. Samples: 1203847500. Policy #0 lag: (min: 0.0, avg: 36.1, max: 63.0) [2024-03-21 05:53:10,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 05:53:15,521][03784] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 46986.0). Total num frames: 1202618368. Throughput: 0: 46580.0. Samples: 1203993800. 
Policy #0 lag: (min: 0.0, avg: 36.1, max: 63.0) [2024-03-21 05:53:15,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 05:53:15,953][04017] Updated weights for policy 0, policy_version 36703 (0.0011) [2024-03-21 05:53:19,693][04017] Updated weights for policy 0, policy_version 36713 (0.0015) [2024-03-21 05:53:20,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.4, 300 sec: 47763.5). Total num frames: 1203044352. Throughput: 0: 46022.2. Samples: 1204257200. Policy #0 lag: (min: 2.0, avg: 43.6, max: 107.0) [2024-03-21 05:53:20,522][03784] Avg episode reward: [(0, '1.026')] [2024-03-21 05:53:24,156][04017] Updated weights for policy 0, policy_version 36723 (0.0013) [2024-03-21 05:53:25,521][03784] Fps is (10 sec: 78642.6, 60 sec: 51336.5, 300 sec: 47985.7). Total num frames: 1203404800. Throughput: 0: 46140.0. Samples: 1204519800. Policy #0 lag: (min: 2.0, avg: 43.6, max: 107.0) [2024-03-21 05:53:25,522][03784] Avg episode reward: [(0, '0.468')] [2024-03-21 05:53:30,521][03784] Fps is (10 sec: 55705.5, 60 sec: 52428.8, 300 sec: 47763.5). Total num frames: 1203601408. Throughput: 0: 46095.5. Samples: 1204658300. Policy #0 lag: (min: 2.0, avg: 43.6, max: 107.0) [2024-03-21 05:53:30,522][03784] Avg episode reward: [(0, '0.604')] [2024-03-21 05:53:30,983][04017] Updated weights for policy 0, policy_version 36733 (0.0018) [2024-03-21 05:53:35,521][03784] Fps is (10 sec: 39321.5, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 1203798016. Throughput: 0: 46986.5. Samples: 1204952400. Policy #0 lag: (min: 2.0, avg: 43.6, max: 107.0) [2024-03-21 05:53:35,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 05:53:39,343][04017] Updated weights for policy 0, policy_version 36743 (0.0010) [2024-03-21 05:53:40,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 1203994624. Throughput: 0: 47093.3. Samples: 1205233600. 
Policy #0 lag: (min: 2.0, avg: 43.6, max: 107.0) [2024-03-21 05:53:40,522][03784] Avg episode reward: [(0, '1.026')] [2024-03-21 05:53:43,198][03995] Signal inference workers to stop experience collection... (24250 times) [2024-03-21 05:53:43,296][04017] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-03-21 05:53:43,439][03995] Signal inference workers to resume experience collection... (24250 times) [2024-03-21 05:53:43,439][04017] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-03-21 05:53:45,521][03784] Fps is (10 sec: 36045.1, 60 sec: 48605.9, 300 sec: 46874.9). Total num frames: 1204158464. Throughput: 0: 47106.7. Samples: 1205392000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 81.0) [2024-03-21 05:53:45,522][03784] Avg episode reward: [(0, '1.615')] [2024-03-21 05:53:50,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44236.7, 300 sec: 46763.8). Total num frames: 1204289536. Throughput: 0: 47719.9. Samples: 1205711400. Policy #0 lag: (min: 0.0, avg: 40.6, max: 81.0) [2024-03-21 05:53:50,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 05:53:50,788][04017] Updated weights for policy 0, policy_version 36753 (0.0025) [2024-03-21 05:53:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45329.1, 300 sec: 47652.4). Total num frames: 1204584448. Throughput: 0: 47522.1. Samples: 1205986000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 81.0) [2024-03-21 05:53:55,522][03784] Avg episode reward: [(0, '1.392')] [2024-03-21 05:53:56,766][04017] Updated weights for policy 0, policy_version 36763 (0.0017) [2024-03-21 05:54:00,521][03784] Fps is (10 sec: 62260.1, 60 sec: 47513.6, 300 sec: 47763.5). Total num frames: 1204912128. Throughput: 0: 47137.8. Samples: 1206115000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 81.0) [2024-03-21 05:54:00,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 05:54:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036771_1204912128.pth... 
[2024-03-21 05:54:00,649][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036426_1193607168.pth [2024-03-21 05:54:05,521][03784] Fps is (10 sec: 32767.8, 60 sec: 44782.9, 300 sec: 46763.8). Total num frames: 1204912128. Throughput: 0: 47753.2. Samples: 1206406100. Policy #0 lag: (min: 0.0, avg: 40.6, max: 81.0) [2024-03-21 05:54:05,522][03784] Avg episode reward: [(0, '0.649')] [2024-03-21 05:54:06,093][04017] Updated weights for policy 0, policy_version 36773 (0.0016) [2024-03-21 05:54:10,521][03784] Fps is (10 sec: 36044.2, 60 sec: 45875.1, 300 sec: 47208.1). Total num frames: 1205272576. Throughput: 0: 47955.5. Samples: 1206677800. Policy #0 lag: (min: 3.0, avg: 47.4, max: 112.0) [2024-03-21 05:54:10,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 05:54:10,578][04017] Updated weights for policy 0, policy_version 36783 (0.0011) [2024-03-21 05:54:15,521][03784] Fps is (10 sec: 68813.4, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 1205600256. Throughput: 0: 47655.6. Samples: 1206802800. Policy #0 lag: (min: 3.0, avg: 47.4, max: 112.0) [2024-03-21 05:54:15,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 05:54:16,568][04017] Updated weights for policy 0, policy_version 36793 (0.0028) [2024-03-21 05:54:20,521][03784] Fps is (10 sec: 62259.6, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 1205895168. Throughput: 0: 48060.0. Samples: 1207115100. Policy #0 lag: (min: 3.0, avg: 47.4, max: 112.0) [2024-03-21 05:54:20,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 05:54:20,858][04017] Updated weights for policy 0, policy_version 36803 (0.0020) [2024-03-21 05:54:24,447][03995] Signal inference workers to stop experience collection... (24300 times) [2024-03-21 05:54:24,448][03995] Signal inference workers to resume experience collection... 
(24300 times) [2024-03-21 05:54:24,505][04017] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-03-21 05:54:24,505][04017] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-03-21 05:54:25,521][03784] Fps is (10 sec: 65535.6, 60 sec: 47513.6, 300 sec: 47874.6). Total num frames: 1206255616. Throughput: 0: 47699.9. Samples: 1207380100. Policy #0 lag: (min: 3.0, avg: 47.4, max: 112.0) [2024-03-21 05:54:25,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 05:54:26,759][04017] Updated weights for policy 0, policy_version 36813 (0.0011) [2024-03-21 05:54:30,521][03784] Fps is (10 sec: 49152.2, 60 sec: 46421.3, 300 sec: 47430.3). Total num frames: 1206386688. Throughput: 0: 47413.3. Samples: 1207525600. Policy #0 lag: (min: 3.0, avg: 47.4, max: 112.0) [2024-03-21 05:54:30,522][03784] Avg episode reward: [(0, '1.558')] [2024-03-21 05:54:35,103][04017] Updated weights for policy 0, policy_version 36823 (0.0015) [2024-03-21 05:54:35,521][03784] Fps is (10 sec: 39321.3, 60 sec: 47513.5, 300 sec: 47652.4). Total num frames: 1206648832. Throughput: 0: 46151.0. Samples: 1207788200. Policy #0 lag: (min: 1.0, avg: 41.7, max: 77.0) [2024-03-21 05:54:35,522][03784] Avg episode reward: [(0, '0.610')] [2024-03-21 05:54:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 47208.2). Total num frames: 1206910976. Throughput: 0: 46122.2. Samples: 1208061500. Policy #0 lag: (min: 1.0, avg: 41.7, max: 77.0) [2024-03-21 05:54:40,522][03784] Avg episode reward: [(0, '0.610')] [2024-03-21 05:54:42,468][04017] Updated weights for policy 0, policy_version 36833 (0.0030) [2024-03-21 05:54:45,521][03784] Fps is (10 sec: 36045.4, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 1207009280. Throughput: 0: 46184.4. Samples: 1208193300. 
Policy #0 lag: (min: 1.0, avg: 41.7, max: 77.0) [2024-03-21 05:54:45,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 05:54:50,521][03784] Fps is (10 sec: 26214.5, 60 sec: 48059.8, 300 sec: 46652.8). Total num frames: 1207173120. Throughput: 0: 45891.2. Samples: 1208471200. Policy #0 lag: (min: 1.0, avg: 41.7, max: 77.0) [2024-03-21 05:54:50,522][03784] Avg episode reward: [(0, '0.789')] [2024-03-21 05:54:52,654][04017] Updated weights for policy 0, policy_version 36843 (0.0022) [2024-03-21 05:54:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46421.4, 300 sec: 46652.8). Total num frames: 1207369728. Throughput: 0: 46157.9. Samples: 1208754900. Policy #0 lag: (min: 0.0, avg: 26.8, max: 63.0) [2024-03-21 05:54:55,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 05:55:00,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42052.2, 300 sec: 46208.4). Total num frames: 1207435264. Throughput: 0: 46986.7. Samples: 1208917200. Policy #0 lag: (min: 0.0, avg: 26.8, max: 63.0) [2024-03-21 05:55:00,522][03784] Avg episode reward: [(0, '0.937')] [2024-03-21 05:55:04,427][04017] Updated weights for policy 0, policy_version 36853 (0.0010) [2024-03-21 05:55:05,521][03784] Fps is (10 sec: 32767.6, 60 sec: 46421.3, 300 sec: 46541.7). Total num frames: 1207697408. Throughput: 0: 46659.9. Samples: 1209214800. Policy #0 lag: (min: 0.0, avg: 26.8, max: 63.0) [2024-03-21 05:55:05,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 05:55:09,728][04017] Updated weights for policy 0, policy_version 36863 (0.0015) [2024-03-21 05:55:10,521][03784] Fps is (10 sec: 55704.9, 60 sec: 45329.0, 300 sec: 46986.0). Total num frames: 1207992320. Throughput: 0: 46811.0. Samples: 1209486600. 
Policy #0 lag: (min: 0.0, avg: 26.8, max: 63.0) [2024-03-21 05:55:10,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 05:55:13,051][04017] Updated weights for policy 0, policy_version 36873 (0.0016) [2024-03-21 05:55:15,521][03784] Fps is (10 sec: 65536.0, 60 sec: 45875.1, 300 sec: 47874.6). Total num frames: 1208352768. Throughput: 0: 46622.1. Samples: 1209623600. Policy #0 lag: (min: 0.0, avg: 26.8, max: 63.0) [2024-03-21 05:55:15,522][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 05:55:19,504][04017] Updated weights for policy 0, policy_version 36883 (0.0016) [2024-03-21 05:55:20,521][03784] Fps is (10 sec: 65536.2, 60 sec: 45875.1, 300 sec: 47874.6). Total num frames: 1208647680. Throughput: 0: 47384.5. Samples: 1209920500. Policy #0 lag: (min: 0.0, avg: 40.2, max: 122.0) [2024-03-21 05:55:20,522][03784] Avg episode reward: [(0, '1.574')] [2024-03-21 05:55:22,122][03995] Signal inference workers to stop experience collection... (24350 times) [2024-03-21 05:55:22,127][03995] Signal inference workers to resume experience collection... (24350 times) [2024-03-21 05:55:22,187][04017] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-03-21 05:55:22,187][04017] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-03-21 05:55:23,862][04017] Updated weights for policy 0, policy_version 36893 (0.0012) [2024-03-21 05:55:25,521][03784] Fps is (10 sec: 65536.9, 60 sec: 45875.3, 300 sec: 47985.7). Total num frames: 1209008128. Throughput: 0: 46931.2. Samples: 1210173400. Policy #0 lag: (min: 0.0, avg: 40.2, max: 122.0) [2024-03-21 05:55:25,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 05:55:29,106][04017] Updated weights for policy 0, policy_version 36903 (0.0019) [2024-03-21 05:55:30,521][03784] Fps is (10 sec: 58982.8, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 1209237504. Throughput: 0: 46922.2. Samples: 1210304800. 
Policy #0 lag: (min: 0.0, avg: 40.2, max: 122.0) [2024-03-21 05:55:30,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 05:55:35,521][03784] Fps is (10 sec: 52428.1, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 1209532416. Throughput: 0: 47191.0. Samples: 1210594800. Policy #0 lag: (min: 0.0, avg: 40.2, max: 122.0) [2024-03-21 05:55:35,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 05:55:35,958][04017] Updated weights for policy 0, policy_version 36913 (0.0016) [2024-03-21 05:55:40,521][03784] Fps is (10 sec: 49151.5, 60 sec: 46967.4, 300 sec: 46874.9). Total num frames: 1209729024. Throughput: 0: 47479.8. Samples: 1210891500. Policy #0 lag: (min: 0.0, avg: 40.2, max: 122.0) [2024-03-21 05:55:40,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 05:55:43,336][04017] Updated weights for policy 0, policy_version 36923 (0.0012) [2024-03-21 05:55:45,521][03784] Fps is (10 sec: 42598.9, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 1209958400. Throughput: 0: 46846.7. Samples: 1211025300. Policy #0 lag: (min: 0.0, avg: 51.7, max: 94.0) [2024-03-21 05:55:45,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 05:55:50,521][03784] Fps is (10 sec: 26214.7, 60 sec: 46967.4, 300 sec: 46430.6). Total num frames: 1209991168. Throughput: 0: 46231.2. Samples: 1211295200. Policy #0 lag: (min: 0.0, avg: 51.7, max: 94.0) [2024-03-21 05:55:50,522][03784] Avg episode reward: [(0, '0.980')] [2024-03-21 05:55:55,155][04017] Updated weights for policy 0, policy_version 36933 (0.0021) [2024-03-21 05:55:55,521][03784] Fps is (10 sec: 29491.3, 60 sec: 48059.7, 300 sec: 46652.8). Total num frames: 1210253312. Throughput: 0: 46449.1. Samples: 1211576800. Policy #0 lag: (min: 0.0, avg: 51.7, max: 94.0) [2024-03-21 05:55:55,522][03784] Avg episode reward: [(0, '1.570')] [2024-03-21 05:56:00,521][03784] Fps is (10 sec: 45874.7, 60 sec: 50244.2, 300 sec: 46430.6). Total num frames: 1210449920. Throughput: 0: 46220.0. Samples: 1211703500. 
Policy #0 lag: (min: 0.0, avg: 51.7, max: 94.0) [2024-03-21 05:56:00,531][03784] Avg episode reward: [(0, '0.675')] [2024-03-21 05:56:00,543][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036940_1210449920.pth... [2024-03-21 05:56:00,727][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036601_1199341568.pth [2024-03-21 05:56:03,457][04017] Updated weights for policy 0, policy_version 36943 (0.0017) [2024-03-21 05:56:05,521][03784] Fps is (10 sec: 39321.7, 60 sec: 49152.1, 300 sec: 46652.8). Total num frames: 1210646528. Throughput: 0: 45304.6. Samples: 1211959200. Policy #0 lag: (min: 0.0, avg: 51.7, max: 94.0) [2024-03-21 05:56:05,522][03784] Avg episode reward: [(0, '1.426')] [2024-03-21 05:56:10,521][03784] Fps is (10 sec: 32768.3, 60 sec: 46421.4, 300 sec: 46652.7). Total num frames: 1210777600. Throughput: 0: 45819.9. Samples: 1212235300. Policy #0 lag: (min: 0.0, avg: 31.3, max: 84.0) [2024-03-21 05:56:10,522][03784] Avg episode reward: [(0, '0.856')] [2024-03-21 05:56:12,749][04017] Updated weights for policy 0, policy_version 36953 (0.0015) [2024-03-21 05:56:15,521][03784] Fps is (10 sec: 39321.0, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1211039744. Throughput: 0: 46079.9. Samples: 1212378400. Policy #0 lag: (min: 0.0, avg: 31.3, max: 84.0) [2024-03-21 05:56:15,522][03784] Avg episode reward: [(0, '0.822')] [2024-03-21 05:56:17,348][04017] Updated weights for policy 0, policy_version 36963 (0.0011) [2024-03-21 05:56:20,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 1211367424. Throughput: 0: 45686.6. Samples: 1212650700. Policy #0 lag: (min: 0.0, avg: 31.3, max: 84.0) [2024-03-21 05:56:20,522][03784] Avg episode reward: [(0, '0.822')] [2024-03-21 05:56:23,223][03995] Signal inference workers to stop experience collection... (24400 times) [2024-03-21 05:56:23,224][03995] Signal inference workers to resume experience collection... 
(24400 times) [2024-03-21 05:56:23,300][04017] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-03-21 05:56:23,300][04017] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-03-21 05:56:23,587][04017] Updated weights for policy 0, policy_version 36973 (0.0020) [2024-03-21 05:56:25,521][03784] Fps is (10 sec: 58982.7, 60 sec: 43690.6, 300 sec: 46986.0). Total num frames: 1211629568. Throughput: 0: 45437.9. Samples: 1212936200. Policy #0 lag: (min: 0.0, avg: 31.3, max: 84.0) [2024-03-21 05:56:25,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 05:56:30,521][03784] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 46986.0). Total num frames: 1211826176. Throughput: 0: 45491.1. Samples: 1213072400. Policy #0 lag: (min: 0.0, avg: 31.3, max: 84.0) [2024-03-21 05:56:30,522][03784] Avg episode reward: [(0, '1.236')] [2024-03-21 05:56:35,238][04017] Updated weights for policy 0, policy_version 36983 (0.0016) [2024-03-21 05:56:35,521][03784] Fps is (10 sec: 26214.0, 60 sec: 39321.5, 300 sec: 46652.7). Total num frames: 1211891712. Throughput: 0: 45799.8. Samples: 1213356200. Policy #0 lag: (min: 0.0, avg: 35.2, max: 70.0) [2024-03-21 05:56:35,523][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 05:56:39,243][04017] Updated weights for policy 0, policy_version 36993 (0.0015) [2024-03-21 05:56:40,521][03784] Fps is (10 sec: 49151.3, 60 sec: 43144.5, 300 sec: 47430.3). Total num frames: 1212317696. Throughput: 0: 45386.4. Samples: 1213619200. Policy #0 lag: (min: 0.0, avg: 35.2, max: 70.0) [2024-03-21 05:56:40,523][03784] Avg episode reward: [(0, '0.813')] [2024-03-21 05:56:43,852][04017] Updated weights for policy 0, policy_version 37003 (0.0014) [2024-03-21 05:56:45,521][03784] Fps is (10 sec: 75368.0, 60 sec: 44782.9, 300 sec: 47319.2). Total num frames: 1212645376. Throughput: 0: 45657.9. Samples: 1213758100. 
Policy #0 lag: (min: 0.0, avg: 35.2, max: 70.0) [2024-03-21 05:56:45,522][03784] Avg episode reward: [(0, '1.611')] [2024-03-21 05:56:47,758][04017] Updated weights for policy 0, policy_version 37013 (0.0023) [2024-03-21 05:56:50,521][03784] Fps is (10 sec: 68814.4, 60 sec: 50244.3, 300 sec: 47541.4). Total num frames: 1213005824. Throughput: 0: 45426.7. Samples: 1214003400. Policy #0 lag: (min: 0.0, avg: 35.2, max: 70.0) [2024-03-21 05:56:50,522][03784] Avg episode reward: [(0, '1.191')] [2024-03-21 05:56:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 46652.7). Total num frames: 1213104128. Throughput: 0: 46022.3. Samples: 1214306300. Policy #0 lag: (min: 0.0, avg: 35.2, max: 70.0) [2024-03-21 05:56:55,522][03784] Avg episode reward: [(0, '1.191')] [2024-03-21 05:56:57,316][04017] Updated weights for policy 0, policy_version 37023 (0.0024) [2024-03-21 05:57:00,521][03784] Fps is (10 sec: 36044.2, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 1213366272. Throughput: 0: 45980.0. Samples: 1214447500. Policy #0 lag: (min: 5.0, avg: 36.4, max: 77.0) [2024-03-21 05:57:00,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 05:57:05,458][04017] Updated weights for policy 0, policy_version 37033 (0.0010) [2024-03-21 05:57:05,521][03784] Fps is (10 sec: 39320.6, 60 sec: 47513.4, 300 sec: 45986.3). Total num frames: 1213497344. Throughput: 0: 46311.0. Samples: 1214734700. Policy #0 lag: (min: 5.0, avg: 36.4, max: 77.0) [2024-03-21 05:57:05,523][03784] Avg episode reward: [(0, '0.583')] [2024-03-21 05:57:10,521][03784] Fps is (10 sec: 36045.3, 60 sec: 49152.1, 300 sec: 45986.3). Total num frames: 1213726720. Throughput: 0: 46300.1. Samples: 1215019700. 
Policy #0 lag: (min: 5.0, avg: 36.4, max: 77.0) [2024-03-21 05:57:10,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 05:57:13,001][04017] Updated weights for policy 0, policy_version 37043 (0.0011) [2024-03-21 05:57:13,260][03995] Signal inference workers to stop experience collection... (24450 times) [2024-03-21 05:57:13,329][04017] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-03-21 05:57:13,338][03995] Signal inference workers to resume experience collection... (24450 times) [2024-03-21 05:57:13,381][04017] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-03-21 05:57:15,521][03784] Fps is (10 sec: 42599.3, 60 sec: 48059.8, 300 sec: 46319.5). Total num frames: 1213923328. Throughput: 0: 46062.2. Samples: 1215145200. Policy #0 lag: (min: 5.0, avg: 36.4, max: 77.0) [2024-03-21 05:57:15,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 05:57:20,521][03784] Fps is (10 sec: 32767.8, 60 sec: 44783.0, 300 sec: 46541.7). Total num frames: 1214054400. Throughput: 0: 45662.4. Samples: 1215411000. Policy #0 lag: (min: 5.0, avg: 36.4, max: 77.0) [2024-03-21 05:57:20,522][03784] Avg episode reward: [(0, '0.998')] [2024-03-21 05:57:23,862][04017] Updated weights for policy 0, policy_version 37053 (0.0017) [2024-03-21 05:57:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 1214283776. Throughput: 0: 45257.9. Samples: 1215655800. Policy #0 lag: (min: 0.0, avg: 23.5, max: 67.0) [2024-03-21 05:57:25,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 05:57:28,862][04017] Updated weights for policy 0, policy_version 37063 (0.0012) [2024-03-21 05:57:30,521][03784] Fps is (10 sec: 55705.7, 60 sec: 46421.4, 300 sec: 46763.8). Total num frames: 1214611456. Throughput: 0: 45495.6. Samples: 1215805400. 
Policy #0 lag: (min: 0.0, avg: 23.5, max: 67.0) [2024-03-21 05:57:30,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 05:57:33,730][04017] Updated weights for policy 0, policy_version 37073 (0.0011) [2024-03-21 05:57:35,521][03784] Fps is (10 sec: 62259.6, 60 sec: 50244.5, 300 sec: 46874.9). Total num frames: 1214906368. Throughput: 0: 46284.4. Samples: 1216086200. Policy #0 lag: (min: 0.0, avg: 23.5, max: 67.0) [2024-03-21 05:57:35,522][03784] Avg episode reward: [(0, '1.147')] [2024-03-21 05:57:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44237.0, 300 sec: 46541.7). Total num frames: 1214971904. Throughput: 0: 46157.8. Samples: 1216383400. Policy #0 lag: (min: 0.0, avg: 23.5, max: 67.0) [2024-03-21 05:57:40,522][03784] Avg episode reward: [(0, '1.147')] [2024-03-21 05:57:45,521][03784] Fps is (10 sec: 19660.8, 60 sec: 40960.0, 300 sec: 45653.0). Total num frames: 1215102976. Throughput: 0: 46429.0. Samples: 1216536800. Policy #0 lag: (min: 0.0, avg: 23.5, max: 67.0) [2024-03-21 05:57:45,522][03784] Avg episode reward: [(0, '0.985')] [2024-03-21 05:57:45,906][04017] Updated weights for policy 0, policy_version 37083 (0.0010) [2024-03-21 05:57:50,422][04017] Updated weights for policy 0, policy_version 37093 (0.0011) [2024-03-21 05:57:50,521][03784] Fps is (10 sec: 49151.0, 60 sec: 40959.9, 300 sec: 46097.3). Total num frames: 1215463424. Throughput: 0: 46182.3. Samples: 1216812900. Policy #0 lag: (min: 3.0, avg: 39.4, max: 115.0) [2024-03-21 05:57:50,523][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 05:57:54,967][04017] Updated weights for policy 0, policy_version 37103 (0.0013) [2024-03-21 05:57:55,521][03784] Fps is (10 sec: 72089.6, 60 sec: 45329.1, 300 sec: 46652.7). Total num frames: 1215823872. Throughput: 0: 45211.1. Samples: 1217054200. 
Policy #0 lag: (min: 3.0, avg: 39.4, max: 115.0) [2024-03-21 05:57:55,522][03784] Avg episode reward: [(0, '1.572')] [2024-03-21 05:57:56,637][03995] Signal inference workers to stop experience collection... (24500 times) [2024-03-21 05:57:56,709][03995] Signal inference workers to resume experience collection... (24500 times) [2024-03-21 05:57:56,715][04017] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-03-21 05:57:56,766][04017] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-03-21 05:57:59,792][04017] Updated weights for policy 0, policy_version 37113 (0.0014) [2024-03-21 05:58:00,521][03784] Fps is (10 sec: 72088.8, 60 sec: 46967.3, 300 sec: 47319.2). Total num frames: 1216184320. Throughput: 0: 45362.0. Samples: 1217186500. Policy #0 lag: (min: 3.0, avg: 39.4, max: 115.0) [2024-03-21 05:58:00,523][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 05:58:00,807][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037116_1216217088.pth... [2024-03-21 05:58:00,919][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036771_1204912128.pth [2024-03-21 05:58:05,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48059.9, 300 sec: 46986.0). Total num frames: 1216380928. Throughput: 0: 45320.0. Samples: 1217450400. Policy #0 lag: (min: 3.0, avg: 39.4, max: 115.0) [2024-03-21 05:58:05,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 05:58:06,601][04017] Updated weights for policy 0, policy_version 37123 (0.0015) [2024-03-21 05:58:10,521][03784] Fps is (10 sec: 42599.0, 60 sec: 48059.6, 300 sec: 47430.3). Total num frames: 1216610304. Throughput: 0: 46026.6. Samples: 1217727000. Policy #0 lag: (min: 3.0, avg: 39.4, max: 115.0) [2024-03-21 05:58:10,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 05:58:15,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 1216708608. Throughput: 0: 46111.1. 
Samples: 1217880400. Policy #0 lag: (min: 0.0, avg: 31.1, max: 85.0) [2024-03-21 05:58:15,522][03784] Avg episode reward: [(0, '0.582')] [2024-03-21 05:58:15,918][04017] Updated weights for policy 0, policy_version 37133 (0.0015) [2024-03-21 05:58:20,521][03784] Fps is (10 sec: 26214.6, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1216872448. Throughput: 0: 46808.8. Samples: 1218192600. Policy #0 lag: (min: 0.0, avg: 31.1, max: 85.0) [2024-03-21 05:58:20,523][03784] Avg episode reward: [(0, '0.582')] [2024-03-21 05:58:24,085][04017] Updated weights for policy 0, policy_version 37143 (0.0027) [2024-03-21 05:58:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48059.7, 300 sec: 45986.3). Total num frames: 1217167360. Throughput: 0: 46279.9. Samples: 1218466000. Policy #0 lag: (min: 0.0, avg: 31.1, max: 85.0) [2024-03-21 05:58:25,522][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 05:58:30,521][03784] Fps is (10 sec: 45874.0, 60 sec: 45328.8, 300 sec: 45875.2). Total num frames: 1217331200. Throughput: 0: 46128.5. Samples: 1218612600. Policy #0 lag: (min: 0.0, avg: 31.1, max: 85.0) [2024-03-21 05:58:30,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 05:58:33,359][04017] Updated weights for policy 0, policy_version 37153 (0.0020) [2024-03-21 05:58:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 1217527808. Throughput: 0: 46469.0. Samples: 1218904000. Policy #0 lag: (min: 0.0, avg: 31.1, max: 85.0) [2024-03-21 05:58:35,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 05:58:37,755][04017] Updated weights for policy 0, policy_version 37163 (0.0019) [2024-03-21 05:58:40,521][03784] Fps is (10 sec: 52430.2, 60 sec: 48059.7, 300 sec: 46430.6). Total num frames: 1217855488. Throughput: 0: 46711.0. Samples: 1219156200. 
Policy #0 lag: (min: 4.0, avg: 37.7, max: 83.0) [2024-03-21 05:58:40,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 05:58:45,521][03784] Fps is (10 sec: 52428.9, 60 sec: 49151.9, 300 sec: 46652.8). Total num frames: 1218052096. Throughput: 0: 46878.0. Samples: 1219296000. Policy #0 lag: (min: 4.0, avg: 37.7, max: 83.0) [2024-03-21 05:58:45,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 05:58:45,708][04017] Updated weights for policy 0, policy_version 37173 (0.0010) [2024-03-21 05:58:50,521][03784] Fps is (10 sec: 39321.9, 60 sec: 46421.5, 300 sec: 46319.5). Total num frames: 1218248704. Throughput: 0: 47908.9. Samples: 1219606300. Policy #0 lag: (min: 4.0, avg: 37.7, max: 83.0) [2024-03-21 05:58:50,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 05:58:53,343][03995] Signal inference workers to stop experience collection... (24550 times) [2024-03-21 05:58:53,424][03995] Signal inference workers to resume experience collection... (24550 times) [2024-03-21 05:58:53,445][04017] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-03-21 05:58:53,501][04017] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-03-21 05:58:54,481][04017] Updated weights for policy 0, policy_version 37183 (0.0016) [2024-03-21 05:58:55,521][03784] Fps is (10 sec: 42597.9, 60 sec: 44236.7, 300 sec: 45986.2). Total num frames: 1218478080. Throughput: 0: 48437.7. Samples: 1219906700. Policy #0 lag: (min: 4.0, avg: 37.7, max: 83.0) [2024-03-21 05:58:55,522][03784] Avg episode reward: [(0, '0.733')] [2024-03-21 05:58:58,315][04017] Updated weights for policy 0, policy_version 37193 (0.0012) [2024-03-21 05:59:00,521][03784] Fps is (10 sec: 62259.3, 60 sec: 44783.2, 300 sec: 47319.2). Total num frames: 1218871296. Throughput: 0: 47893.4. Samples: 1220035600. 
Policy #0 lag: (min: 4.0, avg: 37.7, max: 83.0) [2024-03-21 05:59:00,522][03784] Avg episode reward: [(0, '0.917')] [2024-03-21 05:59:05,521][03784] Fps is (10 sec: 55706.1, 60 sec: 44236.7, 300 sec: 46652.7). Total num frames: 1219035136. Throughput: 0: 46964.4. Samples: 1220306000. Policy #0 lag: (min: 3.0, avg: 37.8, max: 68.0) [2024-03-21 05:59:05,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 05:59:05,843][04017] Updated weights for policy 0, policy_version 37203 (0.0020) [2024-03-21 05:59:10,428][04017] Updated weights for policy 0, policy_version 37213 (0.0037) [2024-03-21 05:59:10,521][03784] Fps is (10 sec: 52427.9, 60 sec: 46421.3, 300 sec: 46763.8). Total num frames: 1219395584. Throughput: 0: 46028.8. Samples: 1220537300. Policy #0 lag: (min: 3.0, avg: 37.8, max: 68.0) [2024-03-21 05:59:10,522][03784] Avg episode reward: [(0, '0.939')] [2024-03-21 05:59:14,485][04017] Updated weights for policy 0, policy_version 37223 (0.0017) [2024-03-21 05:59:15,521][03784] Fps is (10 sec: 75367.0, 60 sec: 51336.6, 300 sec: 47097.1). Total num frames: 1219788800. Throughput: 0: 45742.6. Samples: 1220671000. Policy #0 lag: (min: 3.0, avg: 37.8, max: 68.0) [2024-03-21 05:59:15,522][03784] Avg episode reward: [(0, '0.939')] [2024-03-21 05:59:20,521][03784] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 46208.4). Total num frames: 1219887104. Throughput: 0: 45089.0. Samples: 1220933000. Policy #0 lag: (min: 3.0, avg: 37.8, max: 68.0) [2024-03-21 05:59:20,522][03784] Avg episode reward: [(0, '1.138')] [2024-03-21 05:59:25,521][03784] Fps is (10 sec: 22937.5, 60 sec: 47513.6, 300 sec: 46208.4). Total num frames: 1220018176. Throughput: 0: 45568.9. Samples: 1221206800. 
Policy #0 lag: (min: 3.0, avg: 37.8, max: 68.0) [2024-03-21 05:59:25,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 05:59:26,336][04017] Updated weights for policy 0, policy_version 37233 (0.0011) [2024-03-21 05:59:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 48060.0, 300 sec: 45986.3). Total num frames: 1220214784. Throughput: 0: 45373.4. Samples: 1221337800. Policy #0 lag: (min: 0.0, avg: 40.7, max: 98.0) [2024-03-21 05:59:30,522][03784] Avg episode reward: [(0, '1.093')] [2024-03-21 05:59:35,521][03784] Fps is (10 sec: 22937.6, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1220247552. Throughput: 0: 44426.6. Samples: 1221605500. Policy #0 lag: (min: 0.0, avg: 40.7, max: 98.0) [2024-03-21 05:59:35,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 05:59:38,938][04017] Updated weights for policy 0, policy_version 37243 (0.0017) [2024-03-21 05:59:40,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44236.9, 300 sec: 45764.1). Total num frames: 1220509696. Throughput: 0: 43929.1. Samples: 1221883500. Policy #0 lag: (min: 0.0, avg: 40.7, max: 98.0) [2024-03-21 05:59:40,522][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 05:59:43,633][04017] Updated weights for policy 0, policy_version 37253 (0.0010) [2024-03-21 05:59:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1220706304. Throughput: 0: 44264.4. Samples: 1222027500. Policy #0 lag: (min: 0.0, avg: 40.7, max: 98.0) [2024-03-21 05:59:45,522][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 05:59:50,521][03784] Fps is (10 sec: 29490.8, 60 sec: 42598.3, 300 sec: 45541.9). Total num frames: 1220804608. Throughput: 0: 45042.2. Samples: 1222332900. Policy #0 lag: (min: 0.0, avg: 40.7, max: 98.0) [2024-03-21 05:59:50,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 05:59:53,364][03995] Signal inference workers to stop experience collection... 
(24600 times) [2024-03-21 05:59:53,365][03995] Signal inference workers to resume experience collection... (24600 times) [2024-03-21 05:59:53,427][04017] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-03-21 05:59:53,427][04017] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-03-21 05:59:54,229][04017] Updated weights for policy 0, policy_version 37263 (0.0016) [2024-03-21 05:59:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 46430.6). Total num frames: 1221132288. Throughput: 0: 45866.7. Samples: 1222601300. Policy #0 lag: (min: 0.0, avg: 24.5, max: 65.0) [2024-03-21 05:59:55,522][03784] Avg episode reward: [(0, '1.588')] [2024-03-21 05:59:57,879][04017] Updated weights for policy 0, policy_version 37273 (0.0025) [2024-03-21 06:00:00,521][03784] Fps is (10 sec: 72090.4, 60 sec: 44236.8, 300 sec: 46874.9). Total num frames: 1221525504. Throughput: 0: 45604.4. Samples: 1222723200. Policy #0 lag: (min: 0.0, avg: 24.5, max: 65.0) [2024-03-21 06:00:00,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 06:00:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037278_1221525504.pth... [2024-03-21 06:00:00,671][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000036940_1210449920.pth [2024-03-21 06:00:03,311][04017] Updated weights for policy 0, policy_version 37283 (0.0011) [2024-03-21 06:00:05,521][03784] Fps is (10 sec: 62259.0, 60 sec: 45329.1, 300 sec: 46652.8). Total num frames: 1221754880. Throughput: 0: 45248.8. Samples: 1222969200. Policy #0 lag: (min: 0.0, avg: 24.5, max: 65.0) [2024-03-21 06:00:05,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 06:00:10,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 46208.4). Total num frames: 1221984256. Throughput: 0: 45282.2. Samples: 1223244500. 
Policy #0 lag: (min: 0.0, avg: 24.5, max: 65.0) [2024-03-21 06:00:10,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 06:00:11,621][04017] Updated weights for policy 0, policy_version 37293 (0.0011) [2024-03-21 06:00:15,521][03784] Fps is (10 sec: 52429.0, 60 sec: 41506.1, 300 sec: 46208.4). Total num frames: 1222279168. Throughput: 0: 45513.3. Samples: 1223385900. Policy #0 lag: (min: 3.0, avg: 36.5, max: 62.0) [2024-03-21 06:00:15,522][03784] Avg episode reward: [(0, '0.522')] [2024-03-21 06:00:16,792][04017] Updated weights for policy 0, policy_version 37303 (0.0017) [2024-03-21 06:00:20,521][03784] Fps is (10 sec: 49151.0, 60 sec: 43144.4, 300 sec: 45653.0). Total num frames: 1222475776. Throughput: 0: 45508.7. Samples: 1223653400. Policy #0 lag: (min: 3.0, avg: 36.5, max: 62.0) [2024-03-21 06:00:20,522][03784] Avg episode reward: [(0, '1.189')] [2024-03-21 06:00:23,054][04017] Updated weights for policy 0, policy_version 37313 (0.0012) [2024-03-21 06:00:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1222705152. Throughput: 0: 45735.5. Samples: 1223941600. Policy #0 lag: (min: 3.0, avg: 36.5, max: 62.0) [2024-03-21 06:00:25,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 06:00:30,521][03784] Fps is (10 sec: 42599.2, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1222901760. Throughput: 0: 45371.2. Samples: 1224069200. Policy #0 lag: (min: 3.0, avg: 36.5, max: 62.0) [2024-03-21 06:00:30,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 06:00:32,609][04017] Updated weights for policy 0, policy_version 37323 (0.0023) [2024-03-21 06:00:35,521][03784] Fps is (10 sec: 39321.4, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 1223098368. Throughput: 0: 44644.5. Samples: 1224341900. 
Policy #0 lag: (min: 3.0, avg: 36.5, max: 62.0) [2024-03-21 06:00:35,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-21 06:00:40,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 1223294976. Throughput: 0: 44535.5. Samples: 1224605400. Policy #0 lag: (min: 0.0, avg: 37.5, max: 86.0) [2024-03-21 06:00:40,522][03784] Avg episode reward: [(0, '0.719')] [2024-03-21 06:00:41,673][04017] Updated weights for policy 0, policy_version 37333 (0.0023) [2024-03-21 06:00:43,635][03995] Signal inference workers to stop experience collection... (24650 times) [2024-03-21 06:00:43,636][03995] Signal inference workers to resume experience collection... (24650 times) [2024-03-21 06:00:43,716][04017] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-03-21 06:00:43,717][04017] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-03-21 06:00:45,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 46097.3). Total num frames: 1223589888. Throughput: 0: 44171.0. Samples: 1224710900. Policy #0 lag: (min: 0.0, avg: 37.5, max: 86.0) [2024-03-21 06:00:45,522][03784] Avg episode reward: [(0, '1.347')] [2024-03-21 06:00:46,995][04017] Updated weights for policy 0, policy_version 37343 (0.0030) [2024-03-21 06:00:50,521][03784] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 45986.3). Total num frames: 1223819264. Throughput: 0: 44802.3. Samples: 1224985300. Policy #0 lag: (min: 0.0, avg: 37.5, max: 86.0) [2024-03-21 06:00:50,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 06:00:53,007][04017] Updated weights for policy 0, policy_version 37353 (0.0016) [2024-03-21 06:00:55,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 46097.4). Total num frames: 1224048640. Throughput: 0: 44695.5. Samples: 1225255800. 
Policy #0 lag: (min: 0.0, avg: 37.5, max: 86.0) [2024-03-21 06:00:55,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 06:01:00,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.7, 300 sec: 45875.2). Total num frames: 1224179712. Throughput: 0: 44571.1. Samples: 1225391600. Policy #0 lag: (min: 0.0, avg: 37.5, max: 86.0) [2024-03-21 06:01:00,522][03784] Avg episode reward: [(0, '0.725')] [2024-03-21 06:01:05,288][04017] Updated weights for policy 0, policy_version 37363 (0.0015) [2024-03-21 06:01:05,521][03784] Fps is (10 sec: 26214.6, 60 sec: 42598.5, 300 sec: 45875.2). Total num frames: 1224310784. Throughput: 0: 45155.8. Samples: 1225685400. Policy #0 lag: (min: 2.0, avg: 24.1, max: 54.0) [2024-03-21 06:01:05,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 06:01:10,521][03784] Fps is (10 sec: 32768.8, 60 sec: 42052.4, 300 sec: 45653.1). Total num frames: 1224507392. Throughput: 0: 45082.4. Samples: 1225970300. Policy #0 lag: (min: 2.0, avg: 24.1, max: 54.0) [2024-03-21 06:01:10,521][03784] Avg episode reward: [(0, '0.751')] [2024-03-21 06:01:13,407][04017] Updated weights for policy 0, policy_version 37373 (0.0012) [2024-03-21 06:01:15,521][03784] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 45208.8). Total num frames: 1224704000. Throughput: 0: 45380.0. Samples: 1226111300. Policy #0 lag: (min: 2.0, avg: 24.1, max: 54.0) [2024-03-21 06:01:15,522][03784] Avg episode reward: [(0, '0.940')] [2024-03-21 06:01:20,521][03784] Fps is (10 sec: 39320.8, 60 sec: 40414.0, 300 sec: 44986.6). Total num frames: 1224900608. Throughput: 0: 45784.5. Samples: 1226402200. Policy #0 lag: (min: 2.0, avg: 24.1, max: 54.0) [2024-03-21 06:01:20,522][03784] Avg episode reward: [(0, '1.460')] [2024-03-21 06:01:21,446][04017] Updated weights for policy 0, policy_version 37383 (0.0011) [2024-03-21 06:01:25,521][03784] Fps is (10 sec: 52428.7, 60 sec: 42052.3, 300 sec: 45430.9). Total num frames: 1225228288. Throughput: 0: 45924.5. Samples: 1226672000. 
Policy #0 lag: (min: 2.0, avg: 24.1, max: 54.0) [2024-03-21 06:01:25,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 06:01:26,321][04017] Updated weights for policy 0, policy_version 37393 (0.0012) [2024-03-21 06:01:30,429][04017] Updated weights for policy 0, policy_version 37403 (0.0011) [2024-03-21 06:01:30,521][03784] Fps is (10 sec: 72089.5, 60 sec: 45329.0, 300 sec: 46541.7). Total num frames: 1225621504. Throughput: 0: 46573.4. Samples: 1226806700. Policy #0 lag: (min: 2.0, avg: 34.6, max: 101.0) [2024-03-21 06:01:30,522][03784] Avg episode reward: [(0, '1.172')] [2024-03-21 06:01:34,868][04017] Updated weights for policy 0, policy_version 37413 (0.0018) [2024-03-21 06:01:35,521][03784] Fps is (10 sec: 78642.8, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 1226014720. Throughput: 0: 46213.3. Samples: 1227064900. Policy #0 lag: (min: 2.0, avg: 34.6, max: 101.0) [2024-03-21 06:01:35,522][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 06:01:40,521][03784] Fps is (10 sec: 58982.6, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 1226211328. Throughput: 0: 46266.7. Samples: 1227337800. Policy #0 lag: (min: 2.0, avg: 34.6, max: 101.0) [2024-03-21 06:01:40,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 06:01:41,886][03995] Signal inference workers to stop experience collection... (24700 times) [2024-03-21 06:01:41,950][04017] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-03-21 06:01:42,168][03995] Signal inference workers to resume experience collection... (24700 times) [2024-03-21 06:01:42,169][04017] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-03-21 06:01:42,413][04017] Updated weights for policy 0, policy_version 37423 (0.0011) [2024-03-21 06:01:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 48059.8, 300 sec: 45653.0). Total num frames: 1226473472. Throughput: 0: 46391.2. Samples: 1227479200. 
Policy #0 lag: (min: 2.0, avg: 34.6, max: 101.0) [2024-03-21 06:01:45,522][03784] Avg episode reward: [(0, '0.471')] [2024-03-21 06:01:49,201][04017] Updated weights for policy 0, policy_version 37433 (0.0025) [2024-03-21 06:01:50,521][03784] Fps is (10 sec: 45875.0, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 1226670080. Throughput: 0: 46319.9. Samples: 1227769800. Policy #0 lag: (min: 2.0, avg: 34.6, max: 101.0) [2024-03-21 06:01:50,522][03784] Avg episode reward: [(0, '0.471')] [2024-03-21 06:01:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 1226833920. Throughput: 0: 46135.4. Samples: 1228046400. Policy #0 lag: (min: 0.0, avg: 61.5, max: 126.0) [2024-03-21 06:01:55,522][03784] Avg episode reward: [(0, '0.471')] [2024-03-21 06:01:59,035][04017] Updated weights for policy 0, policy_version 37443 (0.0011) [2024-03-21 06:02:00,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 1226997760. Throughput: 0: 46122.1. Samples: 1228186800. Policy #0 lag: (min: 0.0, avg: 61.5, max: 126.0) [2024-03-21 06:02:00,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 06:02:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037445_1226997760.pth... [2024-03-21 06:02:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037116_1216217088.pth [2024-03-21 06:02:05,521][03784] Fps is (10 sec: 26214.5, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1227096064. Throughput: 0: 45342.3. Samples: 1228442600. Policy #0 lag: (min: 0.0, avg: 61.5, max: 126.0) [2024-03-21 06:02:05,522][03784] Avg episode reward: [(0, '0.893')] [2024-03-21 06:02:09,064][04017] Updated weights for policy 0, policy_version 37453 (0.0017) [2024-03-21 06:02:10,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.3, 300 sec: 45430.9). Total num frames: 1227325440. Throughput: 0: 45217.7. Samples: 1228706800. 
Policy #0 lag: (min: 0.0, avg: 61.5, max: 126.0) [2024-03-21 06:02:10,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 06:02:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1227489280. Throughput: 0: 45026.7. Samples: 1228832900. Policy #0 lag: (min: 0.0, avg: 61.5, max: 126.0) [2024-03-21 06:02:15,522][03784] Avg episode reward: [(0, '1.510')] [2024-03-21 06:02:16,273][04017] Updated weights for policy 0, policy_version 37463 (0.0011) [2024-03-21 06:02:20,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48605.8, 300 sec: 45875.2). Total num frames: 1227816960. Throughput: 0: 45300.0. Samples: 1229103400. Policy #0 lag: (min: 2.0, avg: 25.5, max: 65.0) [2024-03-21 06:02:20,522][03784] Avg episode reward: [(0, '0.896')] [2024-03-21 06:02:23,102][04017] Updated weights for policy 0, policy_version 37473 (0.0020) [2024-03-21 06:02:25,521][03784] Fps is (10 sec: 52428.1, 60 sec: 46421.2, 300 sec: 45430.9). Total num frames: 1228013568. Throughput: 0: 45555.4. Samples: 1229387800. Policy #0 lag: (min: 2.0, avg: 25.5, max: 65.0) [2024-03-21 06:02:25,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 06:02:29,957][04017] Updated weights for policy 0, policy_version 37483 (0.0015) [2024-03-21 06:02:30,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1228275712. Throughput: 0: 45311.1. Samples: 1229518200. Policy #0 lag: (min: 2.0, avg: 25.5, max: 65.0) [2024-03-21 06:02:30,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 06:02:35,521][03784] Fps is (10 sec: 52429.7, 60 sec: 42052.3, 300 sec: 45986.3). Total num frames: 1228537856. Throughput: 0: 44662.3. Samples: 1229779600. 
Policy #0 lag: (min: 2.0, avg: 25.5, max: 65.0) [2024-03-21 06:02:35,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 06:02:36,247][04017] Updated weights for policy 0, policy_version 37493 (0.0012) [2024-03-21 06:02:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 45986.3). Total num frames: 1228668928. Throughput: 0: 44671.1. Samples: 1230056600. Policy #0 lag: (min: 2.0, avg: 25.5, max: 65.0) [2024-03-21 06:02:40,522][03784] Avg episode reward: [(0, '0.673')] [2024-03-21 06:02:43,852][04017] Updated weights for policy 0, policy_version 37503 (0.0018) [2024-03-21 06:02:43,930][03995] Signal inference workers to stop experience collection... (24750 times) [2024-03-21 06:02:43,931][03995] Signal inference workers to resume experience collection... (24750 times) [2024-03-21 06:02:43,983][04017] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-03-21 06:02:43,984][04017] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-03-21 06:02:45,521][03784] Fps is (10 sec: 45874.0, 60 sec: 42052.1, 300 sec: 45875.2). Total num frames: 1228996608. Throughput: 0: 44593.2. Samples: 1230193500. Policy #0 lag: (min: 0.0, avg: 50.5, max: 120.0) [2024-03-21 06:02:45,522][03784] Avg episode reward: [(0, '0.894')] [2024-03-21 06:02:50,521][03784] Fps is (10 sec: 52428.0, 60 sec: 42052.2, 300 sec: 45319.8). Total num frames: 1229193216. Throughput: 0: 44924.2. Samples: 1230464200. Policy #0 lag: (min: 0.0, avg: 50.5, max: 120.0) [2024-03-21 06:02:50,522][03784] Avg episode reward: [(0, '1.241')] [2024-03-21 06:02:50,920][04017] Updated weights for policy 0, policy_version 37513 (0.0011) [2024-03-21 06:02:55,521][03784] Fps is (10 sec: 49152.2, 60 sec: 44236.7, 300 sec: 45097.7). Total num frames: 1229488128. Throughput: 0: 44717.7. Samples: 1230719100. 
Policy #0 lag: (min: 0.0, avg: 50.5, max: 120.0) [2024-03-21 06:02:55,522][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 06:02:56,186][04017] Updated weights for policy 0, policy_version 37523 (0.0016) [2024-03-21 06:03:00,521][03784] Fps is (10 sec: 52429.8, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1229717504. Throughput: 0: 44837.8. Samples: 1230850600. Policy #0 lag: (min: 0.0, avg: 50.5, max: 120.0) [2024-03-21 06:03:00,522][03784] Avg episode reward: [(0, '1.057')] [2024-03-21 06:03:04,452][04017] Updated weights for policy 0, policy_version 37533 (0.0014) [2024-03-21 06:03:05,521][03784] Fps is (10 sec: 45876.0, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1229946880. Throughput: 0: 44784.5. Samples: 1231118700. Policy #0 lag: (min: 0.0, avg: 50.5, max: 120.0) [2024-03-21 06:03:05,522][03784] Avg episode reward: [(0, '1.572')] [2024-03-21 06:03:10,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 1230110720. Throughput: 0: 44342.3. Samples: 1231383200. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 06:03:10,522][03784] Avg episode reward: [(0, '1.177')] [2024-03-21 06:03:11,342][04017] Updated weights for policy 0, policy_version 37543 (0.0019) [2024-03-21 06:03:15,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 45542.0). Total num frames: 1230307328. Throughput: 0: 44368.9. Samples: 1231514800. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 06:03:15,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 06:03:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 1230471168. Throughput: 0: 45128.8. Samples: 1231810400. 
Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 06:03:20,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 06:03:20,961][04017] Updated weights for policy 0, policy_version 37553 (0.0019) [2024-03-21 06:03:25,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.9, 300 sec: 45208.8). Total num frames: 1230667776. Throughput: 0: 45262.3. Samples: 1232093400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 06:03:25,522][03784] Avg episode reward: [(0, '0.654')] [2024-03-21 06:03:28,464][04017] Updated weights for policy 0, policy_version 37563 (0.0019) [2024-03-21 06:03:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 44236.7, 300 sec: 45430.9). Total num frames: 1230929920. Throughput: 0: 44460.1. Samples: 1232194200. Policy #0 lag: (min: 0.0, avg: 38.5, max: 76.0) [2024-03-21 06:03:30,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 06:03:34,960][04017] Updated weights for policy 0, policy_version 37573 (0.0011) [2024-03-21 06:03:35,521][03784] Fps is (10 sec: 55705.6, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1231224832. Throughput: 0: 44631.3. Samples: 1232472600. Policy #0 lag: (min: 0.0, avg: 33.9, max: 118.0) [2024-03-21 06:03:35,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 06:03:40,521][03784] Fps is (10 sec: 49152.4, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1231421440. Throughput: 0: 45326.8. Samples: 1232758800. Policy #0 lag: (min: 0.0, avg: 33.9, max: 118.0) [2024-03-21 06:03:40,522][03784] Avg episode reward: [(0, '1.676')] [2024-03-21 06:03:40,542][03995] Signal inference workers to stop experience collection... (24800 times) [2024-03-21 06:03:40,543][03995] Signal inference workers to resume experience collection... 
(24800 times) [2024-03-21 06:03:40,609][04017] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-03-21 06:03:40,609][04017] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-03-21 06:03:43,902][04017] Updated weights for policy 0, policy_version 37583 (0.0011) [2024-03-21 06:03:45,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42598.6, 300 sec: 45097.7). Total num frames: 1231552512. Throughput: 0: 45686.7. Samples: 1232906500. Policy #0 lag: (min: 0.0, avg: 33.9, max: 118.0) [2024-03-21 06:03:45,522][03784] Avg episode reward: [(0, '1.085')] [2024-03-21 06:03:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.7, 300 sec: 45097.7). Total num frames: 1231781888. Throughput: 0: 45677.8. Samples: 1233174200. Policy #0 lag: (min: 0.0, avg: 33.9, max: 118.0) [2024-03-21 06:03:50,522][03784] Avg episode reward: [(0, '0.544')] [2024-03-21 06:03:51,015][04017] Updated weights for policy 0, policy_version 37593 (0.0012) [2024-03-21 06:03:55,521][03784] Fps is (10 sec: 52429.4, 60 sec: 43144.8, 300 sec: 44764.4). Total num frames: 1232076800. Throughput: 0: 45862.4. Samples: 1233447000. Policy #0 lag: (min: 0.0, avg: 33.9, max: 118.0) [2024-03-21 06:03:55,521][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 06:03:56,489][04017] Updated weights for policy 0, policy_version 37603 (0.0018) [2024-03-21 06:04:00,521][03784] Fps is (10 sec: 62259.2, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1232404480. Throughput: 0: 45180.0. Samples: 1233547900. Policy #0 lag: (min: 2.0, avg: 41.6, max: 97.0) [2024-03-21 06:04:00,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 06:04:00,538][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037610_1232404480.pth... 
[2024-03-21 06:04:00,658][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037278_1221525504.pth [2024-03-21 06:04:02,251][04017] Updated weights for policy 0, policy_version 37613 (0.0012) [2024-03-21 06:04:05,521][03784] Fps is (10 sec: 62258.2, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1232699392. Throughput: 0: 44626.7. Samples: 1233818600. Policy #0 lag: (min: 2.0, avg: 41.6, max: 97.0) [2024-03-21 06:04:05,522][03784] Avg episode reward: [(0, '1.420')] [2024-03-21 06:04:08,359][04017] Updated weights for policy 0, policy_version 37623 (0.0014) [2024-03-21 06:04:10,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 1232896000. Throughput: 0: 44175.5. Samples: 1234081300. Policy #0 lag: (min: 2.0, avg: 41.6, max: 97.0) [2024-03-21 06:04:10,522][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 06:04:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1233125376. Throughput: 0: 45033.4. Samples: 1234220700. Policy #0 lag: (min: 2.0, avg: 41.6, max: 97.0) [2024-03-21 06:04:15,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 06:04:15,946][04017] Updated weights for policy 0, policy_version 37634 (0.0011) [2024-03-21 06:04:20,521][03784] Fps is (10 sec: 55706.8, 60 sec: 49698.3, 300 sec: 45542.0). Total num frames: 1233453056. Throughput: 0: 44606.8. Samples: 1234479900. Policy #0 lag: (min: 2.0, avg: 41.6, max: 97.0) [2024-03-21 06:04:20,521][03784] Avg episode reward: [(0, '0.764')] [2024-03-21 06:04:24,998][04017] Updated weights for policy 0, policy_version 37644 (0.0019) [2024-03-21 06:04:25,521][03784] Fps is (10 sec: 42598.6, 60 sec: 48059.7, 300 sec: 45208.7). Total num frames: 1233551360. Throughput: 0: 44566.7. Samples: 1234764300. 
Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 06:04:25,522][03784] Avg episode reward: [(0, '1.515')] [2024-03-21 06:04:30,521][03784] Fps is (10 sec: 19660.4, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1233649664. Throughput: 0: 44648.8. Samples: 1234915700. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 06:04:30,522][03784] Avg episode reward: [(0, '1.388')] [2024-03-21 06:04:35,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43144.5, 300 sec: 45097.7). Total num frames: 1233813504. Throughput: 0: 45131.1. Samples: 1235205100. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 06:04:35,522][03784] Avg episode reward: [(0, '1.388')] [2024-03-21 06:04:35,569][04017] Updated weights for policy 0, policy_version 37654 (0.0011) [2024-03-21 06:04:40,518][03995] Signal inference workers to stop experience collection... (24850 times) [2024-03-21 06:04:40,518][03995] Signal inference workers to resume experience collection... (24850 times) [2024-03-21 06:04:40,521][03784] Fps is (10 sec: 32768.4, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1233977344. Throughput: 0: 45595.5. Samples: 1235498800. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 06:04:40,522][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 06:04:40,596][04017] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-03-21 06:04:40,597][04017] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-03-21 06:04:44,000][04017] Updated weights for policy 0, policy_version 37664 (0.0020) [2024-03-21 06:04:45,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1234239488. Throughput: 0: 46362.2. Samples: 1235634200. Policy #0 lag: (min: 0.0, avg: 40.5, max: 87.0) [2024-03-21 06:04:45,522][03784] Avg episode reward: [(0, '1.117')] [2024-03-21 06:04:50,521][03784] Fps is (10 sec: 49152.3, 60 sec: 44783.0, 300 sec: 45208.8). Total num frames: 1234468864. 
Throughput: 0: 46829.1. Samples: 1235925900. Policy #0 lag: (min: 0.0, avg: 26.5, max: 60.0) [2024-03-21 06:04:50,522][03784] Avg episode reward: [(0, '1.626')] [2024-03-21 06:04:50,817][04017] Updated weights for policy 0, policy_version 37674 (0.0018) [2024-03-21 06:04:55,521][03784] Fps is (10 sec: 52428.5, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 1234763776. Throughput: 0: 47066.6. Samples: 1236199300. Policy #0 lag: (min: 0.0, avg: 26.5, max: 60.0) [2024-03-21 06:04:55,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 06:04:56,095][04017] Updated weights for policy 0, policy_version 37684 (0.0016) [2024-03-21 06:05:00,024][04017] Updated weights for policy 0, policy_version 37694 (0.0013) [2024-03-21 06:05:00,521][03784] Fps is (10 sec: 72088.9, 60 sec: 46421.4, 300 sec: 45542.0). Total num frames: 1235189760. Throughput: 0: 46928.9. Samples: 1236332500. Policy #0 lag: (min: 0.0, avg: 26.5, max: 60.0) [2024-03-21 06:05:00,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 06:05:05,205][04017] Updated weights for policy 0, policy_version 37704 (0.0015) [2024-03-21 06:05:05,521][03784] Fps is (10 sec: 72090.1, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 1235484672. Throughput: 0: 47315.4. Samples: 1236609100. Policy #0 lag: (min: 0.0, avg: 26.5, max: 60.0) [2024-03-21 06:05:05,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 06:05:10,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.4, 300 sec: 45430.9). Total num frames: 1235681280. Throughput: 0: 47224.4. Samples: 1236889400. Policy #0 lag: (min: 0.0, avg: 26.5, max: 60.0) [2024-03-21 06:05:10,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 06:05:12,423][04017] Updated weights for policy 0, policy_version 37714 (0.0010) [2024-03-21 06:05:15,521][03784] Fps is (10 sec: 42599.1, 60 sec: 46421.4, 300 sec: 45542.0). Total num frames: 1235910656. Throughput: 0: 46911.3. Samples: 1237026700. 
Policy #0 lag: (min: 0.0, avg: 26.5, max: 60.0) [2024-03-21 06:05:15,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 06:05:20,521][03784] Fps is (10 sec: 39321.9, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1236074496. Throughput: 0: 46857.8. Samples: 1237313700. Policy #0 lag: (min: 0.0, avg: 37.5, max: 84.0) [2024-03-21 06:05:20,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 06:05:20,987][04017] Updated weights for policy 0, policy_version 37724 (0.0010) [2024-03-21 06:05:25,266][03995] Signal inference workers to stop experience collection... (24900 times) [2024-03-21 06:05:25,291][04017] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-03-21 06:05:25,521][03784] Fps is (10 sec: 49151.7, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 1236402176. Throughput: 0: 45864.4. Samples: 1237562700. Policy #0 lag: (min: 0.0, avg: 37.5, max: 84.0) [2024-03-21 06:05:25,521][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 06:05:25,605][03995] Signal inference workers to resume experience collection... (24900 times) [2024-03-21 06:05:25,606][04017] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-03-21 06:05:25,944][04017] Updated weights for policy 0, policy_version 37734 (0.0016) [2024-03-21 06:05:30,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 45653.1). Total num frames: 1236566016. Throughput: 0: 45324.5. Samples: 1237673800. Policy #0 lag: (min: 0.0, avg: 37.5, max: 84.0) [2024-03-21 06:05:30,522][03784] Avg episode reward: [(0, '0.916')] [2024-03-21 06:05:35,521][03784] Fps is (10 sec: 22937.3, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 1236631552. Throughput: 0: 44062.0. Samples: 1237908700. 
Policy #0 lag: (min: 0.0, avg: 37.5, max: 84.0) [2024-03-21 06:05:35,522][03784] Avg episode reward: [(0, '1.594')] [2024-03-21 06:05:38,630][04017] Updated weights for policy 0, policy_version 37744 (0.0026) [2024-03-21 06:05:40,521][03784] Fps is (10 sec: 32767.9, 60 sec: 48605.8, 300 sec: 45097.7). Total num frames: 1236893696. Throughput: 0: 43671.2. Samples: 1238164500. Policy #0 lag: (min: 0.0, avg: 37.3, max: 89.0) [2024-03-21 06:05:40,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-21 06:05:44,461][04017] Updated weights for policy 0, policy_version 37754 (0.0014) [2024-03-21 06:05:45,521][03784] Fps is (10 sec: 55706.2, 60 sec: 49152.1, 300 sec: 45319.8). Total num frames: 1237188608. Throughput: 0: 43724.5. Samples: 1238300100. Policy #0 lag: (min: 0.0, avg: 37.3, max: 89.0) [2024-03-21 06:05:45,522][03784] Avg episode reward: [(0, '1.612')] [2024-03-21 06:05:50,521][03784] Fps is (10 sec: 32767.5, 60 sec: 45875.0, 300 sec: 44653.3). Total num frames: 1237221376. Throughput: 0: 44299.9. Samples: 1238602600. Policy #0 lag: (min: 0.0, avg: 37.3, max: 89.0) [2024-03-21 06:05:50,522][03784] Avg episode reward: [(0, '1.612')] [2024-03-21 06:05:53,652][04017] Updated weights for policy 0, policy_version 37764 (0.0010) [2024-03-21 06:05:55,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1237483520. Throughput: 0: 44355.6. Samples: 1238885400. Policy #0 lag: (min: 0.0, avg: 37.3, max: 89.0) [2024-03-21 06:05:55,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 06:06:00,521][03784] Fps is (10 sec: 36045.3, 60 sec: 39867.7, 300 sec: 44986.6). Total num frames: 1237581824. Throughput: 0: 44168.7. Samples: 1239014300. Policy #0 lag: (min: 0.0, avg: 37.3, max: 89.0) [2024-03-21 06:06:00,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 06:06:00,974][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037769_1237614592.pth... 
[2024-03-21 06:06:01,032][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037445_1226997760.pth [2024-03-21 06:06:03,811][04017] Updated weights for policy 0, policy_version 37774 (0.0015) [2024-03-21 06:06:05,521][03784] Fps is (10 sec: 36045.1, 60 sec: 39321.7, 300 sec: 45208.7). Total num frames: 1237843968. Throughput: 0: 43744.5. Samples: 1239282200. Policy #0 lag: (min: 0.0, avg: 38.9, max: 88.0) [2024-03-21 06:06:05,522][03784] Avg episode reward: [(0, '1.708')] [2024-03-21 06:06:08,593][04017] Updated weights for policy 0, policy_version 37784 (0.0019) [2024-03-21 06:06:10,521][03784] Fps is (10 sec: 68812.3, 60 sec: 43144.5, 300 sec: 45986.3). Total num frames: 1238269952. Throughput: 0: 44275.4. Samples: 1239555100. Policy #0 lag: (min: 0.0, avg: 38.9, max: 88.0) [2024-03-21 06:06:10,522][03784] Avg episode reward: [(0, '1.708')] [2024-03-21 06:06:15,521][03784] Fps is (10 sec: 52428.8, 60 sec: 40959.9, 300 sec: 45653.1). Total num frames: 1238368256. Throughput: 0: 45277.8. Samples: 1239711300. Policy #0 lag: (min: 0.0, avg: 38.9, max: 88.0) [2024-03-21 06:06:15,522][03784] Avg episode reward: [(0, '1.708')] [2024-03-21 06:06:16,052][04017] Updated weights for policy 0, policy_version 37794 (0.0016) [2024-03-21 06:06:20,521][03784] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 45542.0). Total num frames: 1238663168. Throughput: 0: 46080.1. Samples: 1239982300. Policy #0 lag: (min: 0.0, avg: 38.9, max: 88.0) [2024-03-21 06:06:20,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 06:06:22,366][04017] Updated weights for policy 0, policy_version 37804 (0.0012) [2024-03-21 06:06:25,521][03784] Fps is (10 sec: 68812.3, 60 sec: 44236.7, 300 sec: 45542.0). Total num frames: 1239056384. Throughput: 0: 45886.6. Samples: 1240229400. 
Policy #0 lag: (min: 0.0, avg: 38.9, max: 88.0) [2024-03-21 06:06:25,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 06:06:25,828][04017] Updated weights for policy 0, policy_version 37814 (0.0022) [2024-03-21 06:06:25,896][03995] Signal inference workers to stop experience collection... (24950 times) [2024-03-21 06:06:25,897][03995] Signal inference workers to resume experience collection... (24950 times) [2024-03-21 06:06:25,942][04017] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-03-21 06:06:25,943][04017] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-03-21 06:06:30,521][03784] Fps is (10 sec: 55705.2, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1239220224. Throughput: 0: 45571.1. Samples: 1240350800. Policy #0 lag: (min: 0.0, avg: 38.9, max: 88.0) [2024-03-21 06:06:30,522][03784] Avg episode reward: [(0, '1.204')] [2024-03-21 06:06:34,570][04017] Updated weights for policy 0, policy_version 37824 (0.0011) [2024-03-21 06:06:35,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1239449600. Throughput: 0: 44842.3. Samples: 1240620500. Policy #0 lag: (min: 0.0, avg: 58.4, max: 114.0) [2024-03-21 06:06:35,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 06:06:40,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46967.4, 300 sec: 44875.5). Total num frames: 1239711744. Throughput: 0: 44675.5. Samples: 1240895800. Policy #0 lag: (min: 0.0, avg: 58.4, max: 114.0) [2024-03-21 06:06:40,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 06:06:40,793][04017] Updated weights for policy 0, policy_version 37834 (0.0011) [2024-03-21 06:06:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 1239875584. Throughput: 0: 44851.1. Samples: 1241032600. 
Policy #0 lag: (min: 0.0, avg: 58.4, max: 114.0) [2024-03-21 06:06:45,522][03784] Avg episode reward: [(0, '0.678')] [2024-03-21 06:06:50,521][03784] Fps is (10 sec: 32768.5, 60 sec: 46967.6, 300 sec: 44764.4). Total num frames: 1240039424. Throughput: 0: 44088.9. Samples: 1241266200. Policy #0 lag: (min: 0.0, avg: 58.4, max: 114.0) [2024-03-21 06:06:50,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 06:06:51,567][04017] Updated weights for policy 0, policy_version 37844 (0.0012) [2024-03-21 06:06:55,521][03784] Fps is (10 sec: 22937.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1240104960. Throughput: 0: 43611.1. Samples: 1241517600. Policy #0 lag: (min: 0.0, avg: 58.4, max: 114.0) [2024-03-21 06:06:55,523][03784] Avg episode reward: [(0, '0.518')] [2024-03-21 06:07:00,371][04017] Updated weights for policy 0, policy_version 37854 (0.0015) [2024-03-21 06:07:00,521][03784] Fps is (10 sec: 36043.8, 60 sec: 46967.3, 300 sec: 45097.6). Total num frames: 1240399872. Throughput: 0: 42866.4. Samples: 1241640300. Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 06:07:00,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 06:07:04,190][04017] Updated weights for policy 0, policy_version 37864 (0.0012) [2024-03-21 06:07:05,521][03784] Fps is (10 sec: 65535.9, 60 sec: 48605.8, 300 sec: 45542.0). Total num frames: 1240760320. Throughput: 0: 42535.4. Samples: 1241896400. Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 06:07:05,522][03784] Avg episode reward: [(0, '1.008')] [2024-03-21 06:07:10,521][03784] Fps is (10 sec: 62259.3, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 1241022464. Throughput: 0: 43159.9. Samples: 1242171600. 
Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 06:07:10,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 06:07:10,886][04017] Updated weights for policy 0, policy_version 37874 (0.0010) [2024-03-21 06:07:15,521][03784] Fps is (10 sec: 42598.9, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1241186304. Throughput: 0: 43555.6. Samples: 1242310800. Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 06:07:15,522][03784] Avg episode reward: [(0, '0.768')] [2024-03-21 06:07:20,521][03784] Fps is (10 sec: 29491.6, 60 sec: 44236.7, 300 sec: 45097.7). Total num frames: 1241317376. Throughput: 0: 43631.1. Samples: 1242583900. Policy #0 lag: (min: 0.0, avg: 33.5, max: 81.0) [2024-03-21 06:07:20,522][03784] Avg episode reward: [(0, '1.158')] [2024-03-21 06:07:22,815][04017] Updated weights for policy 0, policy_version 37884 (0.0015) [2024-03-21 06:07:25,521][03784] Fps is (10 sec: 22937.4, 60 sec: 39321.6, 300 sec: 44542.3). Total num frames: 1241415680. Throughput: 0: 43935.6. Samples: 1242872900. Policy #0 lag: (min: 0.0, avg: 35.6, max: 72.0) [2024-03-21 06:07:25,522][03784] Avg episode reward: [(0, '0.548')] [2024-03-21 06:07:30,521][03784] Fps is (10 sec: 29491.2, 60 sec: 39867.7, 300 sec: 44320.1). Total num frames: 1241612288. Throughput: 0: 43857.8. Samples: 1243006200. Policy #0 lag: (min: 0.0, avg: 35.6, max: 72.0) [2024-03-21 06:07:30,522][03784] Avg episode reward: [(0, '0.998')] [2024-03-21 06:07:31,351][04017] Updated weights for policy 0, policy_version 37894 (0.0020) [2024-03-21 06:07:35,521][03784] Fps is (10 sec: 42599.2, 60 sec: 39867.9, 300 sec: 44653.4). Total num frames: 1241841664. Throughput: 0: 45044.5. Samples: 1243293200. Policy #0 lag: (min: 0.0, avg: 35.6, max: 72.0) [2024-03-21 06:07:35,521][03784] Avg episode reward: [(0, '0.998')] [2024-03-21 06:07:38,223][03995] Signal inference workers to stop experience collection... 
(25000 times) [2024-03-21 06:07:38,224][03995] Signal inference workers to resume experience collection... (25000 times) [2024-03-21 06:07:38,372][04017] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-03-21 06:07:38,372][04017] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-03-21 06:07:38,555][04017] Updated weights for policy 0, policy_version 37904 (0.0018) [2024-03-21 06:07:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 39321.6, 300 sec: 44320.1). Total num frames: 1242071040. Throughput: 0: 45511.2. Samples: 1243565600. Policy #0 lag: (min: 0.0, avg: 35.6, max: 72.0) [2024-03-21 06:07:40,522][03784] Avg episode reward: [(0, '1.165')] [2024-03-21 06:07:43,620][04017] Updated weights for policy 0, policy_version 37914 (0.0013) [2024-03-21 06:07:45,521][03784] Fps is (10 sec: 72090.0, 60 sec: 44783.1, 300 sec: 45319.9). Total num frames: 1242562560. Throughput: 0: 45291.5. Samples: 1243678400. Policy #0 lag: (min: 0.0, avg: 35.6, max: 72.0) [2024-03-21 06:07:45,521][03784] Avg episode reward: [(0, '1.593')] [2024-03-21 06:07:48,084][04017] Updated weights for policy 0, policy_version 37924 (0.0014) [2024-03-21 06:07:50,521][03784] Fps is (10 sec: 81920.2, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 1242890240. Throughput: 0: 44760.1. Samples: 1243910600. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 06:07:50,522][03784] Avg episode reward: [(0, '0.624')] [2024-03-21 06:07:55,328][04017] Updated weights for policy 0, policy_version 37934 (0.0012) [2024-03-21 06:07:55,521][03784] Fps is (10 sec: 45873.4, 60 sec: 48605.8, 300 sec: 45097.6). Total num frames: 1243021312. Throughput: 0: 45044.4. Samples: 1244198600. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 06:07:55,523][03784] Avg episode reward: [(0, '0.701')] [2024-03-21 06:08:00,521][03784] Fps is (10 sec: 26214.1, 60 sec: 45875.3, 300 sec: 44764.4). Total num frames: 1243152384. Throughput: 0: 45226.5. 
Samples: 1244346000. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 06:08:00,524][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 06:08:00,816][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037939_1243185152.pth... [2024-03-21 06:08:00,951][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037610_1232404480.pth [2024-03-21 06:08:05,521][03784] Fps is (10 sec: 26214.9, 60 sec: 42052.3, 300 sec: 44653.3). Total num frames: 1243283456. Throughput: 0: 46104.5. Samples: 1244658600. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 06:08:05,522][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 06:08:06,019][04017] Updated weights for policy 0, policy_version 37944 (0.0012) [2024-03-21 06:08:10,521][03784] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 44875.5). Total num frames: 1243545600. Throughput: 0: 45864.5. Samples: 1244936800. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 06:08:10,522][03784] Avg episode reward: [(0, '0.940')] [2024-03-21 06:08:12,112][04017] Updated weights for policy 0, policy_version 37954 (0.0011) [2024-03-21 06:08:15,521][03784] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 1243807744. Throughput: 0: 45953.3. Samples: 1245074100. Policy #0 lag: (min: 2.0, avg: 33.9, max: 70.0) [2024-03-21 06:08:15,522][03784] Avg episode reward: [(0, '0.940')] [2024-03-21 06:08:17,405][04017] Updated weights for policy 0, policy_version 37964 (0.0020) [2024-03-21 06:08:20,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1244037120. Throughput: 0: 45728.8. Samples: 1245351000. Policy #0 lag: (min: 2.0, avg: 33.9, max: 70.0) [2024-03-21 06:08:20,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 06:08:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1244233728. Throughput: 0: 46148.8. Samples: 1245642300. 
Policy #0 lag: (min: 2.0, avg: 33.9, max: 70.0) [2024-03-21 06:08:25,522][03784] Avg episode reward: [(0, '1.320')] [2024-03-21 06:08:25,925][03995] Signal inference workers to stop experience collection... (25050 times) [2024-03-21 06:08:25,926][03995] Signal inference workers to resume experience collection... (25050 times) [2024-03-21 06:08:26,025][04017] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-03-21 06:08:26,026][04017] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-03-21 06:08:26,605][04017] Updated weights for policy 0, policy_version 37974 (0.0019) [2024-03-21 06:08:30,521][03784] Fps is (10 sec: 55705.6, 60 sec: 49698.2, 300 sec: 45319.8). Total num frames: 1244594176. Throughput: 0: 46377.6. Samples: 1245765400. Policy #0 lag: (min: 2.0, avg: 33.9, max: 70.0) [2024-03-21 06:08:30,522][03784] Avg episode reward: [(0, '1.320')] [2024-03-21 06:08:31,718][04017] Updated weights for policy 0, policy_version 37984 (0.0010) [2024-03-21 06:08:35,521][03784] Fps is (10 sec: 62259.1, 60 sec: 50244.1, 300 sec: 45542.0). Total num frames: 1244856320. Throughput: 0: 47342.1. Samples: 1246041000. Policy #0 lag: (min: 2.0, avg: 33.9, max: 70.0) [2024-03-21 06:08:35,523][03784] Avg episode reward: [(0, '1.239')] [2024-03-21 06:08:37,495][04017] Updated weights for policy 0, policy_version 37994 (0.0011) [2024-03-21 06:08:40,521][03784] Fps is (10 sec: 42598.2, 60 sec: 49152.0, 300 sec: 45653.0). Total num frames: 1245020160. Throughput: 0: 47380.2. Samples: 1246330700. Policy #0 lag: (min: 1.0, avg: 49.3, max: 101.0) [2024-03-21 06:08:40,522][03784] Avg episode reward: [(0, '1.387')] [2024-03-21 06:08:45,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45328.9, 300 sec: 45764.1). Total num frames: 1245282304. Throughput: 0: 47386.8. Samples: 1246478400. 
Policy #0 lag: (min: 1.0, avg: 49.3, max: 101.0) [2024-03-21 06:08:45,522][03784] Avg episode reward: [(0, '1.516')] [2024-03-21 06:08:46,717][04017] Updated weights for policy 0, policy_version 38004 (0.0014) [2024-03-21 06:08:50,521][03784] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 1245511680. Throughput: 0: 45980.0. Samples: 1246727700. Policy #0 lag: (min: 1.0, avg: 49.3, max: 101.0) [2024-03-21 06:08:50,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 06:08:55,295][04017] Updated weights for policy 0, policy_version 38014 (0.0012) [2024-03-21 06:08:55,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1245642752. Throughput: 0: 46060.0. Samples: 1247009500. Policy #0 lag: (min: 1.0, avg: 49.3, max: 101.0) [2024-03-21 06:08:55,522][03784] Avg episode reward: [(0, '1.189')] [2024-03-21 06:09:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45875.3, 300 sec: 44764.4). Total num frames: 1245904896. Throughput: 0: 45746.7. Samples: 1247132700. Policy #0 lag: (min: 1.0, avg: 49.3, max: 101.0) [2024-03-21 06:09:00,522][03784] Avg episode reward: [(0, '0.821')] [2024-03-21 06:09:01,203][04017] Updated weights for policy 0, policy_version 38024 (0.0019) [2024-03-21 06:09:05,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48059.8, 300 sec: 44986.6). Total num frames: 1246167040. Throughput: 0: 45397.8. Samples: 1247393900. Policy #0 lag: (min: 1.0, avg: 38.7, max: 79.0) [2024-03-21 06:09:05,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 06:09:06,619][04017] Updated weights for policy 0, policy_version 38034 (0.0012) [2024-03-21 06:09:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1246330880. Throughput: 0: 45388.9. Samples: 1247684800. 
Policy #0 lag: (min: 1.0, avg: 38.7, max: 79.0) [2024-03-21 06:09:10,522][03784] Avg episode reward: [(0, '0.824')] [2024-03-21 06:09:12,572][03995] Signal inference workers to stop experience collection... (25100 times) [2024-03-21 06:09:12,631][04017] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-03-21 06:09:12,876][03995] Signal inference workers to resume experience collection... (25100 times) [2024-03-21 06:09:12,876][04017] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-03-21 06:09:14,084][04017] Updated weights for policy 0, policy_version 38044 (0.0015) [2024-03-21 06:09:15,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48605.9, 300 sec: 44986.6). Total num frames: 1246724096. Throughput: 0: 45413.3. Samples: 1247809000. Policy #0 lag: (min: 1.0, avg: 38.7, max: 79.0) [2024-03-21 06:09:15,522][03784] Avg episode reward: [(0, '1.448')] [2024-03-21 06:09:18,691][04017] Updated weights for policy 0, policy_version 38054 (0.0012) [2024-03-21 06:09:20,521][03784] Fps is (10 sec: 72090.0, 60 sec: 50244.3, 300 sec: 45764.1). Total num frames: 1247051776. Throughput: 0: 45231.2. Samples: 1248076400. Policy #0 lag: (min: 1.0, avg: 38.7, max: 79.0) [2024-03-21 06:09:20,522][03784] Avg episode reward: [(0, '1.383')] [2024-03-21 06:09:25,521][03784] Fps is (10 sec: 55705.4, 60 sec: 50790.4, 300 sec: 46208.4). Total num frames: 1247281152. Throughput: 0: 44868.8. Samples: 1248349800. Policy #0 lag: (min: 1.0, avg: 48.0, max: 94.0) [2024-03-21 06:09:25,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 06:09:25,535][04017] Updated weights for policy 0, policy_version 38064 (0.0020) [2024-03-21 06:09:30,521][03784] Fps is (10 sec: 22937.4, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1247281152. Throughput: 0: 45006.6. Samples: 1248503700. 
Policy #0 lag: (min: 1.0, avg: 48.0, max: 94.0) [2024-03-21 06:09:30,522][03784] Avg episode reward: [(0, '0.806')] [2024-03-21 06:09:35,521][03784] Fps is (10 sec: 13107.3, 60 sec: 42598.5, 300 sec: 45542.0). Total num frames: 1247412224. Throughput: 0: 46333.3. Samples: 1248812700. Policy #0 lag: (min: 1.0, avg: 48.0, max: 94.0) [2024-03-21 06:09:35,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 06:09:39,918][04017] Updated weights for policy 0, policy_version 38074 (0.0012) [2024-03-21 06:09:40,521][03784] Fps is (10 sec: 36045.0, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1247641600. Throughput: 0: 46491.2. Samples: 1249101600. Policy #0 lag: (min: 1.0, avg: 48.0, max: 94.0) [2024-03-21 06:09:40,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 06:09:45,521][03784] Fps is (10 sec: 36044.8, 60 sec: 41506.2, 300 sec: 45097.6). Total num frames: 1247772672. Throughput: 0: 47117.8. Samples: 1249253000. Policy #0 lag: (min: 1.0, avg: 48.0, max: 94.0) [2024-03-21 06:09:45,522][03784] Avg episode reward: [(0, '0.787')] [2024-03-21 06:09:47,601][04017] Updated weights for policy 0, policy_version 38084 (0.0018) [2024-03-21 06:09:50,521][03784] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 44986.6). Total num frames: 1248034816. Throughput: 0: 47268.9. Samples: 1249521000. Policy #0 lag: (min: 1.0, avg: 48.0, max: 94.0) [2024-03-21 06:09:50,522][03784] Avg episode reward: [(0, '1.645')] [2024-03-21 06:09:52,684][04017] Updated weights for policy 0, policy_version 38094 (0.0020) [2024-03-21 06:09:55,521][03784] Fps is (10 sec: 68812.3, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 1248460800. Throughput: 0: 46424.4. Samples: 1249773900. 
Policy #0 lag: (min: 0.0, avg: 47.8, max: 115.0) [2024-03-21 06:09:55,522][03784] Avg episode reward: [(0, '1.254')] [2024-03-21 06:09:56,876][04017] Updated weights for policy 0, policy_version 38104 (0.0019) [2024-03-21 06:10:00,521][03784] Fps is (10 sec: 75366.0, 60 sec: 48059.8, 300 sec: 45097.7). Total num frames: 1248788480. Throughput: 0: 46586.7. Samples: 1249905400. Policy #0 lag: (min: 0.0, avg: 47.8, max: 115.0) [2024-03-21 06:10:00,522][03784] Avg episode reward: [(0, '0.539')] [2024-03-21 06:10:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038110_1248788480.pth... [2024-03-21 06:10:00,653][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037769_1237614592.pth [2024-03-21 06:10:01,883][03995] Signal inference workers to stop experience collection... (25150 times) [2024-03-21 06:10:01,954][03995] Signal inference workers to resume experience collection... (25150 times) [2024-03-21 06:10:01,958][04017] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-03-21 06:10:01,994][04017] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-03-21 06:10:02,302][04017] Updated weights for policy 0, policy_version 38114 (0.0021) [2024-03-21 06:10:05,521][03784] Fps is (10 sec: 55705.9, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1249017856. Throughput: 0: 46664.4. Samples: 1250176300. Policy #0 lag: (min: 0.0, avg: 47.8, max: 115.0) [2024-03-21 06:10:05,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 06:10:10,521][03784] Fps is (10 sec: 42597.8, 60 sec: 48059.6, 300 sec: 45097.6). Total num frames: 1249214464. Throughput: 0: 46357.7. Samples: 1250435900. 
Policy #0 lag: (min: 0.0, avg: 47.8, max: 115.0) [2024-03-21 06:10:10,522][03784] Avg episode reward: [(0, '0.901')] [2024-03-21 06:10:10,868][04017] Updated weights for policy 0, policy_version 38124 (0.0015) [2024-03-21 06:10:14,876][04017] Updated weights for policy 0, policy_version 38134 (0.0017) [2024-03-21 06:10:15,521][03784] Fps is (10 sec: 55705.4, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 1249574912. Throughput: 0: 45533.3. Samples: 1250552700. Policy #0 lag: (min: 5.0, avg: 46.2, max: 87.0) [2024-03-21 06:10:15,522][03784] Avg episode reward: [(0, '0.982')] [2024-03-21 06:10:20,521][03784] Fps is (10 sec: 52429.2, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1249738752. Throughput: 0: 44179.9. Samples: 1250800800. Policy #0 lag: (min: 5.0, avg: 46.2, max: 87.0) [2024-03-21 06:10:20,522][03784] Avg episode reward: [(0, '0.766')] [2024-03-21 06:10:23,428][04017] Updated weights for policy 0, policy_version 38144 (0.0012) [2024-03-21 06:10:25,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 1249968128. Throughput: 0: 42953.4. Samples: 1251034500. Policy #0 lag: (min: 5.0, avg: 46.2, max: 87.0) [2024-03-21 06:10:25,522][03784] Avg episode reward: [(0, '0.735')] [2024-03-21 06:10:30,526][03784] Fps is (10 sec: 29477.1, 60 sec: 45871.5, 300 sec: 45430.2). Total num frames: 1250033664. Throughput: 0: 42848.7. Samples: 1251181400. Policy #0 lag: (min: 5.0, avg: 46.2, max: 87.0) [2024-03-21 06:10:30,527][03784] Avg episode reward: [(0, '0.994')] [2024-03-21 06:10:34,766][04017] Updated weights for policy 0, policy_version 38154 (0.0015) [2024-03-21 06:10:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 48059.7, 300 sec: 45430.9). Total num frames: 1250295808. Throughput: 0: 43157.8. Samples: 1251463100. 
Policy #0 lag: (min: 5.0, avg: 46.2, max: 87.0) [2024-03-21 06:10:35,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 06:10:40,521][03784] Fps is (10 sec: 45897.4, 60 sec: 47513.6, 300 sec: 45097.6). Total num frames: 1250492416. Throughput: 0: 43731.2. Samples: 1251741800. Policy #0 lag: (min: 2.0, avg: 32.3, max: 74.0) [2024-03-21 06:10:40,522][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 06:10:40,924][04017] Updated weights for policy 0, policy_version 38164 (0.0016) [2024-03-21 06:10:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48059.7, 300 sec: 45542.0). Total num frames: 1250656256. Throughput: 0: 44077.8. Samples: 1251888900. Policy #0 lag: (min: 2.0, avg: 32.3, max: 74.0) [2024-03-21 06:10:45,522][03784] Avg episode reward: [(0, '1.106')] [2024-03-21 06:10:50,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 1250754560. Throughput: 0: 44228.9. Samples: 1252166600. Policy #0 lag: (min: 2.0, avg: 32.3, max: 74.0) [2024-03-21 06:10:50,522][03784] Avg episode reward: [(0, '0.851')] [2024-03-21 06:10:53,298][04017] Updated weights for policy 0, policy_version 38174 (0.0012) [2024-03-21 06:10:55,521][03784] Fps is (10 sec: 26214.3, 60 sec: 40960.0, 300 sec: 45208.7). Total num frames: 1250918400. Throughput: 0: 44851.2. Samples: 1252454200. Policy #0 lag: (min: 2.0, avg: 32.3, max: 74.0) [2024-03-21 06:10:55,522][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 06:10:58,824][04017] Updated weights for policy 0, policy_version 38184 (0.0011) [2024-03-21 06:11:00,521][03784] Fps is (10 sec: 58982.0, 60 sec: 42598.4, 300 sec: 45764.1). Total num frames: 1251344384. Throughput: 0: 45248.9. Samples: 1252588900. 
Policy #0 lag: (min: 2.0, avg: 32.3, max: 74.0) [2024-03-21 06:11:00,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 06:11:04,129][04017] Updated weights for policy 0, policy_version 38194 (0.0011) [2024-03-21 06:11:05,521][03784] Fps is (10 sec: 72089.6, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1251639296. Throughput: 0: 45471.1. Samples: 1252847000. Policy #0 lag: (min: 0.0, avg: 32.8, max: 73.0) [2024-03-21 06:11:05,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 06:11:09,671][03995] Signal inference workers to stop experience collection... (25200 times) [2024-03-21 06:11:09,679][03995] Signal inference workers to resume experience collection... (25200 times) [2024-03-21 06:11:09,916][04017] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-03-21 06:11:09,916][04017] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-03-21 06:11:10,521][03784] Fps is (10 sec: 49152.3, 60 sec: 43690.8, 300 sec: 45653.0). Total num frames: 1251835904. Throughput: 0: 46128.9. Samples: 1253110300. Policy #0 lag: (min: 0.0, avg: 32.8, max: 73.0) [2024-03-21 06:11:10,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 06:11:11,674][04017] Updated weights for policy 0, policy_version 38204 (0.0011) [2024-03-21 06:11:15,521][03784] Fps is (10 sec: 32768.4, 60 sec: 39867.8, 300 sec: 45097.7). Total num frames: 1251966976. Throughput: 0: 46076.1. Samples: 1253254600. Policy #0 lag: (min: 0.0, avg: 32.8, max: 73.0) [2024-03-21 06:11:15,522][03784] Avg episode reward: [(0, '0.770')] [2024-03-21 06:11:19,205][04017] Updated weights for policy 0, policy_version 38214 (0.0010) [2024-03-21 06:11:20,521][03784] Fps is (10 sec: 36045.0, 60 sec: 40960.1, 300 sec: 44542.3). Total num frames: 1252196352. Throughput: 0: 45800.0. Samples: 1253524100. 
Policy #0 lag: (min: 0.0, avg: 32.8, max: 73.0) [2024-03-21 06:11:20,522][03784] Avg episode reward: [(0, '1.430')] [2024-03-21 06:11:23,791][04017] Updated weights for policy 0, policy_version 38224 (0.0012) [2024-03-21 06:11:25,521][03784] Fps is (10 sec: 62259.8, 60 sec: 43690.8, 300 sec: 45319.8). Total num frames: 1252589568. Throughput: 0: 45377.9. Samples: 1253783800. Policy #0 lag: (min: 0.0, avg: 32.8, max: 73.0) [2024-03-21 06:11:25,522][03784] Avg episode reward: [(0, '1.430')] [2024-03-21 06:11:29,782][04017] Updated weights for policy 0, policy_version 38234 (0.0011) [2024-03-21 06:11:30,521][03784] Fps is (10 sec: 68811.7, 60 sec: 47517.4, 300 sec: 45542.0). Total num frames: 1252884480. Throughput: 0: 45073.3. Samples: 1253917200. Policy #0 lag: (min: 0.0, avg: 47.8, max: 84.0) [2024-03-21 06:11:30,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 06:11:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 1252950016. Throughput: 0: 45333.4. Samples: 1254206600. Policy #0 lag: (min: 0.0, avg: 47.8, max: 84.0) [2024-03-21 06:11:35,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 06:11:40,335][04017] Updated weights for policy 0, policy_version 38244 (0.0031) [2024-03-21 06:11:40,522][03784] Fps is (10 sec: 29490.6, 60 sec: 44782.7, 300 sec: 45097.6). Total num frames: 1253179392. Throughput: 0: 44819.8. Samples: 1254471100. Policy #0 lag: (min: 0.0, avg: 47.8, max: 84.0) [2024-03-21 06:11:40,523][03784] Avg episode reward: [(0, '0.596')] [2024-03-21 06:11:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1253408768. Throughput: 0: 44697.9. Samples: 1254600300. 
Policy #0 lag: (min: 0.0, avg: 47.8, max: 84.0) [2024-03-21 06:11:45,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 06:11:46,395][04017] Updated weights for policy 0, policy_version 38254 (0.0012) [2024-03-21 06:11:50,521][03784] Fps is (10 sec: 52430.7, 60 sec: 49152.0, 300 sec: 46097.4). Total num frames: 1253703680. Throughput: 0: 44702.4. Samples: 1254858600. Policy #0 lag: (min: 0.0, avg: 47.8, max: 84.0) [2024-03-21 06:11:50,522][03784] Avg episode reward: [(0, '0.737')] [2024-03-21 06:11:52,528][04017] Updated weights for policy 0, policy_version 38264 (0.0010) [2024-03-21 06:11:55,521][03784] Fps is (10 sec: 65535.3, 60 sec: 52428.8, 300 sec: 46319.5). Total num frames: 1254064128. Throughput: 0: 44888.9. Samples: 1255130300. Policy #0 lag: (min: 2.0, avg: 40.9, max: 88.0) [2024-03-21 06:11:55,522][03784] Avg episode reward: [(0, '0.901')] [2024-03-21 06:11:58,656][03995] Signal inference workers to stop experience collection... (25250 times) [2024-03-21 06:11:58,706][04017] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-03-21 06:11:58,920][03995] Signal inference workers to resume experience collection... (25250 times) [2024-03-21 06:11:58,920][04017] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-03-21 06:11:58,923][04017] Updated weights for policy 0, policy_version 38274 (0.0010) [2024-03-21 06:12:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48059.8, 300 sec: 45653.1). Total num frames: 1254227968. Throughput: 0: 44944.4. Samples: 1255277100. Policy #0 lag: (min: 2.0, avg: 40.9, max: 88.0) [2024-03-21 06:12:00,522][03784] Avg episode reward: [(0, '1.115')] [2024-03-21 06:12:00,748][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038277_1254260736.pth... 
[2024-03-21 06:12:00,869][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000037939_1243185152.pth [2024-03-21 06:12:05,521][03784] Fps is (10 sec: 32768.4, 60 sec: 45875.3, 300 sec: 45319.9). Total num frames: 1254391808. Throughput: 0: 44633.4. Samples: 1255532600. Policy #0 lag: (min: 2.0, avg: 40.9, max: 88.0) [2024-03-21 06:12:05,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 06:12:08,434][04017] Updated weights for policy 0, policy_version 38284 (0.0015) [2024-03-21 06:12:10,521][03784] Fps is (10 sec: 26214.7, 60 sec: 44236.9, 300 sec: 45097.7). Total num frames: 1254490112. Throughput: 0: 45117.7. Samples: 1255814100. Policy #0 lag: (min: 2.0, avg: 40.9, max: 88.0) [2024-03-21 06:12:10,522][03784] Avg episode reward: [(0, '1.468')] [2024-03-21 06:12:15,521][03784] Fps is (10 sec: 19660.7, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1254588416. Throughput: 0: 45484.6. Samples: 1255964000. Policy #0 lag: (min: 2.0, avg: 40.9, max: 88.0) [2024-03-21 06:12:15,522][03784] Avg episode reward: [(0, '0.720')] [2024-03-21 06:12:18,498][04017] Updated weights for policy 0, policy_version 38294 (0.0011) [2024-03-21 06:12:20,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 45653.1). Total num frames: 1254883328. Throughput: 0: 45246.6. Samples: 1256242700. Policy #0 lag: (min: 1.0, avg: 36.4, max: 74.0) [2024-03-21 06:12:20,522][03784] Avg episode reward: [(0, '1.112')] [2024-03-21 06:12:25,521][03784] Fps is (10 sec: 52429.2, 60 sec: 42052.3, 300 sec: 45764.2). Total num frames: 1255112704. Throughput: 0: 45538.2. Samples: 1256520300. Policy #0 lag: (min: 1.0, avg: 36.4, max: 74.0) [2024-03-21 06:12:25,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 06:12:26,217][04017] Updated weights for policy 0, policy_version 38304 (0.0011) [2024-03-21 06:12:30,521][03784] Fps is (10 sec: 49150.9, 60 sec: 41506.1, 300 sec: 45875.2). Total num frames: 1255374848. Throughput: 0: 45546.4. 
Samples: 1256649900. Policy #0 lag: (min: 1.0, avg: 36.4, max: 74.0) [2024-03-21 06:12:30,523][03784] Avg episode reward: [(0, '1.361')] [2024-03-21 06:12:31,477][04017] Updated weights for policy 0, policy_version 38314 (0.0011) [2024-03-21 06:12:35,521][03784] Fps is (10 sec: 49152.0, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1255604224. Throughput: 0: 46122.3. Samples: 1256934100. Policy #0 lag: (min: 1.0, avg: 36.4, max: 74.0) [2024-03-21 06:12:35,522][03784] Avg episode reward: [(0, '0.843')] [2024-03-21 06:12:37,951][04017] Updated weights for policy 0, policy_version 38324 (0.0012) [2024-03-21 06:12:40,521][03784] Fps is (10 sec: 55706.8, 60 sec: 45875.4, 300 sec: 45319.8). Total num frames: 1255931904. Throughput: 0: 45824.5. Samples: 1257192400. Policy #0 lag: (min: 1.0, avg: 36.4, max: 74.0) [2024-03-21 06:12:40,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 06:12:42,559][04017] Updated weights for policy 0, policy_version 38334 (0.0020) [2024-03-21 06:12:45,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1256194048. Throughput: 0: 45471.2. Samples: 1257323300. Policy #0 lag: (min: 0.0, avg: 50.1, max: 127.0) [2024-03-21 06:12:45,522][03784] Avg episode reward: [(0, '1.613')] [2024-03-21 06:12:48,872][04017] Updated weights for policy 0, policy_version 38344 (0.0015) [2024-03-21 06:12:50,521][03784] Fps is (10 sec: 55706.1, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 1256488960. Throughput: 0: 46088.9. Samples: 1257606600. Policy #0 lag: (min: 0.0, avg: 50.1, max: 127.0) [2024-03-21 06:12:50,522][03784] Avg episode reward: [(0, '0.557')] [2024-03-21 06:12:55,521][03784] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 45764.2). Total num frames: 1256652800. Throughput: 0: 46002.3. Samples: 1257884200. 
Policy #0 lag: (min: 0.0, avg: 50.1, max: 127.0) [2024-03-21 06:12:55,522][03784] Avg episode reward: [(0, '1.387')] [2024-03-21 06:13:00,077][04017] Updated weights for policy 0, policy_version 38354 (0.0016) [2024-03-21 06:13:00,521][03784] Fps is (10 sec: 29490.6, 60 sec: 42598.3, 300 sec: 45764.1). Total num frames: 1256783872. Throughput: 0: 46046.5. Samples: 1258036100. Policy #0 lag: (min: 0.0, avg: 50.1, max: 127.0) [2024-03-21 06:13:00,522][03784] Avg episode reward: [(0, '0.701')] [2024-03-21 06:13:01,420][03995] Signal inference workers to stop experience collection... (25300 times) [2024-03-21 06:13:01,420][03995] Signal inference workers to resume experience collection... (25300 times) [2024-03-21 06:13:01,482][04017] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-03-21 06:13:01,482][04017] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-03-21 06:13:05,129][04017] Updated weights for policy 0, policy_version 38364 (0.0013) [2024-03-21 06:13:05,521][03784] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 1257144320. Throughput: 0: 46175.6. Samples: 1258320600. Policy #0 lag: (min: 0.0, avg: 50.1, max: 127.0) [2024-03-21 06:13:05,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 06:13:10,521][03784] Fps is (10 sec: 55707.2, 60 sec: 47513.7, 300 sec: 45875.2). Total num frames: 1257340928. Throughput: 0: 46084.5. Samples: 1258594100. Policy #0 lag: (min: 0.0, avg: 51.8, max: 119.0) [2024-03-21 06:13:10,522][03784] Avg episode reward: [(0, '0.782')] [2024-03-21 06:13:12,618][04017] Updated weights for policy 0, policy_version 38374 (0.0016) [2024-03-21 06:13:15,521][03784] Fps is (10 sec: 42598.0, 60 sec: 49698.1, 300 sec: 45875.2). Total num frames: 1257570304. Throughput: 0: 46058.0. Samples: 1258722500. 
Policy #0 lag: (min: 0.0, avg: 51.8, max: 119.0) [2024-03-21 06:13:15,522][03784] Avg episode reward: [(0, '0.819')] [2024-03-21 06:13:20,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 1257635840. Throughput: 0: 45993.3. Samples: 1259003800. Policy #0 lag: (min: 0.0, avg: 51.8, max: 119.0) [2024-03-21 06:13:20,522][03784] Avg episode reward: [(0, '0.989')] [2024-03-21 06:13:22,812][04017] Updated weights for policy 0, policy_version 38384 (0.0016) [2024-03-21 06:13:25,521][03784] Fps is (10 sec: 32768.3, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1257897984. Throughput: 0: 45962.3. Samples: 1259260700. Policy #0 lag: (min: 0.0, avg: 51.8, max: 119.0) [2024-03-21 06:13:25,522][03784] Avg episode reward: [(0, '0.591')] [2024-03-21 06:13:29,274][04017] Updated weights for policy 0, policy_version 38394 (0.0016) [2024-03-21 06:13:30,521][03784] Fps is (10 sec: 55705.1, 60 sec: 46967.6, 300 sec: 45208.7). Total num frames: 1258192896. Throughput: 0: 46093.3. Samples: 1259397500. Policy #0 lag: (min: 0.0, avg: 51.8, max: 119.0) [2024-03-21 06:13:30,522][03784] Avg episode reward: [(0, '1.436')] [2024-03-21 06:13:34,895][04017] Updated weights for policy 0, policy_version 38404 (0.0038) [2024-03-21 06:13:35,521][03784] Fps is (10 sec: 55705.4, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1258455040. Throughput: 0: 45655.5. Samples: 1259661100. Policy #0 lag: (min: 1.0, avg: 34.4, max: 67.0) [2024-03-21 06:13:35,522][03784] Avg episode reward: [(0, '1.586')] [2024-03-21 06:13:40,521][03784] Fps is (10 sec: 49152.4, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 1258684416. Throughput: 0: 45104.4. Samples: 1259913900. 
Policy #0 lag: (min: 1.0, avg: 34.4, max: 67.0) [2024-03-21 06:13:40,522][03784] Avg episode reward: [(0, '1.375')] [2024-03-21 06:13:41,212][04017] Updated weights for policy 0, policy_version 38414 (0.0020) [2024-03-21 06:13:45,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1258881024. Throughput: 0: 44749.1. Samples: 1260049800. Policy #0 lag: (min: 1.0, avg: 34.4, max: 67.0) [2024-03-21 06:13:45,522][03784] Avg episode reward: [(0, '1.198')] [2024-03-21 06:13:50,521][03784] Fps is (10 sec: 26214.4, 60 sec: 40960.0, 300 sec: 45097.7). Total num frames: 1258946560. Throughput: 0: 44346.6. Samples: 1260316200. Policy #0 lag: (min: 1.0, avg: 34.4, max: 67.0) [2024-03-21 06:13:50,522][03784] Avg episode reward: [(0, '1.537')] [2024-03-21 06:13:51,720][04017] Updated weights for policy 0, policy_version 38424 (0.0019) [2024-03-21 06:13:53,094][03995] Signal inference workers to stop experience collection... (25350 times) [2024-03-21 06:13:53,171][03995] Signal inference workers to resume experience collection... (25350 times) [2024-03-21 06:13:53,186][04017] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-03-21 06:13:53,245][04017] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-03-21 06:13:55,161][04017] Updated weights for policy 0, policy_version 38434 (0.0017) [2024-03-21 06:13:55,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 1259438080. Throughput: 0: 43728.9. Samples: 1260561900. Policy #0 lag: (min: 1.0, avg: 34.4, max: 67.0) [2024-03-21 06:13:55,522][03784] Avg episode reward: [(0, '1.260')] [2024-03-21 06:13:59,308][04017] Updated weights for policy 0, policy_version 38444 (0.0020) [2024-03-21 06:14:00,521][03784] Fps is (10 sec: 85196.9, 60 sec: 50244.4, 300 sec: 46208.4). Total num frames: 1259798528. Throughput: 0: 43844.5. Samples: 1260695500. 
Policy #0 lag: (min: 0.0, avg: 42.8, max: 81.0) [2024-03-21 06:14:00,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 06:14:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038446_1259798528.pth... [2024-03-21 06:14:00,656][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038110_1248788480.pth [2024-03-21 06:14:05,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46421.4, 300 sec: 46097.4). Total num frames: 1259929600. Throughput: 0: 44000.0. Samples: 1260983800. Policy #0 lag: (min: 0.0, avg: 42.8, max: 81.0) [2024-03-21 06:14:05,522][03784] Avg episode reward: [(0, '0.816')] [2024-03-21 06:14:09,328][04017] Updated weights for policy 0, policy_version 38454 (0.0012) [2024-03-21 06:14:10,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45329.1, 300 sec: 45208.8). Total num frames: 1260060672. Throughput: 0: 44788.9. Samples: 1261276200. Policy #0 lag: (min: 0.0, avg: 42.8, max: 81.0) [2024-03-21 06:14:10,522][03784] Avg episode reward: [(0, '1.443')] [2024-03-21 06:14:15,521][03784] Fps is (10 sec: 19660.6, 60 sec: 42598.4, 300 sec: 44320.1). Total num frames: 1260126208. Throughput: 0: 45140.1. Samples: 1261428800. Policy #0 lag: (min: 0.0, avg: 42.8, max: 81.0) [2024-03-21 06:14:15,522][03784] Avg episode reward: [(0, '0.831')] [2024-03-21 06:14:20,521][03784] Fps is (10 sec: 26214.1, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1260322816. Throughput: 0: 45797.7. Samples: 1261722000. Policy #0 lag: (min: 0.0, avg: 42.8, max: 81.0) [2024-03-21 06:14:20,531][03784] Avg episode reward: [(0, '1.495')] [2024-03-21 06:14:21,151][04017] Updated weights for policy 0, policy_version 38464 (0.0020) [2024-03-21 06:14:25,521][03784] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1260519424. Throughput: 0: 46509.0. Samples: 1262006800. 
Policy #0 lag: (min: 0.0, avg: 42.6, max: 119.0) [2024-03-21 06:14:25,522][03784] Avg episode reward: [(0, '1.351')] [2024-03-21 06:14:30,014][04017] Updated weights for policy 0, policy_version 38474 (0.0011) [2024-03-21 06:14:30,521][03784] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 45208.7). Total num frames: 1260748800. Throughput: 0: 46573.4. Samples: 1262145600. Policy #0 lag: (min: 0.0, avg: 42.6, max: 119.0) [2024-03-21 06:14:30,522][03784] Avg episode reward: [(0, '1.525')] [2024-03-21 06:14:35,521][03784] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 45097.7). Total num frames: 1260945408. Throughput: 0: 46773.3. Samples: 1262421000. Policy #0 lag: (min: 0.0, avg: 42.6, max: 119.0) [2024-03-21 06:14:35,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 06:14:36,672][04017] Updated weights for policy 0, policy_version 38484 (0.0011) [2024-03-21 06:14:40,521][03784] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 45653.1). Total num frames: 1261240320. Throughput: 0: 47666.6. Samples: 1262706900. Policy #0 lag: (min: 0.0, avg: 42.6, max: 119.0) [2024-03-21 06:14:40,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 06:14:42,675][04017] Updated weights for policy 0, policy_version 38494 (0.0015) [2024-03-21 06:14:45,521][03784] Fps is (10 sec: 62259.5, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1261568000. Throughput: 0: 47420.0. Samples: 1262829400. Policy #0 lag: (min: 0.0, avg: 42.6, max: 119.0) [2024-03-21 06:14:45,522][03784] Avg episode reward: [(0, '1.623')] [2024-03-21 06:14:47,068][04017] Updated weights for policy 0, policy_version 38504 (0.0011) [2024-03-21 06:14:48,839][03995] Signal inference workers to stop experience collection... (25400 times) [2024-03-21 06:14:48,936][04017] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-03-21 06:14:48,962][03995] Signal inference workers to resume experience collection... 
(25400 times) [2024-03-21 06:14:48,995][04017] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-03-21 06:14:50,521][03784] Fps is (10 sec: 75366.6, 60 sec: 50790.5, 300 sec: 45875.2). Total num frames: 1261993984. Throughput: 0: 46995.5. Samples: 1263098600. Policy #0 lag: (min: 2.0, avg: 48.7, max: 94.0) [2024-03-21 06:14:50,521][03784] Avg episode reward: [(0, '1.623')] [2024-03-21 06:14:50,550][04017] Updated weights for policy 0, policy_version 38514 (0.0012) [2024-03-21 06:14:55,104][04017] Updated weights for policy 0, policy_version 38524 (0.0022) [2024-03-21 06:14:55,521][03784] Fps is (10 sec: 81920.4, 60 sec: 49152.0, 300 sec: 46097.4). Total num frames: 1262387200. Throughput: 0: 45842.2. Samples: 1263339100. Policy #0 lag: (min: 2.0, avg: 48.7, max: 94.0) [2024-03-21 06:14:55,522][03784] Avg episode reward: [(0, '1.596')] [2024-03-21 06:15:00,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 1262583808. Throughput: 0: 45413.4. Samples: 1263472400. Policy #0 lag: (min: 2.0, avg: 48.7, max: 94.0) [2024-03-21 06:15:00,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 06:15:05,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1262616576. Throughput: 0: 44995.6. Samples: 1263746800. Policy #0 lag: (min: 2.0, avg: 48.7, max: 94.0) [2024-03-21 06:15:05,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 06:15:10,186][04017] Updated weights for policy 0, policy_version 38534 (0.0011) [2024-03-21 06:15:10,521][03784] Fps is (10 sec: 13107.2, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1262714880. Throughput: 0: 44515.5. Samples: 1264010000. Policy #0 lag: (min: 2.0, avg: 48.7, max: 94.0) [2024-03-21 06:15:10,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 06:15:15,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46967.5, 300 sec: 44764.5). Total num frames: 1262944256. Throughput: 0: 44544.4. Samples: 1264150100. 
Policy #0 lag: (min: 1.0, avg: 31.6, max: 76.0) [2024-03-21 06:15:15,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 06:15:16,494][04017] Updated weights for policy 0, policy_version 38544 (0.0020) [2024-03-21 06:15:20,521][03784] Fps is (10 sec: 39321.2, 60 sec: 46421.3, 300 sec: 44542.3). Total num frames: 1263108096. Throughput: 0: 44773.3. Samples: 1264435800. Policy #0 lag: (min: 1.0, avg: 31.6, max: 76.0) [2024-03-21 06:15:20,522][03784] Avg episode reward: [(0, '0.498')] [2024-03-21 06:15:23,432][04017] Updated weights for policy 0, policy_version 38554 (0.0015) [2024-03-21 06:15:25,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.7, 300 sec: 45320.6). Total num frames: 1263403008. Throughput: 0: 44408.9. Samples: 1264705300. Policy #0 lag: (min: 1.0, avg: 31.6, max: 76.0) [2024-03-21 06:15:25,522][03784] Avg episode reward: [(0, '1.532')] [2024-03-21 06:15:29,958][04017] Updated weights for policy 0, policy_version 38564 (0.0011) [2024-03-21 06:15:30,521][03784] Fps is (10 sec: 62260.2, 60 sec: 49698.2, 300 sec: 45542.0). Total num frames: 1263730688. Throughput: 0: 44955.7. Samples: 1264852400. Policy #0 lag: (min: 1.0, avg: 31.6, max: 76.0) [2024-03-21 06:15:30,521][03784] Avg episode reward: [(0, '0.901')] [2024-03-21 06:15:35,484][04017] Updated weights for policy 0, policy_version 38574 (0.0019) [2024-03-21 06:15:35,521][03784] Fps is (10 sec: 58982.0, 60 sec: 50790.4, 300 sec: 45764.1). Total num frames: 1263992832. Throughput: 0: 45028.8. Samples: 1265124900. Policy #0 lag: (min: 1.0, avg: 31.6, max: 76.0) [2024-03-21 06:15:35,522][03784] Avg episode reward: [(0, '0.901')] [2024-03-21 06:15:40,463][03995] Signal inference workers to stop experience collection... (25450 times) [2024-03-21 06:15:40,464][03995] Signal inference workers to resume experience collection... (25450 times) [2024-03-21 06:15:40,521][03784] Fps is (10 sec: 49151.4, 60 sec: 49698.1, 300 sec: 45986.3). Total num frames: 1264222208. Throughput: 0: 45711.0. 
Samples: 1265396100. Policy #0 lag: (min: 0.0, avg: 34.5, max: 77.0) [2024-03-21 06:15:40,523][03784] Avg episode reward: [(0, '1.101')] [2024-03-21 06:15:40,523][04017] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-03-21 06:15:40,524][04017] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-03-21 06:15:43,616][04017] Updated weights for policy 0, policy_version 38584 (0.0016) [2024-03-21 06:15:45,521][03784] Fps is (10 sec: 42598.8, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 1264418816. Throughput: 0: 45966.7. Samples: 1265540900. Policy #0 lag: (min: 0.0, avg: 34.5, max: 77.0) [2024-03-21 06:15:45,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 06:15:50,521][03784] Fps is (10 sec: 32768.0, 60 sec: 42598.3, 300 sec: 46208.5). Total num frames: 1264549888. Throughput: 0: 46231.1. Samples: 1265827200. Policy #0 lag: (min: 0.0, avg: 34.5, max: 77.0) [2024-03-21 06:15:50,522][03784] Avg episode reward: [(0, '1.357')] [2024-03-21 06:15:55,521][03784] Fps is (10 sec: 16384.0, 60 sec: 36590.9, 300 sec: 44875.5). Total num frames: 1264582656. Throughput: 0: 47204.4. Samples: 1266134200. Policy #0 lag: (min: 0.0, avg: 34.5, max: 77.0) [2024-03-21 06:15:55,522][03784] Avg episode reward: [(0, '1.467')] [2024-03-21 06:15:56,253][04017] Updated weights for policy 0, policy_version 38594 (0.0015) [2024-03-21 06:16:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 39321.5, 300 sec: 45097.7). Total num frames: 1264943104. Throughput: 0: 47004.3. Samples: 1266265300. Policy #0 lag: (min: 0.0, avg: 34.5, max: 77.0) [2024-03-21 06:16:00,522][03784] Avg episode reward: [(0, '1.200')] [2024-03-21 06:16:00,804][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038604_1264975872.pth... 
[2024-03-21 06:16:00,840][04017] Updated weights for policy 0, policy_version 38604 (0.0011) [2024-03-21 06:16:00,932][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038277_1254260736.pth [2024-03-21 06:16:05,521][03784] Fps is (10 sec: 58982.2, 60 sec: 42598.4, 300 sec: 45208.7). Total num frames: 1265172480. Throughput: 0: 46786.7. Samples: 1266541200. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 06:16:05,522][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 06:16:07,804][04017] Updated weights for policy 0, policy_version 38614 (0.0015) [2024-03-21 06:16:10,521][03784] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1265467392. Throughput: 0: 46533.4. Samples: 1266799300. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 06:16:10,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 06:16:12,325][04017] Updated weights for policy 0, policy_version 38625 (0.0019) [2024-03-21 06:16:15,521][03784] Fps is (10 sec: 62259.3, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 1265795072. Throughput: 0: 46035.5. Samples: 1266924000. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 06:16:15,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 06:16:19,422][04017] Updated weights for policy 0, policy_version 38635 (0.0015) [2024-03-21 06:16:20,521][03784] Fps is (10 sec: 62258.8, 60 sec: 49698.2, 300 sec: 45764.1). Total num frames: 1266089984. Throughput: 0: 46100.0. Samples: 1267199400. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 06:16:20,523][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 06:16:22,818][04017] Updated weights for policy 0, policy_version 38645 (0.0025) [2024-03-21 06:16:25,521][03784] Fps is (10 sec: 62258.5, 60 sec: 50244.2, 300 sec: 45875.2). Total num frames: 1266417664. Throughput: 0: 45573.3. Samples: 1267446900. 
Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 06:16:25,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 06:16:30,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.2, 300 sec: 45986.3). Total num frames: 1266515968. Throughput: 0: 45364.4. Samples: 1267582300. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 06:16:30,522][03784] Avg episode reward: [(0, '1.749')] [2024-03-21 06:16:32,940][04017] Updated weights for policy 0, policy_version 38655 (0.0015) [2024-03-21 06:16:32,976][03995] Signal inference workers to stop experience collection... (25500 times) [2024-03-21 06:16:32,976][03995] Signal inference workers to resume experience collection... (25500 times) [2024-03-21 06:16:33,165][04017] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-03-21 06:16:33,165][04017] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-03-21 06:16:35,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44783.0, 300 sec: 45764.2). Total num frames: 1266679808. Throughput: 0: 45126.7. Samples: 1267857900. Policy #0 lag: (min: 0.0, avg: 40.3, max: 76.0) [2024-03-21 06:16:35,522][03784] Avg episode reward: [(0, '0.987')] [2024-03-21 06:16:40,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 1266876416. Throughput: 0: 44073.2. Samples: 1268117500. Policy #0 lag: (min: 0.0, avg: 40.3, max: 76.0) [2024-03-21 06:16:40,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 06:16:44,465][04017] Updated weights for policy 0, policy_version 38665 (0.0011) [2024-03-21 06:16:45,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 1267040256. Throughput: 0: 44362.3. Samples: 1268261600. Policy #0 lag: (min: 0.0, avg: 40.3, max: 76.0) [2024-03-21 06:16:45,522][03784] Avg episode reward: [(0, '1.490')] [2024-03-21 06:16:50,521][03784] Fps is (10 sec: 36045.1, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 1267236864. 
Throughput: 0: 44420.0. Samples: 1268540100. Policy #0 lag: (min: 0.0, avg: 40.3, max: 76.0) [2024-03-21 06:16:50,522][03784] Avg episode reward: [(0, '1.490')] [2024-03-21 06:16:50,922][04017] Updated weights for policy 0, policy_version 38675 (0.0020) [2024-03-21 06:16:55,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46421.3, 300 sec: 44542.3). Total num frames: 1267367936. Throughput: 0: 44577.7. Samples: 1268805300. Policy #0 lag: (min: 0.0, avg: 40.3, max: 76.0) [2024-03-21 06:16:55,522][03784] Avg episode reward: [(0, '0.974')] [2024-03-21 06:16:58,721][04017] Updated weights for policy 0, policy_version 38685 (0.0011) [2024-03-21 06:17:00,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 45097.7). Total num frames: 1267695616. Throughput: 0: 44557.8. Samples: 1268929100. Policy #0 lag: (min: 1.0, avg: 25.7, max: 69.0) [2024-03-21 06:17:00,522][03784] Avg episode reward: [(0, '1.477')] [2024-03-21 06:17:04,698][04017] Updated weights for policy 0, policy_version 38695 (0.0012) [2024-03-21 06:17:05,521][03784] Fps is (10 sec: 62259.6, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 1267990528. Throughput: 0: 44002.2. Samples: 1269179500. Policy #0 lag: (min: 1.0, avg: 25.7, max: 69.0) [2024-03-21 06:17:05,522][03784] Avg episode reward: [(0, '1.444')] [2024-03-21 06:17:10,521][03784] Fps is (10 sec: 52428.4, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 1268219904. Throughput: 0: 44397.8. Samples: 1269444800. Policy #0 lag: (min: 1.0, avg: 25.7, max: 69.0) [2024-03-21 06:17:10,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 06:17:11,251][04017] Updated weights for policy 0, policy_version 38705 (0.0016) [2024-03-21 06:17:15,521][03784] Fps is (10 sec: 52429.1, 60 sec: 45329.1, 300 sec: 46208.5). Total num frames: 1268514816. Throughput: 0: 44666.7. Samples: 1269592300. 
Policy #0 lag: (min: 1.0, avg: 25.7, max: 69.0) [2024-03-21 06:17:15,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 06:17:16,935][04017] Updated weights for policy 0, policy_version 38715 (0.0024) [2024-03-21 06:17:20,521][03784] Fps is (10 sec: 58982.7, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 1268809728. Throughput: 0: 44593.4. Samples: 1269864600. Policy #0 lag: (min: 1.0, avg: 25.7, max: 69.0) [2024-03-21 06:17:20,522][03784] Avg episode reward: [(0, '0.502')] [2024-03-21 06:17:23,547][04017] Updated weights for policy 0, policy_version 38725 (0.0015) [2024-03-21 06:17:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 46097.4). Total num frames: 1268973568. Throughput: 0: 44946.7. Samples: 1270140100. Policy #0 lag: (min: 0.0, avg: 43.0, max: 76.0) [2024-03-21 06:17:25,522][03784] Avg episode reward: [(0, '0.721')] [2024-03-21 06:17:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42598.4, 300 sec: 45653.0). Total num frames: 1269071872. Throughput: 0: 44977.8. Samples: 1270285600. Policy #0 lag: (min: 0.0, avg: 43.0, max: 76.0) [2024-03-21 06:17:30,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 06:17:35,521][03784] Fps is (10 sec: 19660.7, 60 sec: 41506.1, 300 sec: 44875.5). Total num frames: 1269170176. Throughput: 0: 44795.5. Samples: 1270555900. Policy #0 lag: (min: 0.0, avg: 43.0, max: 76.0) [2024-03-21 06:17:35,522][03784] Avg episode reward: [(0, '0.983')] [2024-03-21 06:17:37,513][04017] Updated weights for policy 0, policy_version 38735 (0.0012) [2024-03-21 06:17:39,909][03995] Signal inference workers to stop experience collection... (25550 times) [2024-03-21 06:17:39,910][03995] Signal inference workers to resume experience collection... 
(25550 times) [2024-03-21 06:17:39,982][04017] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-03-21 06:17:39,982][04017] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-03-21 06:17:40,521][03784] Fps is (10 sec: 32767.9, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 1269399552. Throughput: 0: 45491.2. Samples: 1270852400. Policy #0 lag: (min: 0.0, avg: 43.0, max: 76.0) [2024-03-21 06:17:40,522][03784] Avg episode reward: [(0, '0.983')] [2024-03-21 06:17:43,748][04017] Updated weights for policy 0, policy_version 38745 (0.0013) [2024-03-21 06:17:45,521][03784] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 44542.3). Total num frames: 1269628928. Throughput: 0: 45700.0. Samples: 1270985600. Policy #0 lag: (min: 0.0, avg: 43.0, max: 76.0) [2024-03-21 06:17:45,522][03784] Avg episode reward: [(0, '1.474')] [2024-03-21 06:17:48,578][04017] Updated weights for policy 0, policy_version 38755 (0.0014) [2024-03-21 06:17:50,521][03784] Fps is (10 sec: 65535.3, 60 sec: 46967.4, 300 sec: 45430.9). Total num frames: 1270054912. Throughput: 0: 45895.4. Samples: 1271244800. Policy #0 lag: (min: 4.0, avg: 32.1, max: 116.0) [2024-03-21 06:17:50,522][03784] Avg episode reward: [(0, '1.334')] [2024-03-21 06:17:55,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1270153216. Throughput: 0: 46464.5. Samples: 1271535700. Policy #0 lag: (min: 4.0, avg: 32.1, max: 116.0) [2024-03-21 06:17:55,522][03784] Avg episode reward: [(0, '1.334')] [2024-03-21 06:17:56,563][04017] Updated weights for policy 0, policy_version 38765 (0.0014) [2024-03-21 06:18:00,521][03784] Fps is (10 sec: 39322.2, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1270448128. Throughput: 0: 45840.0. Samples: 1271655100. 
Policy #0 lag: (min: 4.0, avg: 32.1, max: 116.0) [2024-03-21 06:18:00,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 06:18:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038771_1270448128.pth... [2024-03-21 06:18:00,660][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038446_1259798528.pth [2024-03-21 06:18:03,753][04017] Updated weights for policy 0, policy_version 38775 (0.0016) [2024-03-21 06:18:05,521][03784] Fps is (10 sec: 52428.7, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1270677504. Throughput: 0: 46106.6. Samples: 1271939400. Policy #0 lag: (min: 4.0, avg: 32.1, max: 116.0) [2024-03-21 06:18:05,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 06:18:08,370][04017] Updated weights for policy 0, policy_version 38785 (0.0024) [2024-03-21 06:18:10,521][03784] Fps is (10 sec: 52428.5, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1270972416. Throughput: 0: 46206.6. Samples: 1272219400. Policy #0 lag: (min: 4.0, avg: 32.1, max: 116.0) [2024-03-21 06:18:10,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 06:18:15,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1271169024. Throughput: 0: 46002.2. Samples: 1272355700. Policy #0 lag: (min: 1.0, avg: 49.2, max: 116.0) [2024-03-21 06:18:15,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 06:18:16,132][04017] Updated weights for policy 0, policy_version 38795 (0.0011) [2024-03-21 06:18:20,521][03784] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 45764.1). Total num frames: 1271398400. Throughput: 0: 45711.1. Samples: 1272612900. Policy #0 lag: (min: 1.0, avg: 49.2, max: 116.0) [2024-03-21 06:18:20,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 06:18:25,359][04017] Updated weights for policy 0, policy_version 38805 (0.0011) [2024-03-21 06:18:25,521][03784] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 45319.8). 
Total num frames: 1271562240. Throughput: 0: 45171.1. Samples: 1272885100. Policy #0 lag: (min: 1.0, avg: 49.2, max: 116.0) [2024-03-21 06:18:25,522][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 06:18:30,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1271824384. Throughput: 0: 44926.7. Samples: 1273007300. Policy #0 lag: (min: 1.0, avg: 49.2, max: 116.0) [2024-03-21 06:18:30,522][03784] Avg episode reward: [(0, '0.432')] [2024-03-21 06:18:32,016][04017] Updated weights for policy 0, policy_version 38815 (0.0012) [2024-03-21 06:18:33,313][03995] Signal inference workers to stop experience collection... (25600 times) [2024-03-21 06:18:33,383][03995] Signal inference workers to resume experience collection... (25600 times) [2024-03-21 06:18:33,387][04017] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-03-21 06:18:33,429][04017] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-03-21 06:18:35,473][04017] Updated weights for policy 0, policy_version 38825 (0.0020) [2024-03-21 06:18:35,521][03784] Fps is (10 sec: 65535.8, 60 sec: 50790.4, 300 sec: 45875.2). Total num frames: 1272217600. Throughput: 0: 44655.6. Samples: 1273254300. Policy #0 lag: (min: 1.0, avg: 49.2, max: 116.0) [2024-03-21 06:18:35,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 06:18:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 45653.0). Total num frames: 1272348672. Throughput: 0: 44264.4. Samples: 1273527600. Policy #0 lag: (min: 1.0, avg: 43.1, max: 85.0) [2024-03-21 06:18:40,522][03784] Avg episode reward: [(0, '1.191')] [2024-03-21 06:18:45,116][04017] Updated weights for policy 0, policy_version 38835 (0.0010) [2024-03-21 06:18:45,521][03784] Fps is (10 sec: 32768.4, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 1272545280. Throughput: 0: 44680.0. Samples: 1273665700. 
Policy #0 lag: (min: 1.0, avg: 43.1, max: 85.0) [2024-03-21 06:18:45,522][03784] Avg episode reward: [(0, '1.212')] [2024-03-21 06:18:50,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1272807424. Throughput: 0: 44813.3. Samples: 1273956000. Policy #0 lag: (min: 1.0, avg: 43.1, max: 85.0) [2024-03-21 06:18:50,522][03784] Avg episode reward: [(0, '1.557')] [2024-03-21 06:18:50,950][04017] Updated weights for policy 0, policy_version 38845 (0.0022) [2024-03-21 06:18:55,521][03784] Fps is (10 sec: 52428.6, 60 sec: 48605.9, 300 sec: 44986.6). Total num frames: 1273069568. Throughput: 0: 44666.7. Samples: 1274229400. Policy #0 lag: (min: 1.0, avg: 43.1, max: 85.0) [2024-03-21 06:18:55,522][03784] Avg episode reward: [(0, '1.341')] [2024-03-21 06:18:59,425][04017] Updated weights for policy 0, policy_version 38855 (0.0015) [2024-03-21 06:19:00,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 1273266176. Throughput: 0: 44833.3. Samples: 1274373200. Policy #0 lag: (min: 1.0, avg: 43.1, max: 85.0) [2024-03-21 06:19:00,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 06:19:05,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1273430016. Throughput: 0: 45651.1. Samples: 1274667200. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:19:05,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 06:19:07,495][04017] Updated weights for policy 0, policy_version 38865 (0.0011) [2024-03-21 06:19:10,521][03784] Fps is (10 sec: 32767.9, 60 sec: 43690.6, 300 sec: 45653.0). Total num frames: 1273593856. Throughput: 0: 46071.1. Samples: 1274958300. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:19:10,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 06:19:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 1273724928. Throughput: 0: 46495.5. Samples: 1275099600. 
Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:19:15,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 06:19:18,889][04017] Updated weights for policy 0, policy_version 38875 (0.0014) [2024-03-21 06:19:20,521][03784] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 45653.0). Total num frames: 1273987072. Throughput: 0: 47493.4. Samples: 1275391500. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:19:20,523][03784] Avg episode reward: [(0, '1.071')] [2024-03-21 06:19:23,900][04017] Updated weights for policy 0, policy_version 38885 (0.0011) [2024-03-21 06:19:25,527][03784] Fps is (10 sec: 58948.6, 60 sec: 45870.8, 300 sec: 45985.4). Total num frames: 1274314752. Throughput: 0: 47260.6. Samples: 1275654600. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:19:25,528][03784] Avg episode reward: [(0, '1.710')] [2024-03-21 06:19:28,247][04017] Updated weights for policy 0, policy_version 38895 (0.0013) [2024-03-21 06:19:30,521][03784] Fps is (10 sec: 52428.7, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 1274511360. Throughput: 0: 47044.4. Samples: 1275782700. Policy #0 lag: (min: 0.0, avg: 47.1, max: 116.0) [2024-03-21 06:19:30,522][03784] Avg episode reward: [(0, '1.577')] [2024-03-21 06:19:34,541][03995] Signal inference workers to stop experience collection... (25650 times) [2024-03-21 06:19:34,606][03995] Signal inference workers to resume experience collection... (25650 times) [2024-03-21 06:19:34,613][04017] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-03-21 06:19:34,680][04017] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-03-21 06:19:35,521][03784] Fps is (10 sec: 39344.3, 60 sec: 41506.2, 300 sec: 45653.0). Total num frames: 1274707968. Throughput: 0: 46975.6. Samples: 1276069900. 
Policy #0 lag: (min: 0.0, avg: 47.1, max: 116.0) [2024-03-21 06:19:35,522][03784] Avg episode reward: [(0, '1.609')] [2024-03-21 06:19:36,742][04017] Updated weights for policy 0, policy_version 38905 (0.0022) [2024-03-21 06:19:40,527][03784] Fps is (10 sec: 62224.4, 60 sec: 46417.0, 300 sec: 45985.4). Total num frames: 1275133952. Throughput: 0: 46265.3. Samples: 1276311600. Policy #0 lag: (min: 0.0, avg: 47.1, max: 116.0) [2024-03-21 06:19:40,527][03784] Avg episode reward: [(0, '1.064')] [2024-03-21 06:19:40,792][04017] Updated weights for policy 0, policy_version 38915 (0.0012) [2024-03-21 06:19:45,521][03784] Fps is (10 sec: 68812.8, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 1275396096. Throughput: 0: 46060.1. Samples: 1276445900. Policy #0 lag: (min: 0.0, avg: 47.1, max: 116.0) [2024-03-21 06:19:45,522][03784] Avg episode reward: [(0, '1.640')] [2024-03-21 06:19:47,137][04017] Updated weights for policy 0, policy_version 38925 (0.0016) [2024-03-21 06:19:50,521][03784] Fps is (10 sec: 45901.0, 60 sec: 46421.4, 300 sec: 44764.4). Total num frames: 1275592704. Throughput: 0: 45388.9. Samples: 1276709700. Policy #0 lag: (min: 0.0, avg: 47.1, max: 116.0) [2024-03-21 06:19:50,522][03784] Avg episode reward: [(0, '1.448')] [2024-03-21 06:19:55,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1275723776. Throughput: 0: 44760.1. Samples: 1276972500. Policy #0 lag: (min: 0.0, avg: 39.9, max: 89.0) [2024-03-21 06:19:55,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 06:19:56,857][04017] Updated weights for policy 0, policy_version 38935 (0.0015) [2024-03-21 06:20:00,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1276018688. Throughput: 0: 44504.5. Samples: 1277102300. 
Policy #0 lag: (min: 0.0, avg: 39.9, max: 89.0) [2024-03-21 06:20:00,522][03784] Avg episode reward: [(0, '1.634')] [2024-03-21 06:20:00,607][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038942_1276051456.pth... [2024-03-21 06:20:00,735][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038604_1264975872.pth [2024-03-21 06:20:03,997][04017] Updated weights for policy 0, policy_version 38945 (0.0025) [2024-03-21 06:20:05,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 1276248064. Throughput: 0: 44124.5. Samples: 1277377100. Policy #0 lag: (min: 0.0, avg: 39.9, max: 89.0) [2024-03-21 06:20:05,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 06:20:10,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1276411904. Throughput: 0: 44370.1. Samples: 1277651000. Policy #0 lag: (min: 0.0, avg: 39.9, max: 89.0) [2024-03-21 06:20:10,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 06:20:10,962][04017] Updated weights for policy 0, policy_version 38955 (0.0021) [2024-03-21 06:20:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 46097.4). Total num frames: 1276706816. Throughput: 0: 44228.9. Samples: 1277773000. Policy #0 lag: (min: 0.0, avg: 39.9, max: 89.0) [2024-03-21 06:20:15,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-21 06:20:18,789][04017] Updated weights for policy 0, policy_version 38965 (0.0011) [2024-03-21 06:20:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1276805120. Throughput: 0: 43828.9. Samples: 1278042200. Policy #0 lag: (min: 0.0, avg: 39.6, max: 127.0) [2024-03-21 06:20:20,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 06:20:25,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45333.4, 300 sec: 45097.6). Total num frames: 1277034496. Throughput: 0: 44354.4. Samples: 1278307300. 
Policy #0 lag: (min: 0.0, avg: 39.6, max: 127.0) [2024-03-21 06:20:25,522][03784] Avg episode reward: [(0, '1.527')] [2024-03-21 06:20:25,539][03995] Signal inference workers to stop experience collection... (25700 times) [2024-03-21 06:20:25,657][04017] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-03-21 06:20:25,727][03995] Signal inference workers to resume experience collection... (25700 times) [2024-03-21 06:20:25,727][04017] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-03-21 06:20:26,388][04017] Updated weights for policy 0, policy_version 38975 (0.0016) [2024-03-21 06:20:30,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1277296640. Throughput: 0: 44466.6. Samples: 1278446900. Policy #0 lag: (min: 0.0, avg: 39.6, max: 127.0) [2024-03-21 06:20:30,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 06:20:34,907][04017] Updated weights for policy 0, policy_version 38985 (0.0018) [2024-03-21 06:20:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1277460480. Throughput: 0: 44633.4. Samples: 1278718200. Policy #0 lag: (min: 0.0, avg: 39.6, max: 127.0) [2024-03-21 06:20:35,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 06:20:40,521][03784] Fps is (10 sec: 39322.0, 60 sec: 42602.4, 300 sec: 44986.6). Total num frames: 1277689856. Throughput: 0: 45226.7. Samples: 1279007700. Policy #0 lag: (min: 0.0, avg: 39.6, max: 127.0) [2024-03-21 06:20:40,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 06:20:44,462][04017] Updated weights for policy 0, policy_version 38995 (0.0011) [2024-03-21 06:20:45,521][03784] Fps is (10 sec: 42597.7, 60 sec: 41506.0, 300 sec: 45208.7). Total num frames: 1277886464. Throughput: 0: 45848.8. Samples: 1279165500. 
Policy #0 lag: (min: 0.0, avg: 36.2, max: 95.0) [2024-03-21 06:20:45,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 06:20:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 45764.1). Total num frames: 1278083072. Throughput: 0: 46055.6. Samples: 1279449600. Policy #0 lag: (min: 0.0, avg: 36.2, max: 95.0) [2024-03-21 06:20:50,522][03784] Avg episode reward: [(0, '0.747')] [2024-03-21 06:20:50,951][04017] Updated weights for policy 0, policy_version 39005 (0.0011) [2024-03-21 06:20:55,521][03784] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 45208.8). Total num frames: 1278279680. Throughput: 0: 45951.2. Samples: 1279718800. Policy #0 lag: (min: 0.0, avg: 36.2, max: 95.0) [2024-03-21 06:20:55,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 06:20:58,162][04017] Updated weights for policy 0, policy_version 39015 (0.0014) [2024-03-21 06:21:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 43144.5, 300 sec: 45542.0). Total num frames: 1278607360. Throughput: 0: 46366.6. Samples: 1279859500. Policy #0 lag: (min: 0.0, avg: 36.2, max: 95.0) [2024-03-21 06:21:00,522][03784] Avg episode reward: [(0, '1.422')] [2024-03-21 06:21:02,548][04017] Updated weights for policy 0, policy_version 39025 (0.0020) [2024-03-21 06:21:05,521][03784] Fps is (10 sec: 58981.9, 60 sec: 43690.6, 300 sec: 45430.9). Total num frames: 1278869504. Throughput: 0: 46048.8. Samples: 1280114400. Policy #0 lag: (min: 0.0, avg: 36.2, max: 95.0) [2024-03-21 06:21:05,522][03784] Avg episode reward: [(0, '0.936')] [2024-03-21 06:21:08,007][04017] Updated weights for policy 0, policy_version 39035 (0.0014) [2024-03-21 06:21:10,521][03784] Fps is (10 sec: 62259.8, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 1279229952. Throughput: 0: 45257.8. Samples: 1280343900. 
Policy #0 lag: (min: 0.0, avg: 37.8, max: 112.0) [2024-03-21 06:21:10,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 06:21:15,521][03784] Fps is (10 sec: 49152.2, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1279361024. Throughput: 0: 45415.6. Samples: 1280490600. Policy #0 lag: (min: 0.0, avg: 37.8, max: 112.0) [2024-03-21 06:21:15,522][03784] Avg episode reward: [(0, '1.333')] [2024-03-21 06:21:16,758][04017] Updated weights for policy 0, policy_version 39045 (0.0019) [2024-03-21 06:21:17,747][03995] Signal inference workers to stop experience collection... (25750 times) [2024-03-21 06:21:17,811][03995] Signal inference workers to resume experience collection... (25750 times) [2024-03-21 06:21:17,823][04017] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-03-21 06:21:17,869][04017] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-03-21 06:21:20,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 45097.7). Total num frames: 1279721472. Throughput: 0: 45300.0. Samples: 1280756700. Policy #0 lag: (min: 0.0, avg: 37.8, max: 112.0) [2024-03-21 06:21:20,522][03784] Avg episode reward: [(0, '1.548')] [2024-03-21 06:21:22,121][04017] Updated weights for policy 0, policy_version 39055 (0.0013) [2024-03-21 06:21:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1279819776. Throughput: 0: 45064.4. Samples: 1281035600. Policy #0 lag: (min: 0.0, avg: 37.8, max: 112.0) [2024-03-21 06:21:25,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 06:21:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1280049152. Throughput: 0: 44649.0. Samples: 1281174700. 
Policy #0 lag: (min: 0.0, avg: 37.8, max: 112.0) [2024-03-21 06:21:30,522][03784] Avg episode reward: [(0, '1.008')] [2024-03-21 06:21:31,081][04017] Updated weights for policy 0, policy_version 39065 (0.0012) [2024-03-21 06:21:35,521][03784] Fps is (10 sec: 49152.1, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1280311296. Throughput: 0: 44528.9. Samples: 1281453400. Policy #0 lag: (min: 0.0, avg: 51.8, max: 109.0) [2024-03-21 06:21:35,522][03784] Avg episode reward: [(0, '0.508')] [2024-03-21 06:21:36,386][04017] Updated weights for policy 0, policy_version 39075 (0.0016) [2024-03-21 06:21:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1280507904. Throughput: 0: 44568.9. Samples: 1281724400. Policy #0 lag: (min: 0.0, avg: 51.8, max: 109.0) [2024-03-21 06:21:40,522][03784] Avg episode reward: [(0, '1.387')] [2024-03-21 06:21:45,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1280606208. Throughput: 0: 44673.4. Samples: 1281869800. Policy #0 lag: (min: 0.0, avg: 51.8, max: 109.0) [2024-03-21 06:21:45,522][03784] Avg episode reward: [(0, '1.387')] [2024-03-21 06:21:47,444][04017] Updated weights for policy 0, policy_version 39085 (0.0014) [2024-03-21 06:21:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 1280868352. Throughput: 0: 44489.0. Samples: 1282116400. Policy #0 lag: (min: 0.0, avg: 51.8, max: 109.0) [2024-03-21 06:21:50,522][03784] Avg episode reward: [(0, '1.185')] [2024-03-21 06:21:54,379][04017] Updated weights for policy 0, policy_version 39095 (0.0011) [2024-03-21 06:21:55,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1281097728. Throughput: 0: 45280.0. Samples: 1282381500. 
Policy #0 lag: (min: 0.0, avg: 51.8, max: 109.0) [2024-03-21 06:21:55,522][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 06:22:00,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 1281294336. Throughput: 0: 45177.8. Samples: 1282523600. Policy #0 lag: (min: 1.0, avg: 42.0, max: 108.0) [2024-03-21 06:22:00,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 06:22:00,649][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039103_1281327104.pth... [2024-03-21 06:22:00,773][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038771_1270448128.pth [2024-03-21 06:22:02,749][04017] Updated weights for policy 0, policy_version 39105 (0.0010) [2024-03-21 06:22:05,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 45097.7). Total num frames: 1281523712. Throughput: 0: 45777.8. Samples: 1282816700. Policy #0 lag: (min: 1.0, avg: 42.0, max: 108.0) [2024-03-21 06:22:05,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 06:22:10,149][04017] Updated weights for policy 0, policy_version 39115 (0.0016) [2024-03-21 06:22:10,521][03784] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 44875.5). Total num frames: 1281753088. Throughput: 0: 45491.0. Samples: 1283082700. Policy #0 lag: (min: 1.0, avg: 42.0, max: 108.0) [2024-03-21 06:22:10,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 06:22:14,578][04017] Updated weights for policy 0, policy_version 39125 (0.0010) [2024-03-21 06:22:15,269][03995] Signal inference workers to stop experience collection... (25800 times) [2024-03-21 06:22:15,332][03995] Signal inference workers to resume experience collection... 
(25800 times) [2024-03-21 06:22:15,384][04017] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-03-21 06:22:15,434][04017] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-03-21 06:22:15,521][03784] Fps is (10 sec: 58982.0, 60 sec: 45875.2, 300 sec: 45097.6). Total num frames: 1282113536. Throughput: 0: 45146.6. Samples: 1283206300. Policy #0 lag: (min: 1.0, avg: 42.0, max: 108.0) [2024-03-21 06:22:15,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 06:22:20,521][03784] Fps is (10 sec: 58983.0, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1282342912. Throughput: 0: 44902.2. Samples: 1283474000. Policy #0 lag: (min: 1.0, avg: 42.0, max: 108.0) [2024-03-21 06:22:20,522][03784] Avg episode reward: [(0, '1.508')] [2024-03-21 06:22:20,841][04017] Updated weights for policy 0, policy_version 39135 (0.0015) [2024-03-21 06:22:25,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1282539520. Throughput: 0: 45142.2. Samples: 1283755800. Policy #0 lag: (min: 1.0, avg: 42.0, max: 108.0) [2024-03-21 06:22:25,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 06:22:30,140][04017] Updated weights for policy 0, policy_version 39145 (0.0011) [2024-03-21 06:22:30,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1282703360. Throughput: 0: 45208.9. Samples: 1283904200. Policy #0 lag: (min: 0.0, avg: 45.6, max: 116.0) [2024-03-21 06:22:30,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 06:22:35,521][03784] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 1282965504. Throughput: 0: 46182.3. Samples: 1284194600. 
Policy #0 lag: (min: 0.0, avg: 45.6, max: 116.0) [2024-03-21 06:22:35,522][03784] Avg episode reward: [(0, '0.816')] [2024-03-21 06:22:37,276][04017] Updated weights for policy 0, policy_version 39155 (0.0019) [2024-03-21 06:22:40,521][03784] Fps is (10 sec: 45875.1, 60 sec: 44236.7, 300 sec: 45875.2). Total num frames: 1283162112. Throughput: 0: 46606.6. Samples: 1284478800. Policy #0 lag: (min: 0.0, avg: 45.6, max: 116.0) [2024-03-21 06:22:40,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 06:22:45,301][04017] Updated weights for policy 0, policy_version 39165 (0.0015) [2024-03-21 06:22:45,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45875.3, 300 sec: 45097.7). Total num frames: 1283358720. Throughput: 0: 46615.6. Samples: 1284621300. Policy #0 lag: (min: 0.0, avg: 45.6, max: 116.0) [2024-03-21 06:22:45,522][03784] Avg episode reward: [(0, '1.554')] [2024-03-21 06:22:50,521][03784] Fps is (10 sec: 36045.0, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1283522560. Throughput: 0: 46337.7. Samples: 1284901900. Policy #0 lag: (min: 0.0, avg: 45.6, max: 116.0) [2024-03-21 06:22:50,522][03784] Avg episode reward: [(0, '1.554')] [2024-03-21 06:22:52,711][04017] Updated weights for policy 0, policy_version 39175 (0.0016) [2024-03-21 06:22:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1283883008. Throughput: 0: 46389.0. Samples: 1285170200. Policy #0 lag: (min: 1.0, avg: 35.5, max: 76.0) [2024-03-21 06:22:55,522][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 06:22:56,754][04017] Updated weights for policy 0, policy_version 39185 (0.0016) [2024-03-21 06:23:00,521][03784] Fps is (10 sec: 62259.3, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 1284145152. Throughput: 0: 46402.3. Samples: 1285294400. 
Policy #0 lag: (min: 1.0, avg: 35.5, max: 76.0) [2024-03-21 06:23:00,522][03784] Avg episode reward: [(0, '1.057')] [2024-03-21 06:23:05,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1284276224. Throughput: 0: 46406.7. Samples: 1285562300. Policy #0 lag: (min: 1.0, avg: 35.5, max: 76.0) [2024-03-21 06:23:05,522][03784] Avg episode reward: [(0, '1.074')] [2024-03-21 06:23:07,584][04017] Updated weights for policy 0, policy_version 39195 (0.0018) [2024-03-21 06:23:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1284440064. Throughput: 0: 45406.7. Samples: 1285799100. Policy #0 lag: (min: 1.0, avg: 35.5, max: 76.0) [2024-03-21 06:23:10,522][03784] Avg episode reward: [(0, '0.912')] [2024-03-21 06:23:15,504][04017] Updated weights for policy 0, policy_version 39205 (0.0012) [2024-03-21 06:23:15,521][03784] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1284669440. Throughput: 0: 45362.2. Samples: 1285945500. Policy #0 lag: (min: 1.0, avg: 35.5, max: 76.0) [2024-03-21 06:23:15,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 06:23:20,438][03995] Signal inference workers to stop experience collection... (25850 times) [2024-03-21 06:23:20,521][03784] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 45430.9). Total num frames: 1284964352. Throughput: 0: 44746.5. Samples: 1286208200. Policy #0 lag: (min: 0.0, avg: 28.9, max: 60.0) [2024-03-21 06:23:20,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 06:23:20,592][04017] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-03-21 06:23:20,667][03995] Signal inference workers to resume experience collection... 
(25850 times) [2024-03-21 06:23:20,668][04017] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-03-21 06:23:20,672][04017] Updated weights for policy 0, policy_version 39215 (0.0018) [2024-03-21 06:23:25,013][04017] Updated weights for policy 0, policy_version 39225 (0.0012) [2024-03-21 06:23:25,521][03784] Fps is (10 sec: 68812.6, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 1285357568. Throughput: 0: 44042.2. Samples: 1286460700. Policy #0 lag: (min: 0.0, avg: 28.9, max: 60.0) [2024-03-21 06:23:25,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 06:23:30,521][03784] Fps is (10 sec: 49152.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1285455872. Throughput: 0: 43775.5. Samples: 1286591200. Policy #0 lag: (min: 0.0, avg: 28.9, max: 60.0) [2024-03-21 06:23:30,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 06:23:35,521][03784] Fps is (10 sec: 26214.7, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1285619712. Throughput: 0: 44066.7. Samples: 1286884900. Policy #0 lag: (min: 0.0, avg: 28.9, max: 60.0) [2024-03-21 06:23:35,522][03784] Avg episode reward: [(0, '0.924')] [2024-03-21 06:23:36,241][04017] Updated weights for policy 0, policy_version 39235 (0.0012) [2024-03-21 06:23:40,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1285881856. Throughput: 0: 44351.0. Samples: 1287166000. Policy #0 lag: (min: 0.0, avg: 28.9, max: 60.0) [2024-03-21 06:23:40,522][03784] Avg episode reward: [(0, '1.388')] [2024-03-21 06:23:44,168][04017] Updated weights for policy 0, policy_version 39245 (0.0012) [2024-03-21 06:23:45,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1286012928. Throughput: 0: 44775.5. Samples: 1287309300. 
Policy #0 lag: (min: 0.0, avg: 31.9, max: 71.0) [2024-03-21 06:23:45,522][03784] Avg episode reward: [(0, '1.267')] [2024-03-21 06:23:48,785][04017] Updated weights for policy 0, policy_version 39255 (0.0020) [2024-03-21 06:23:50,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.4, 300 sec: 44986.6). Total num frames: 1286340608. Throughput: 0: 44808.8. Samples: 1287578700. Policy #0 lag: (min: 0.0, avg: 31.9, max: 71.0) [2024-03-21 06:23:50,522][03784] Avg episode reward: [(0, '1.390')] [2024-03-21 06:23:55,521][03784] Fps is (10 sec: 55705.1, 60 sec: 44782.8, 300 sec: 45097.6). Total num frames: 1286569984. Throughput: 0: 45895.4. Samples: 1287864400. Policy #0 lag: (min: 0.0, avg: 31.9, max: 71.0) [2024-03-21 06:23:55,522][03784] Avg episode reward: [(0, '1.390')] [2024-03-21 06:23:56,502][04017] Updated weights for policy 0, policy_version 39265 (0.0012) [2024-03-21 06:24:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 45097.6). Total num frames: 1286733824. Throughput: 0: 45882.2. Samples: 1288010200. Policy #0 lag: (min: 0.0, avg: 31.9, max: 71.0) [2024-03-21 06:24:00,522][03784] Avg episode reward: [(0, '1.059')] [2024-03-21 06:24:00,706][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039269_1286766592.pth... [2024-03-21 06:24:00,831][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000038942_1276051456.pth [2024-03-21 06:24:03,571][04017] Updated weights for policy 0, policy_version 39275 (0.0012) [2024-03-21 06:24:05,521][03784] Fps is (10 sec: 49152.8, 60 sec: 46421.3, 300 sec: 45653.1). Total num frames: 1287061504. Throughput: 0: 45702.4. Samples: 1288264800. Policy #0 lag: (min: 0.0, avg: 31.9, max: 71.0) [2024-03-21 06:24:05,522][03784] Avg episode reward: [(0, '0.429')] [2024-03-21 06:24:08,096][03995] Signal inference workers to stop experience collection... (25900 times) [2024-03-21 06:24:08,096][03995] Signal inference workers to resume experience collection... 
(25900 times) [2024-03-21 06:24:08,183][04017] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-03-21 06:24:08,183][04017] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-03-21 06:24:10,514][04017] Updated weights for policy 0, policy_version 39285 (0.0015) [2024-03-21 06:24:10,521][03784] Fps is (10 sec: 55706.2, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 1287290880. Throughput: 0: 46269.0. Samples: 1288542800. Policy #0 lag: (min: 3.0, avg: 42.7, max: 122.0) [2024-03-21 06:24:10,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 06:24:15,521][03784] Fps is (10 sec: 39320.9, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1287454720. Throughput: 0: 46608.8. Samples: 1288688600. Policy #0 lag: (min: 3.0, avg: 42.7, max: 122.0) [2024-03-21 06:24:15,522][03784] Avg episode reward: [(0, '1.463')] [2024-03-21 06:24:18,344][04017] Updated weights for policy 0, policy_version 39295 (0.0022) [2024-03-21 06:24:20,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44783.0, 300 sec: 45209.6). Total num frames: 1287651328. Throughput: 0: 46051.1. Samples: 1288957200. Policy #0 lag: (min: 3.0, avg: 42.7, max: 122.0) [2024-03-21 06:24:20,522][03784] Avg episode reward: [(0, '1.463')] [2024-03-21 06:24:25,521][03784] Fps is (10 sec: 36045.2, 60 sec: 40960.0, 300 sec: 45097.7). Total num frames: 1287815168. Throughput: 0: 45526.7. Samples: 1289214700. Policy #0 lag: (min: 3.0, avg: 42.7, max: 122.0) [2024-03-21 06:24:25,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 06:24:29,520][04017] Updated weights for policy 0, policy_version 39305 (0.0020) [2024-03-21 06:24:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 45097.6). Total num frames: 1288011776. Throughput: 0: 45442.2. Samples: 1289354200. 
Policy #0 lag: (min: 3.0, avg: 42.7, max: 122.0) [2024-03-21 06:24:30,522][03784] Avg episode reward: [(0, '0.701')] [2024-03-21 06:24:35,521][03784] Fps is (10 sec: 29491.2, 60 sec: 41506.1, 300 sec: 43987.7). Total num frames: 1288110080. Throughput: 0: 45700.0. Samples: 1289635200. Policy #0 lag: (min: 0.0, avg: 38.5, max: 78.0) [2024-03-21 06:24:35,522][03784] Avg episode reward: [(0, '0.580')] [2024-03-21 06:24:38,344][04017] Updated weights for policy 0, policy_version 39315 (0.0014) [2024-03-21 06:24:40,521][03784] Fps is (10 sec: 36044.4, 60 sec: 41506.1, 300 sec: 43986.8). Total num frames: 1288372224. Throughput: 0: 45188.8. Samples: 1289897900. Policy #0 lag: (min: 0.0, avg: 38.5, max: 78.0) [2024-03-21 06:24:40,523][03784] Avg episode reward: [(0, '0.578')] [2024-03-21 06:24:43,595][04017] Updated weights for policy 0, policy_version 39325 (0.0011) [2024-03-21 06:24:45,521][03784] Fps is (10 sec: 58982.5, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1288699904. Throughput: 0: 44804.5. Samples: 1290026400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 78.0) [2024-03-21 06:24:45,522][03784] Avg episode reward: [(0, '1.465')] [2024-03-21 06:24:49,281][04017] Updated weights for policy 0, policy_version 39335 (0.0016) [2024-03-21 06:24:50,521][03784] Fps is (10 sec: 65536.8, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 1289027584. Throughput: 0: 45079.9. Samples: 1290293400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 78.0) [2024-03-21 06:24:50,522][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 06:24:53,755][04017] Updated weights for policy 0, policy_version 39345 (0.0014) [2024-03-21 06:24:55,521][03784] Fps is (10 sec: 65536.1, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1289355264. Throughput: 0: 44042.2. Samples: 1290524700. 
Policy #0 lag: (min: 0.0, avg: 38.5, max: 78.0) [2024-03-21 06:24:55,522][03784] Avg episode reward: [(0, '1.173')] [2024-03-21 06:24:59,329][04017] Updated weights for policy 0, policy_version 39355 (0.0012) [2024-03-21 06:24:59,733][03995] Signal inference workers to stop experience collection... (25950 times) [2024-03-21 06:24:59,734][03995] Signal inference workers to resume experience collection... (25950 times) [2024-03-21 06:24:59,808][04017] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-03-21 06:24:59,808][04017] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-03-21 06:25:00,521][03784] Fps is (10 sec: 65536.0, 60 sec: 49152.0, 300 sec: 45542.0). Total num frames: 1289682944. Throughput: 0: 43760.1. Samples: 1290657800. Policy #0 lag: (min: 0.0, avg: 40.2, max: 78.0) [2024-03-21 06:25:00,522][03784] Avg episode reward: [(0, '0.768')] [2024-03-21 06:25:05,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1289748480. Throughput: 0: 44211.1. Samples: 1290946700. Policy #0 lag: (min: 0.0, avg: 40.2, max: 78.0) [2024-03-21 06:25:05,522][03784] Avg episode reward: [(0, '0.516')] [2024-03-21 06:25:10,163][04017] Updated weights for policy 0, policy_version 39365 (0.0020) [2024-03-21 06:25:10,521][03784] Fps is (10 sec: 22937.7, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1289912320. Throughput: 0: 44562.3. Samples: 1291220000. Policy #0 lag: (min: 0.0, avg: 40.2, max: 78.0) [2024-03-21 06:25:10,522][03784] Avg episode reward: [(0, '1.372')] [2024-03-21 06:25:15,521][03784] Fps is (10 sec: 26214.6, 60 sec: 42598.5, 300 sec: 44764.4). Total num frames: 1290010624. Throughput: 0: 44542.3. Samples: 1291358600. 
Policy #0 lag: (min: 0.0, avg: 40.2, max: 78.0) [2024-03-21 06:25:15,522][03784] Avg episode reward: [(0, '0.721')] [2024-03-21 06:25:19,117][04017] Updated weights for policy 0, policy_version 39375 (0.0016) [2024-03-21 06:25:20,521][03784] Fps is (10 sec: 39321.3, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1290305536. Throughput: 0: 44277.7. Samples: 1291627700. Policy #0 lag: (min: 0.0, avg: 40.2, max: 78.0) [2024-03-21 06:25:20,522][03784] Avg episode reward: [(0, '1.499')] [2024-03-21 06:25:25,019][04017] Updated weights for policy 0, policy_version 39385 (0.0016) [2024-03-21 06:25:25,521][03784] Fps is (10 sec: 55705.4, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1290567680. Throughput: 0: 44757.9. Samples: 1291912000. Policy #0 lag: (min: 0.0, avg: 27.9, max: 66.0) [2024-03-21 06:25:25,522][03784] Avg episode reward: [(0, '1.499')] [2024-03-21 06:25:30,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1290731520. Throughput: 0: 45148.9. Samples: 1292058100. Policy #0 lag: (min: 0.0, avg: 27.9, max: 66.0) [2024-03-21 06:25:30,522][03784] Avg episode reward: [(0, '0.605')] [2024-03-21 06:25:35,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 1290862592. Throughput: 0: 45415.5. Samples: 1292337100. Policy #0 lag: (min: 0.0, avg: 27.9, max: 66.0) [2024-03-21 06:25:35,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-21 06:25:35,785][04017] Updated weights for policy 0, policy_version 39395 (0.0011) [2024-03-21 06:25:40,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46421.4, 300 sec: 44986.6). Total num frames: 1291157504. Throughput: 0: 46208.8. Samples: 1292604100. 
Policy #0 lag: (min: 0.0, avg: 27.9, max: 66.0) [2024-03-21 06:25:40,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-21 06:25:41,098][04017] Updated weights for policy 0, policy_version 39405 (0.0011) [2024-03-21 06:25:44,792][04017] Updated weights for policy 0, policy_version 39415 (0.0016) [2024-03-21 06:25:45,521][03784] Fps is (10 sec: 75367.2, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 1291616256. Throughput: 0: 45935.6. Samples: 1292724900. Policy #0 lag: (min: 0.0, avg: 27.9, max: 66.0) [2024-03-21 06:25:45,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 06:25:50,521][03784] Fps is (10 sec: 68813.1, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 1291845632. Throughput: 0: 45188.9. Samples: 1292980200. Policy #0 lag: (min: 0.0, avg: 39.9, max: 75.0) [2024-03-21 06:25:50,522][03784] Avg episode reward: [(0, '1.014')] [2024-03-21 06:25:50,823][04017] Updated weights for policy 0, policy_version 39425 (0.0013) [2024-03-21 06:25:55,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1292009472. Throughput: 0: 45026.7. Samples: 1293246200. Policy #0 lag: (min: 0.0, avg: 39.9, max: 75.0) [2024-03-21 06:25:55,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 06:26:00,521][03784] Fps is (10 sec: 26214.4, 60 sec: 40413.9, 300 sec: 44875.5). Total num frames: 1292107776. Throughput: 0: 45437.7. Samples: 1293403300. Policy #0 lag: (min: 0.0, avg: 39.9, max: 75.0) [2024-03-21 06:26:00,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 06:26:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039432_1292107776.pth... [2024-03-21 06:26:00,703][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039103_1281327104.pth [2024-03-21 06:26:02,428][04017] Updated weights for policy 0, policy_version 39435 (0.0011) [2024-03-21 06:26:02,459][03995] Signal inference workers to stop experience collection... 
(26000 times) [2024-03-21 06:26:02,460][03995] Signal inference workers to resume experience collection... (26000 times) [2024-03-21 06:26:02,528][04017] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-03-21 06:26:02,534][04017] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-03-21 06:26:05,521][03784] Fps is (10 sec: 36044.4, 60 sec: 43690.6, 300 sec: 44542.2). Total num frames: 1292369920. Throughput: 0: 45324.4. Samples: 1293667300. Policy #0 lag: (min: 0.0, avg: 39.9, max: 75.0) [2024-03-21 06:26:05,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 06:26:10,162][04017] Updated weights for policy 0, policy_version 39445 (0.0012) [2024-03-21 06:26:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1292533760. Throughput: 0: 44888.9. Samples: 1293932000. Policy #0 lag: (min: 0.0, avg: 39.9, max: 75.0) [2024-03-21 06:26:10,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 06:26:15,521][03784] Fps is (10 sec: 42598.9, 60 sec: 46421.3, 300 sec: 44320.1). Total num frames: 1292795904. Throughput: 0: 44833.3. Samples: 1294075600. Policy #0 lag: (min: 3.0, avg: 64.9, max: 112.0) [2024-03-21 06:26:15,522][03784] Avg episode reward: [(0, '1.349')] [2024-03-21 06:26:18,435][04017] Updated weights for policy 0, policy_version 39455 (0.0025) [2024-03-21 06:26:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1292894208. Throughput: 0: 44820.0. Samples: 1294354000. Policy #0 lag: (min: 3.0, avg: 64.9, max: 112.0) [2024-03-21 06:26:20,522][03784] Avg episode reward: [(0, '0.598')] [2024-03-21 06:26:25,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 1293156352. Throughput: 0: 44169.0. Samples: 1294591700. 
Policy #0 lag: (min: 3.0, avg: 64.9, max: 112.0) [2024-03-21 06:26:25,522][03784] Avg episode reward: [(0, '1.457')] [2024-03-21 06:26:25,654][04017] Updated weights for policy 0, policy_version 39465 (0.0017) [2024-03-21 06:26:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 1293320192. Throughput: 0: 44186.6. Samples: 1294713300. Policy #0 lag: (min: 3.0, avg: 64.9, max: 112.0) [2024-03-21 06:26:30,522][03784] Avg episode reward: [(0, '1.272')] [2024-03-21 06:26:32,571][04017] Updated weights for policy 0, policy_version 39475 (0.0012) [2024-03-21 06:26:35,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 1293582336. Throughput: 0: 44482.3. Samples: 1294981900. Policy #0 lag: (min: 3.0, avg: 64.9, max: 112.0) [2024-03-21 06:26:35,522][03784] Avg episode reward: [(0, '0.911')] [2024-03-21 06:26:38,229][04017] Updated weights for policy 0, policy_version 39485 (0.0021) [2024-03-21 06:26:40,521][03784] Fps is (10 sec: 62259.0, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 1293942784. Throughput: 0: 43768.8. Samples: 1295215800. Policy #0 lag: (min: 0.0, avg: 43.2, max: 94.0) [2024-03-21 06:26:40,522][03784] Avg episode reward: [(0, '1.024')] [2024-03-21 06:26:43,738][04017] Updated weights for policy 0, policy_version 39495 (0.0015) [2024-03-21 06:26:45,521][03784] Fps is (10 sec: 68811.8, 60 sec: 44236.7, 300 sec: 45430.9). Total num frames: 1294270464. Throughput: 0: 43177.7. Samples: 1295346300. Policy #0 lag: (min: 0.0, avg: 43.2, max: 94.0) [2024-03-21 06:26:45,522][03784] Avg episode reward: [(0, '1.349')] [2024-03-21 06:26:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1294467072. Throughput: 0: 43635.5. Samples: 1295630900. 
Policy #0 lag: (min: 0.0, avg: 43.2, max: 94.0) [2024-03-21 06:26:50,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 06:26:50,694][04017] Updated weights for policy 0, policy_version 39505 (0.0011) [2024-03-21 06:26:55,521][03784] Fps is (10 sec: 36045.0, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 1294630912. Throughput: 0: 43866.6. Samples: 1295906000. Policy #0 lag: (min: 0.0, avg: 43.2, max: 94.0) [2024-03-21 06:26:55,522][03784] Avg episode reward: [(0, '0.831')] [2024-03-21 06:27:00,521][03784] Fps is (10 sec: 26214.7, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1294729216. Throughput: 0: 44097.8. Samples: 1296060000. Policy #0 lag: (min: 0.0, avg: 43.2, max: 94.0) [2024-03-21 06:27:00,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 06:27:02,417][03995] Signal inference workers to stop experience collection... (26050 times) [2024-03-21 06:27:02,489][04017] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-03-21 06:27:02,765][03995] Signal inference workers to resume experience collection... (26050 times) [2024-03-21 06:27:02,765][04017] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-03-21 06:27:03,751][04017] Updated weights for policy 0, policy_version 39515 (0.0014) [2024-03-21 06:27:05,521][03784] Fps is (10 sec: 29491.0, 60 sec: 42598.4, 300 sec: 44653.3). Total num frames: 1294925824. Throughput: 0: 44351.0. Samples: 1296349800. Policy #0 lag: (min: 0.0, avg: 43.2, max: 94.0) [2024-03-21 06:27:05,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 06:27:07,840][04017] Updated weights for policy 0, policy_version 39525 (0.0013) [2024-03-21 06:27:10,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 1295253504. Throughput: 0: 44777.7. Samples: 1296606700. 
Policy #0 lag: (min: 3.0, avg: 39.9, max: 73.0) [2024-03-21 06:27:10,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 06:27:13,953][04017] Updated weights for policy 0, policy_version 39535 (0.0016) [2024-03-21 06:27:15,521][03784] Fps is (10 sec: 58982.5, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1295515648. Throughput: 0: 45191.1. Samples: 1296746900. Policy #0 lag: (min: 3.0, avg: 39.9, max: 73.0) [2024-03-21 06:27:15,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 06:27:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.5, 300 sec: 44653.3). Total num frames: 1295712256. Throughput: 0: 44873.2. Samples: 1297001200. Policy #0 lag: (min: 3.0, avg: 39.9, max: 73.0) [2024-03-21 06:27:20,522][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 06:27:22,105][04017] Updated weights for policy 0, policy_version 39545 (0.0012) [2024-03-21 06:27:25,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 1295843328. Throughput: 0: 45662.3. Samples: 1297270600. Policy #0 lag: (min: 3.0, avg: 39.9, max: 73.0) [2024-03-21 06:27:25,522][03784] Avg episode reward: [(0, '1.462')] [2024-03-21 06:27:29,776][04017] Updated weights for policy 0, policy_version 39555 (0.0015) [2024-03-21 06:27:30,521][03784] Fps is (10 sec: 49151.6, 60 sec: 48059.7, 300 sec: 44875.5). Total num frames: 1296203776. Throughput: 0: 45606.6. Samples: 1297398600. Policy #0 lag: (min: 3.0, avg: 39.9, max: 73.0) [2024-03-21 06:27:30,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-21 06:27:35,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1296367616. Throughput: 0: 45640.1. Samples: 1297684700. 
Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 06:27:35,522][03784] Avg episode reward: [(0, '1.440')] [2024-03-21 06:27:39,273][04017] Updated weights for policy 0, policy_version 39565 (0.0010) [2024-03-21 06:27:40,521][03784] Fps is (10 sec: 32768.3, 60 sec: 43144.6, 300 sec: 44653.3). Total num frames: 1296531456. Throughput: 0: 45795.6. Samples: 1297966800. Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 06:27:40,522][03784] Avg episode reward: [(0, '0.706')] [2024-03-21 06:27:45,521][03784] Fps is (10 sec: 32767.9, 60 sec: 40413.9, 300 sec: 44653.3). Total num frames: 1296695296. Throughput: 0: 45268.9. Samples: 1298097100. Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 06:27:45,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 06:27:46,408][04017] Updated weights for policy 0, policy_version 39575 (0.0011) [2024-03-21 06:27:50,521][03784] Fps is (10 sec: 55705.8, 60 sec: 43690.8, 300 sec: 44764.4). Total num frames: 1297088512. Throughput: 0: 43635.7. Samples: 1298313400. Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 06:27:50,522][03784] Avg episode reward: [(0, '1.499')] [2024-03-21 06:27:51,166][04017] Updated weights for policy 0, policy_version 39585 (0.0020) [2024-03-21 06:27:55,521][03784] Fps is (10 sec: 55705.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1297252352. Throughput: 0: 44302.2. Samples: 1298600300. Policy #0 lag: (min: 0.0, avg: 38.2, max: 81.0) [2024-03-21 06:27:55,522][03784] Avg episode reward: [(0, '1.592')] [2024-03-21 06:27:57,140][03995] Signal inference workers to stop experience collection... (26100 times) [2024-03-21 06:27:57,244][04017] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-03-21 06:27:57,265][03995] Signal inference workers to resume experience collection... 
(26100 times) [2024-03-21 06:27:57,306][04017] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-03-21 06:27:58,594][04017] Updated weights for policy 0, policy_version 39595 (0.0011) [2024-03-21 06:28:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48059.7, 300 sec: 45208.7). Total num frames: 1297612800. Throughput: 0: 44331.1. Samples: 1298741800. Policy #0 lag: (min: 3.0, avg: 36.8, max: 103.0) [2024-03-21 06:28:00,522][03784] Avg episode reward: [(0, '1.036')] [2024-03-21 06:28:00,664][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039601_1297645568.pth... [2024-03-21 06:28:00,782][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039269_1286766592.pth [2024-03-21 06:28:03,254][04017] Updated weights for policy 0, policy_version 39605 (0.0016) [2024-03-21 06:28:05,521][03784] Fps is (10 sec: 68812.9, 60 sec: 50244.4, 300 sec: 45764.1). Total num frames: 1297940480. Throughput: 0: 44680.1. Samples: 1299011800. Policy #0 lag: (min: 3.0, avg: 36.8, max: 103.0) [2024-03-21 06:28:05,522][03784] Avg episode reward: [(0, '1.691')] [2024-03-21 06:28:10,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1298006016. Throughput: 0: 44960.0. Samples: 1299293800. Policy #0 lag: (min: 3.0, avg: 36.8, max: 103.0) [2024-03-21 06:28:10,522][03784] Avg episode reward: [(0, '1.644')] [2024-03-21 06:28:12,523][04017] Updated weights for policy 0, policy_version 39615 (0.0012) [2024-03-21 06:28:15,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 1298169856. Throughput: 0: 45144.5. Samples: 1299430100. Policy #0 lag: (min: 3.0, avg: 36.8, max: 103.0) [2024-03-21 06:28:15,522][03784] Avg episode reward: [(0, '0.672')] [2024-03-21 06:28:20,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1298366464. Throughput: 0: 44702.2. Samples: 1299696300. 
Policy #0 lag: (min: 3.0, avg: 36.8, max: 103.0) [2024-03-21 06:28:20,522][03784] Avg episode reward: [(0, '1.449')] [2024-03-21 06:28:21,956][04017] Updated weights for policy 0, policy_version 39625 (0.0011) [2024-03-21 06:28:25,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.5, 300 sec: 44764.4). Total num frames: 1298661376. Throughput: 0: 44568.9. Samples: 1299972400. Policy #0 lag: (min: 0.0, avg: 26.6, max: 86.0) [2024-03-21 06:28:25,522][03784] Avg episode reward: [(0, '1.232')] [2024-03-21 06:28:27,289][04017] Updated weights for policy 0, policy_version 39635 (0.0012) [2024-03-21 06:28:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 44542.3). Total num frames: 1298759680. Throughput: 0: 44844.5. Samples: 1300115100. Policy #0 lag: (min: 0.0, avg: 26.6, max: 86.0) [2024-03-21 06:28:30,522][03784] Avg episode reward: [(0, '0.708')] [2024-03-21 06:28:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1299021824. Throughput: 0: 45860.0. Samples: 1300377100. Policy #0 lag: (min: 0.0, avg: 26.6, max: 86.0) [2024-03-21 06:28:35,522][03784] Avg episode reward: [(0, '1.172')] [2024-03-21 06:28:36,117][04017] Updated weights for policy 0, policy_version 39645 (0.0015) [2024-03-21 06:28:40,521][03784] Fps is (10 sec: 62258.5, 60 sec: 47513.5, 300 sec: 45319.8). Total num frames: 1299382272. Throughput: 0: 45395.4. Samples: 1300643100. Policy #0 lag: (min: 0.0, avg: 26.6, max: 86.0) [2024-03-21 06:28:40,531][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 06:28:40,935][04017] Updated weights for policy 0, policy_version 39655 (0.0017) [2024-03-21 06:28:45,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.4, 300 sec: 44542.3). Total num frames: 1299480576. Throughput: 0: 45462.3. Samples: 1300787600. 
Policy #0 lag: (min: 0.0, avg: 26.6, max: 86.0) [2024-03-21 06:28:45,530][03784] Avg episode reward: [(0, '1.476')] [2024-03-21 06:28:49,467][03995] Signal inference workers to stop experience collection... (26150 times) [2024-03-21 06:28:49,468][03995] Signal inference workers to resume experience collection... (26150 times) [2024-03-21 06:28:49,527][04017] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-03-21 06:28:49,527][04017] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-03-21 06:28:50,521][03784] Fps is (10 sec: 32768.5, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1299709952. Throughput: 0: 45673.3. Samples: 1301067100. Policy #0 lag: (min: 1.0, avg: 30.9, max: 68.0) [2024-03-21 06:28:50,530][03784] Avg episode reward: [(0, '1.656')] [2024-03-21 06:28:51,340][04017] Updated weights for policy 0, policy_version 39665 (0.0016) [2024-03-21 06:28:55,521][03784] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1300004864. Throughput: 0: 44826.7. Samples: 1301311000. Policy #0 lag: (min: 1.0, avg: 30.9, max: 68.0) [2024-03-21 06:28:55,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 06:28:56,046][04017] Updated weights for policy 0, policy_version 39675 (0.0018) [2024-03-21 06:29:00,521][03784] Fps is (10 sec: 58981.7, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1300299776. Throughput: 0: 44982.1. Samples: 1301454300. Policy #0 lag: (min: 1.0, avg: 30.9, max: 68.0) [2024-03-21 06:29:00,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 06:29:01,779][04017] Updated weights for policy 0, policy_version 39685 (0.0015) [2024-03-21 06:29:05,521][03784] Fps is (10 sec: 55705.6, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1300561920. Throughput: 0: 45346.7. Samples: 1301736900. 
Policy #0 lag: (min: 1.0, avg: 30.9, max: 68.0) [2024-03-21 06:29:05,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 06:29:09,158][04017] Updated weights for policy 0, policy_version 39695 (0.0011) [2024-03-21 06:29:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 1300725760. Throughput: 0: 45373.3. Samples: 1302014200. Policy #0 lag: (min: 1.0, avg: 30.9, max: 68.0) [2024-03-21 06:29:10,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 06:29:15,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1300922368. Throughput: 0: 45400.0. Samples: 1302158100. Policy #0 lag: (min: 1.0, avg: 30.9, max: 68.0) [2024-03-21 06:29:15,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 06:29:20,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1300987904. Throughput: 0: 45868.9. Samples: 1302441200. Policy #0 lag: (min: 0.0, avg: 35.2, max: 80.0) [2024-03-21 06:29:20,522][03784] Avg episode reward: [(0, '1.088')] [2024-03-21 06:29:21,066][04017] Updated weights for policy 0, policy_version 39705 (0.0017) [2024-03-21 06:29:25,521][03784] Fps is (10 sec: 29491.3, 60 sec: 42598.4, 300 sec: 44764.4). Total num frames: 1301217280. Throughput: 0: 45435.7. Samples: 1302687700. Policy #0 lag: (min: 0.0, avg: 35.2, max: 80.0) [2024-03-21 06:29:25,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 06:29:27,297][04017] Updated weights for policy 0, policy_version 39715 (0.0012) [2024-03-21 06:29:30,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1301512192. Throughput: 0: 45204.4. Samples: 1302821800. 
Policy #0 lag: (min: 0.0, avg: 35.2, max: 80.0) [2024-03-21 06:29:30,522][03784] Avg episode reward: [(0, '0.732')] [2024-03-21 06:29:32,826][04017] Updated weights for policy 0, policy_version 39725 (0.0011) [2024-03-21 06:29:35,521][03784] Fps is (10 sec: 55705.5, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1301774336. Throughput: 0: 44884.4. Samples: 1303086900. Policy #0 lag: (min: 0.0, avg: 35.2, max: 80.0) [2024-03-21 06:29:35,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 06:29:39,640][04017] Updated weights for policy 0, policy_version 39735 (0.0016) [2024-03-21 06:29:40,521][03784] Fps is (10 sec: 55705.6, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1302069248. Throughput: 0: 45377.7. Samples: 1303353000. Policy #0 lag: (min: 0.0, avg: 35.2, max: 80.0) [2024-03-21 06:29:40,522][03784] Avg episode reward: [(0, '0.528')] [2024-03-21 06:29:42,547][03995] Signal inference workers to stop experience collection... (26200 times) [2024-03-21 06:29:42,615][04017] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-03-21 06:29:42,826][03995] Signal inference workers to resume experience collection... (26200 times) [2024-03-21 06:29:42,826][04017] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-03-21 06:29:43,866][04017] Updated weights for policy 0, policy_version 39745 (0.0011) [2024-03-21 06:29:45,521][03784] Fps is (10 sec: 68813.2, 60 sec: 49698.2, 300 sec: 45542.0). Total num frames: 1302462464. Throughput: 0: 44637.9. Samples: 1303463000. Policy #0 lag: (min: 4.0, avg: 63.5, max: 116.0) [2024-03-21 06:29:45,522][03784] Avg episode reward: [(0, '1.380')] [2024-03-21 06:29:50,521][03784] Fps is (10 sec: 49152.4, 60 sec: 47513.6, 300 sec: 44764.4). Total num frames: 1302560768. Throughput: 0: 44620.0. Samples: 1303744800. 
Policy #0 lag: (min: 4.0, avg: 63.5, max: 116.0) [2024-03-21 06:29:50,521][03784] Avg episode reward: [(0, '0.906')] [2024-03-21 06:29:55,521][03784] Fps is (10 sec: 19660.4, 60 sec: 44236.6, 300 sec: 43986.9). Total num frames: 1302659072. Throughput: 0: 44671.0. Samples: 1304024400. Policy #0 lag: (min: 4.0, avg: 63.5, max: 116.0) [2024-03-21 06:29:55,522][03784] Avg episode reward: [(0, '1.455')] [2024-03-21 06:29:55,867][04017] Updated weights for policy 0, policy_version 39755 (0.0010) [2024-03-21 06:30:00,521][03784] Fps is (10 sec: 26214.3, 60 sec: 42052.3, 300 sec: 44320.1). Total num frames: 1302822912. Throughput: 0: 44606.7. Samples: 1304165400. Policy #0 lag: (min: 4.0, avg: 63.5, max: 116.0) [2024-03-21 06:30:00,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 06:30:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039759_1302822912.pth... [2024-03-21 06:30:00,660][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039432_1292107776.pth [2024-03-21 06:30:03,224][04017] Updated weights for policy 0, policy_version 39765 (0.0015) [2024-03-21 06:30:05,521][03784] Fps is (10 sec: 36045.1, 60 sec: 40959.9, 300 sec: 44431.2). Total num frames: 1303019520. Throughput: 0: 44322.2. Samples: 1304435700. Policy #0 lag: (min: 4.0, avg: 63.5, max: 116.0) [2024-03-21 06:30:05,522][03784] Avg episode reward: [(0, '1.586')] [2024-03-21 06:30:10,521][03784] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1303281664. Throughput: 0: 44880.0. Samples: 1304707300. Policy #0 lag: (min: 2.0, avg: 35.0, max: 71.0) [2024-03-21 06:30:10,522][03784] Avg episode reward: [(0, '1.514')] [2024-03-21 06:30:12,170][04017] Updated weights for policy 0, policy_version 39775 (0.0016) [2024-03-21 06:30:15,521][03784] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1303543808. Throughput: 0: 45008.9. Samples: 1304847200. 
Policy #0 lag: (min: 2.0, avg: 35.0, max: 71.0) [2024-03-21 06:30:15,522][03784] Avg episode reward: [(0, '0.988')] [2024-03-21 06:30:16,748][04017] Updated weights for policy 0, policy_version 39785 (0.0011) [2024-03-21 06:30:20,521][03784] Fps is (10 sec: 65535.9, 60 sec: 49152.0, 300 sec: 45319.8). Total num frames: 1303937024. Throughput: 0: 44806.7. Samples: 1305103200. Policy #0 lag: (min: 2.0, avg: 35.0, max: 71.0) [2024-03-21 06:30:20,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 06:30:21,492][04017] Updated weights for policy 0, policy_version 39795 (0.0011) [2024-03-21 06:30:25,521][03784] Fps is (10 sec: 62259.2, 60 sec: 49152.0, 300 sec: 45542.0). Total num frames: 1304166400. Throughput: 0: 44773.3. Samples: 1305367800. Policy #0 lag: (min: 2.0, avg: 35.0, max: 71.0) [2024-03-21 06:30:25,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 06:30:27,957][04017] Updated weights for policy 0, policy_version 39805 (0.0011) [2024-03-21 06:30:30,521][03784] Fps is (10 sec: 52428.6, 60 sec: 49152.0, 300 sec: 46097.4). Total num frames: 1304461312. Throughput: 0: 45259.9. Samples: 1305499700. Policy #0 lag: (min: 2.0, avg: 35.0, max: 71.0) [2024-03-21 06:30:30,522][03784] Avg episode reward: [(0, '0.779')] [2024-03-21 06:30:35,521][03784] Fps is (10 sec: 42599.4, 60 sec: 46967.6, 300 sec: 45542.0). Total num frames: 1304592384. Throughput: 0: 45522.4. Samples: 1305793300. Policy #0 lag: (min: 0.0, avg: 42.2, max: 103.0) [2024-03-21 06:30:35,521][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 06:30:37,311][04017] Updated weights for policy 0, policy_version 39815 (0.0015) [2024-03-21 06:30:40,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1304723456. Throughput: 0: 45162.4. Samples: 1306056700. 
Policy #0 lag: (min: 0.0, avg: 42.2, max: 103.0) [2024-03-21 06:30:40,522][03784] Avg episode reward: [(0, '0.648')] [2024-03-21 06:30:45,521][03784] Fps is (10 sec: 32767.3, 60 sec: 40960.0, 300 sec: 44320.1). Total num frames: 1304920064. Throughput: 0: 44766.7. Samples: 1306179900. Policy #0 lag: (min: 0.0, avg: 42.2, max: 103.0) [2024-03-21 06:30:45,522][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 06:30:45,951][04017] Updated weights for policy 0, policy_version 39825 (0.0020) [2024-03-21 06:30:50,521][03784] Fps is (10 sec: 29491.3, 60 sec: 40960.0, 300 sec: 44097.9). Total num frames: 1305018368. Throughput: 0: 44511.2. Samples: 1306438700. Policy #0 lag: (min: 0.0, avg: 42.2, max: 103.0) [2024-03-21 06:30:50,522][03784] Avg episode reward: [(0, '0.927')] [2024-03-21 06:30:55,521][03784] Fps is (10 sec: 19660.9, 60 sec: 40960.1, 300 sec: 44098.0). Total num frames: 1305116672. Throughput: 0: 44471.1. Samples: 1306708500. Policy #0 lag: (min: 0.0, avg: 42.2, max: 103.0) [2024-03-21 06:30:55,522][03784] Avg episode reward: [(0, '1.045')] [2024-03-21 06:30:57,341][03995] Signal inference workers to stop experience collection... (26250 times) [2024-03-21 06:30:57,407][04017] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-03-21 06:30:57,588][03995] Signal inference workers to resume experience collection... (26250 times) [2024-03-21 06:30:57,588][04017] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-03-21 06:30:59,235][04017] Updated weights for policy 0, policy_version 39835 (0.0021) [2024-03-21 06:31:00,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 1305411584. Throughput: 0: 44226.7. Samples: 1306837400. Policy #0 lag: (min: 1.0, avg: 42.6, max: 99.0) [2024-03-21 06:31:00,522][03784] Avg episode reward: [(0, '0.823')] [2024-03-21 06:31:05,521][03784] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 1305608192. 
Throughput: 0: 44389.0. Samples: 1307100700. Policy #0 lag: (min: 1.0, avg: 42.6, max: 99.0) [2024-03-21 06:31:05,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 06:31:05,562][04017] Updated weights for policy 0, policy_version 39845 (0.0012) [2024-03-21 06:31:09,418][04017] Updated weights for policy 0, policy_version 39855 (0.0020) [2024-03-21 06:31:10,521][03784] Fps is (10 sec: 65535.7, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 1306066944. Throughput: 0: 43100.0. Samples: 1307307300. Policy #0 lag: (min: 1.0, avg: 42.6, max: 99.0) [2024-03-21 06:31:10,522][03784] Avg episode reward: [(0, '1.879')] [2024-03-21 06:31:10,884][03995] Saving new best policy, reward=1.879! [2024-03-21 06:31:15,521][03784] Fps is (10 sec: 62258.4, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1306230784. Throughput: 0: 43153.3. Samples: 1307441600. Policy #0 lag: (min: 1.0, avg: 42.6, max: 99.0) [2024-03-21 06:31:15,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 06:31:18,153][04017] Updated weights for policy 0, policy_version 39865 (0.0026) [2024-03-21 06:31:20,521][03784] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 45208.7). Total num frames: 1306492928. Throughput: 0: 42715.4. Samples: 1307715500. Policy #0 lag: (min: 1.0, avg: 42.6, max: 99.0) [2024-03-21 06:31:20,522][03784] Avg episode reward: [(0, '1.241')] [2024-03-21 06:31:21,600][04017] Updated weights for policy 0, policy_version 39875 (0.0019) [2024-03-21 06:31:25,521][03784] Fps is (10 sec: 62259.7, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 1306853376. Throughput: 0: 42317.8. Samples: 1307961000. Policy #0 lag: (min: 1.0, avg: 42.6, max: 99.0) [2024-03-21 06:31:25,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 06:31:29,806][04017] Updated weights for policy 0, policy_version 39885 (0.0011) [2024-03-21 06:31:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 41506.2, 300 sec: 45319.8). Total num frames: 1306951680. Throughput: 0: 42955.5. 
Samples: 1308112900. Policy #0 lag: (min: 0.0, avg: 52.9, max: 103.0) [2024-03-21 06:31:30,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 06:31:34,217][03995] Signal inference workers to stop experience collection... (26300 times) [2024-03-21 06:31:34,218][03995] Signal inference workers to resume experience collection... (26300 times) [2024-03-21 06:31:34,284][04017] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-03-21 06:31:34,284][04017] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-03-21 06:31:34,962][04017] Updated weights for policy 0, policy_version 39895 (0.0013) [2024-03-21 06:31:35,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45328.9, 300 sec: 45319.8). Total num frames: 1307312128. Throughput: 0: 43397.8. Samples: 1308391600. Policy #0 lag: (min: 0.0, avg: 52.9, max: 103.0) [2024-03-21 06:31:35,522][03784] Avg episode reward: [(0, '1.445')] [2024-03-21 06:31:40,521][03784] Fps is (10 sec: 49152.3, 60 sec: 45329.1, 300 sec: 44653.4). Total num frames: 1307443200. Throughput: 0: 43657.8. Samples: 1308673100. Policy #0 lag: (min: 0.0, avg: 52.9, max: 103.0) [2024-03-21 06:31:40,522][03784] Avg episode reward: [(0, '1.272')] [2024-03-21 06:31:45,521][03784] Fps is (10 sec: 22937.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1307541504. Throughput: 0: 43795.6. Samples: 1308808200. Policy #0 lag: (min: 0.0, avg: 52.9, max: 103.0) [2024-03-21 06:31:45,522][03784] Avg episode reward: [(0, '1.588')] [2024-03-21 06:31:47,664][04017] Updated weights for policy 0, policy_version 39905 (0.0011) [2024-03-21 06:31:50,521][03784] Fps is (10 sec: 29491.0, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1307738112. Throughput: 0: 44111.0. Samples: 1309085700. 
Policy #0 lag: (min: 0.0, avg: 52.9, max: 103.0) [2024-03-21 06:31:50,522][03784] Avg episode reward: [(0, '0.609')] [2024-03-21 06:31:55,246][04017] Updated weights for policy 0, policy_version 39915 (0.0011) [2024-03-21 06:31:55,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.4, 300 sec: 44764.4). Total num frames: 1307934720. Throughput: 0: 45824.5. Samples: 1309369400. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 06:31:55,522][03784] Avg episode reward: [(0, '0.609')] [2024-03-21 06:32:00,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1308164096. Throughput: 0: 45664.4. Samples: 1309496500. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 06:32:00,522][03784] Avg episode reward: [(0, '1.030')] [2024-03-21 06:32:00,836][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039923_1308196864.pth... [2024-03-21 06:32:00,975][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039601_1297645568.pth [2024-03-21 06:32:01,574][04017] Updated weights for policy 0, policy_version 39925 (0.0013) [2024-03-21 06:32:05,521][03784] Fps is (10 sec: 52428.2, 60 sec: 47513.5, 300 sec: 44764.4). Total num frames: 1308459008. Throughput: 0: 45255.4. Samples: 1309752000. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 06:32:05,522][03784] Avg episode reward: [(0, '1.559')] [2024-03-21 06:32:06,904][04017] Updated weights for policy 0, policy_version 39935 (0.0015) [2024-03-21 06:32:10,521][03784] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 44320.1). Total num frames: 1308590080. Throughput: 0: 46033.3. Samples: 1310032500. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 06:32:10,522][03784] Avg episode reward: [(0, '1.559')] [2024-03-21 06:32:15,521][03784] Fps is (10 sec: 36045.2, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 1308819456. Throughput: 0: 45624.5. Samples: 1310166000. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 06:32:15,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 06:32:17,236][04017] Updated weights for policy 0, policy_version 39945 (0.0016) [2024-03-21 06:32:20,521][03784] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1309114368. Throughput: 0: 45288.9. Samples: 1310429600. Policy #0 lag: (min: 5.0, avg: 47.8, max: 96.0) [2024-03-21 06:32:20,521][03784] Avg episode reward: [(0, '0.670')] [2024-03-21 06:32:21,747][04017] Updated weights for policy 0, policy_version 39955 (0.0018) [2024-03-21 06:32:25,521][03784] Fps is (10 sec: 58982.7, 60 sec: 42598.4, 300 sec: 44764.4). Total num frames: 1309409280. Throughput: 0: 44964.5. Samples: 1310696500. Policy #0 lag: (min: 5.0, avg: 47.8, max: 96.0) [2024-03-21 06:32:25,522][03784] Avg episode reward: [(0, '0.670')] [2024-03-21 06:32:27,397][04017] Updated weights for policy 0, policy_version 39965 (0.0021) [2024-03-21 06:32:27,487][03995] Signal inference workers to stop experience collection... (26350 times) [2024-03-21 06:32:27,573][04017] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-03-21 06:32:27,726][03995] Signal inference workers to resume experience collection... (26350 times) [2024-03-21 06:32:27,726][04017] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-03-21 06:32:30,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1309638656. Throughput: 0: 44715.5. Samples: 1310820400. Policy #0 lag: (min: 5.0, avg: 47.8, max: 96.0) [2024-03-21 06:32:30,522][03784] Avg episode reward: [(0, '1.098')] [2024-03-21 06:32:35,521][03784] Fps is (10 sec: 26214.1, 60 sec: 39321.6, 300 sec: 44542.3). Total num frames: 1309671424. Throughput: 0: 45468.9. Samples: 1311131800. 
Policy #0 lag: (min: 5.0, avg: 47.8, max: 96.0) [2024-03-21 06:32:35,522][03784] Avg episode reward: [(0, '1.059')] [2024-03-21 06:32:40,521][03784] Fps is (10 sec: 13107.1, 60 sec: 38775.4, 300 sec: 44320.1). Total num frames: 1309769728. Throughput: 0: 45719.9. Samples: 1311426800. Policy #0 lag: (min: 5.0, avg: 47.8, max: 96.0) [2024-03-21 06:32:40,522][03784] Avg episode reward: [(0, '1.405')] [2024-03-21 06:32:41,907][04017] Updated weights for policy 0, policy_version 39975 (0.0016) [2024-03-21 06:32:45,131][04017] Updated weights for policy 0, policy_version 39985 (0.0016) [2024-03-21 06:32:45,521][03784] Fps is (10 sec: 58982.7, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1310261248. Throughput: 0: 45111.2. Samples: 1311526500. Policy #0 lag: (min: 7.0, avg: 39.1, max: 117.0) [2024-03-21 06:32:45,522][03784] Avg episode reward: [(0, '1.264')] [2024-03-21 06:32:48,609][04017] Updated weights for policy 0, policy_version 39995 (0.0013) [2024-03-21 06:32:50,521][03784] Fps is (10 sec: 91751.5, 60 sec: 49152.1, 300 sec: 45542.0). Total num frames: 1310687232. Throughput: 0: 45037.9. Samples: 1311778700. Policy #0 lag: (min: 7.0, avg: 39.1, max: 117.0) [2024-03-21 06:32:50,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 06:32:55,521][03784] Fps is (10 sec: 58982.7, 60 sec: 48605.9, 300 sec: 44875.5). Total num frames: 1310851072. Throughput: 0: 44446.7. Samples: 1312032600. Policy #0 lag: (min: 7.0, avg: 39.1, max: 117.0) [2024-03-21 06:32:55,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 06:32:56,709][04017] Updated weights for policy 0, policy_version 40005 (0.0020) [2024-03-21 06:33:00,521][03784] Fps is (10 sec: 32768.3, 60 sec: 47513.7, 300 sec: 44320.1). Total num frames: 1311014912. Throughput: 0: 44295.7. Samples: 1312159300. 
Policy #0 lag: (min: 7.0, avg: 39.1, max: 117.0) [2024-03-21 06:33:00,521][03784] Avg episode reward: [(0, '1.091')] [2024-03-21 06:33:05,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 1311178752. Throughput: 0: 45048.8. Samples: 1312456800. Policy #0 lag: (min: 7.0, avg: 39.1, max: 117.0) [2024-03-21 06:33:05,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 06:33:05,701][04017] Updated weights for policy 0, policy_version 40015 (0.0011) [2024-03-21 06:33:10,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 1311309824. Throughput: 0: 45077.7. Samples: 1312725000. Policy #0 lag: (min: 7.0, avg: 39.1, max: 117.0) [2024-03-21 06:33:10,522][03784] Avg episode reward: [(0, '1.179')] [2024-03-21 06:33:14,826][04017] Updated weights for policy 0, policy_version 40025 (0.0017) [2024-03-21 06:33:15,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1311571968. Throughput: 0: 45544.4. Samples: 1312869900. Policy #0 lag: (min: 0.0, avg: 36.2, max: 75.0) [2024-03-21 06:33:15,522][03784] Avg episode reward: [(0, '0.649')] [2024-03-21 06:33:20,297][04017] Updated weights for policy 0, policy_version 40035 (0.0015) [2024-03-21 06:33:20,521][03784] Fps is (10 sec: 55705.1, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 1311866880. Throughput: 0: 44797.8. Samples: 1313147700. Policy #0 lag: (min: 0.0, avg: 36.2, max: 75.0) [2024-03-21 06:33:20,522][03784] Avg episode reward: [(0, '1.194')] [2024-03-21 06:33:23,351][03995] Signal inference workers to stop experience collection... (26400 times) [2024-03-21 06:33:23,351][03995] Signal inference workers to resume experience collection... 
(26400 times) [2024-03-21 06:33:23,406][04017] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-03-21 06:33:23,407][04017] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-03-21 06:33:25,521][03784] Fps is (10 sec: 55705.8, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1312129024. Throughput: 0: 44164.6. Samples: 1313414200. Policy #0 lag: (min: 0.0, avg: 36.2, max: 75.0) [2024-03-21 06:33:25,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 06:33:27,955][04017] Updated weights for policy 0, policy_version 40045 (0.0014) [2024-03-21 06:33:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 1312325632. Throughput: 0: 44826.6. Samples: 1313543700. Policy #0 lag: (min: 0.0, avg: 36.2, max: 75.0) [2024-03-21 06:33:30,522][03784] Avg episode reward: [(0, '0.706')] [2024-03-21 06:33:33,975][04017] Updated weights for policy 0, policy_version 40055 (0.0028) [2024-03-21 06:33:35,521][03784] Fps is (10 sec: 42598.0, 60 sec: 48059.7, 300 sec: 44653.4). Total num frames: 1312555008. Throughput: 0: 45215.5. Samples: 1313813400. Policy #0 lag: (min: 0.0, avg: 36.2, max: 75.0) [2024-03-21 06:33:35,522][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 06:33:40,521][03784] Fps is (10 sec: 36045.1, 60 sec: 48606.0, 300 sec: 44764.4). Total num frames: 1312686080. Throughput: 0: 45926.6. Samples: 1314099300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 86.0) [2024-03-21 06:33:40,522][03784] Avg episode reward: [(0, '1.498')] [2024-03-21 06:33:43,156][04017] Updated weights for policy 0, policy_version 40065 (0.0018) [2024-03-21 06:33:45,521][03784] Fps is (10 sec: 49152.5, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1313046528. Throughput: 0: 46059.9. Samples: 1314232000. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 86.0) [2024-03-21 06:33:45,522][03784] Avg episode reward: [(0, '1.080')] [2024-03-21 06:33:49,620][04017] Updated weights for policy 0, policy_version 40075 (0.0011) [2024-03-21 06:33:50,521][03784] Fps is (10 sec: 52428.5, 60 sec: 42052.2, 300 sec: 44764.4). Total num frames: 1313210368. Throughput: 0: 46020.0. Samples: 1314527700. Policy #0 lag: (min: 0.0, avg: 35.9, max: 86.0) [2024-03-21 06:33:50,522][03784] Avg episode reward: [(0, '1.080')] [2024-03-21 06:33:55,521][03784] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1313472512. Throughput: 0: 46026.7. Samples: 1314796200. Policy #0 lag: (min: 0.0, avg: 35.9, max: 86.0) [2024-03-21 06:33:55,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 06:33:56,990][04017] Updated weights for policy 0, policy_version 40085 (0.0014) [2024-03-21 06:34:00,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 1313669120. Throughput: 0: 45686.6. Samples: 1314925800. Policy #0 lag: (min: 0.0, avg: 35.9, max: 86.0) [2024-03-21 06:34:00,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 06:34:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040090_1313669120.pth... [2024-03-21 06:34:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039759_1302822912.pth [2024-03-21 06:34:03,501][04017] Updated weights for policy 0, policy_version 40095 (0.0012) [2024-03-21 06:34:05,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45875.3, 300 sec: 44764.4). Total num frames: 1313931264. Throughput: 0: 45093.4. Samples: 1315176900. Policy #0 lag: (min: 0.0, avg: 37.8, max: 94.0) [2024-03-21 06:34:05,522][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 06:34:10,058][04017] Updated weights for policy 0, policy_version 40105 (0.0016) [2024-03-21 06:34:10,521][03784] Fps is (10 sec: 52429.0, 60 sec: 48059.7, 300 sec: 44986.6). 
Total num frames: 1314193408. Throughput: 0: 44835.5. Samples: 1315431800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 94.0) [2024-03-21 06:34:10,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 06:34:15,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48059.8, 300 sec: 45653.1). Total num frames: 1314455552. Throughput: 0: 44689.0. Samples: 1315554700. Policy #0 lag: (min: 0.0, avg: 37.8, max: 94.0) [2024-03-21 06:34:15,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 06:34:17,313][04017] Updated weights for policy 0, policy_version 40115 (0.0012) [2024-03-21 06:34:20,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1314586624. Throughput: 0: 44837.7. Samples: 1315831100. Policy #0 lag: (min: 0.0, avg: 37.8, max: 94.0) [2024-03-21 06:34:20,522][03784] Avg episode reward: [(0, '1.339')] [2024-03-21 06:34:25,521][03784] Fps is (10 sec: 22937.5, 60 sec: 42598.4, 300 sec: 44653.3). Total num frames: 1314684928. Throughput: 0: 44011.1. Samples: 1316079800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 94.0) [2024-03-21 06:34:25,522][03784] Avg episode reward: [(0, '1.071')] [2024-03-21 06:34:25,753][03995] Signal inference workers to stop experience collection... (26450 times) [2024-03-21 06:34:25,817][04017] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-03-21 06:34:26,031][03995] Signal inference workers to resume experience collection... (26450 times) [2024-03-21 06:34:26,031][04017] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-03-21 06:34:26,712][04017] Updated weights for policy 0, policy_version 40125 (0.0020) [2024-03-21 06:34:30,521][03784] Fps is (10 sec: 26214.9, 60 sec: 42052.4, 300 sec: 44320.1). Total num frames: 1314848768. Throughput: 0: 43871.1. Samples: 1316206200. 
Policy #0 lag: (min: 0.0, avg: 25.0, max: 61.0) [2024-03-21 06:34:30,521][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 06:34:35,521][03784] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 44209.0). Total num frames: 1315110912. Throughput: 0: 43395.6. Samples: 1316480500. Policy #0 lag: (min: 0.0, avg: 25.0, max: 61.0) [2024-03-21 06:34:35,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 06:34:36,103][04017] Updated weights for policy 0, policy_version 40135 (0.0013) [2024-03-21 06:34:40,521][03784] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1315307520. Throughput: 0: 43428.8. Samples: 1316750500. Policy #0 lag: (min: 0.0, avg: 25.0, max: 61.0) [2024-03-21 06:34:40,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 06:34:43,375][04017] Updated weights for policy 0, policy_version 40145 (0.0019) [2024-03-21 06:34:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 44097.9). Total num frames: 1315569664. Throughput: 0: 43529.0. Samples: 1316884600. Policy #0 lag: (min: 0.0, avg: 25.0, max: 61.0) [2024-03-21 06:34:45,522][03784] Avg episode reward: [(0, '1.345')] [2024-03-21 06:34:50,521][03784] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 1315766272. Throughput: 0: 43962.2. Samples: 1317155200. Policy #0 lag: (min: 0.0, avg: 25.0, max: 61.0) [2024-03-21 06:34:50,522][03784] Avg episode reward: [(0, '1.020')] [2024-03-21 06:34:51,169][04017] Updated weights for policy 0, policy_version 40155 (0.0013) [2024-03-21 06:34:55,521][03784] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 44764.4). Total num frames: 1316028416. Throughput: 0: 44479.9. Samples: 1317433400. 
Policy #0 lag: (min: 0.0, avg: 25.0, max: 61.0) [2024-03-21 06:34:55,523][03784] Avg episode reward: [(0, '1.543')] [2024-03-21 06:34:56,504][04017] Updated weights for policy 0, policy_version 40165 (0.0011) [2024-03-21 06:35:00,521][03784] Fps is (10 sec: 58981.8, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1316356096. Throughput: 0: 44457.7. Samples: 1317555300. Policy #0 lag: (min: 3.0, avg: 42.7, max: 87.0) [2024-03-21 06:35:00,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 06:35:01,529][04017] Updated weights for policy 0, policy_version 40175 (0.0016) [2024-03-21 06:35:05,521][03784] Fps is (10 sec: 62260.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1316651008. Throughput: 0: 44040.2. Samples: 1317812900. Policy #0 lag: (min: 3.0, avg: 42.7, max: 87.0) [2024-03-21 06:35:05,522][03784] Avg episode reward: [(0, '1.440')] [2024-03-21 06:35:07,803][04017] Updated weights for policy 0, policy_version 40185 (0.0022) [2024-03-21 06:35:10,521][03784] Fps is (10 sec: 65536.6, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1317011456. Throughput: 0: 44384.5. Samples: 1318077100. Policy #0 lag: (min: 3.0, avg: 42.7, max: 87.0) [2024-03-21 06:35:10,522][03784] Avg episode reward: [(0, '1.202')] [2024-03-21 06:35:12,297][03995] Signal inference workers to stop experience collection... (26500 times) [2024-03-21 06:35:12,303][03995] Signal inference workers to resume experience collection... (26500 times) [2024-03-21 06:35:12,393][04017] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-03-21 06:35:12,394][04017] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-03-21 06:35:12,723][04017] Updated weights for policy 0, policy_version 40195 (0.0023) [2024-03-21 06:35:15,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1317208064. Throughput: 0: 44864.4. Samples: 1318225100. 
Policy #0 lag: (min: 3.0, avg: 42.7, max: 87.0) [2024-03-21 06:35:15,522][03784] Avg episode reward: [(0, '1.202')] [2024-03-21 06:35:20,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45329.2, 300 sec: 44542.3). Total num frames: 1317306368. Throughput: 0: 45482.2. Samples: 1318527200. Policy #0 lag: (min: 3.0, avg: 42.7, max: 87.0) [2024-03-21 06:35:20,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 06:35:22,676][04017] Updated weights for policy 0, policy_version 40205 (0.0015) [2024-03-21 06:35:25,521][03784] Fps is (10 sec: 36044.4, 60 sec: 48059.7, 300 sec: 44431.2). Total num frames: 1317568512. Throughput: 0: 45182.2. Samples: 1318783700. Policy #0 lag: (min: 1.0, avg: 41.9, max: 74.0) [2024-03-21 06:35:25,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 06:35:30,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45875.1, 300 sec: 44097.9). Total num frames: 1317601280. Throughput: 0: 45553.2. Samples: 1318934500. Policy #0 lag: (min: 1.0, avg: 41.9, max: 74.0) [2024-03-21 06:35:30,522][03784] Avg episode reward: [(0, '0.477')] [2024-03-21 06:35:34,695][04017] Updated weights for policy 0, policy_version 40215 (0.0016) [2024-03-21 06:35:35,521][03784] Fps is (10 sec: 26214.2, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 1317830656. Throughput: 0: 45748.7. Samples: 1319213900. Policy #0 lag: (min: 1.0, avg: 41.9, max: 74.0) [2024-03-21 06:35:35,522][03784] Avg episode reward: [(0, '1.545')] [2024-03-21 06:35:40,521][03784] Fps is (10 sec: 32768.4, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1317928960. Throughput: 0: 45498.0. Samples: 1319480800. Policy #0 lag: (min: 1.0, avg: 41.9, max: 74.0) [2024-03-21 06:35:40,522][03784] Avg episode reward: [(0, '1.096')] [2024-03-21 06:35:42,697][04017] Updated weights for policy 0, policy_version 40225 (0.0016) [2024-03-21 06:35:45,521][03784] Fps is (10 sec: 29491.7, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 1318125568. Throughput: 0: 45617.9. Samples: 1319608100. 
Policy #0 lag: (min: 1.0, avg: 41.9, max: 74.0) [2024-03-21 06:35:45,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 06:35:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 44986.6). Total num frames: 1318387712. Throughput: 0: 46055.5. Samples: 1319885400. Policy #0 lag: (min: 0.0, avg: 20.8, max: 58.0) [2024-03-21 06:35:50,522][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 06:35:50,914][04017] Updated weights for policy 0, policy_version 40235 (0.0012) [2024-03-21 06:35:55,521][03784] Fps is (10 sec: 55705.6, 60 sec: 44236.9, 300 sec: 44986.6). Total num frames: 1318682624. Throughput: 0: 45268.9. Samples: 1320114200. Policy #0 lag: (min: 0.0, avg: 20.8, max: 58.0) [2024-03-21 06:35:55,522][03784] Avg episode reward: [(0, '0.897')] [2024-03-21 06:35:56,156][04017] Updated weights for policy 0, policy_version 40245 (0.0013) [2024-03-21 06:36:00,521][03784] Fps is (10 sec: 62259.1, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1319010304. Throughput: 0: 44282.1. Samples: 1320217800. Policy #0 lag: (min: 0.0, avg: 20.8, max: 58.0) [2024-03-21 06:36:00,522][03784] Avg episode reward: [(0, '1.315')] [2024-03-21 06:36:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040253_1319010304.pth... [2024-03-21 06:36:00,662][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000039923_1308196864.pth [2024-03-21 06:36:02,385][04017] Updated weights for policy 0, policy_version 40255 (0.0017) [2024-03-21 06:36:05,521][03784] Fps is (10 sec: 62258.8, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1319305216. Throughput: 0: 43882.2. Samples: 1320501900. Policy #0 lag: (min: 0.0, avg: 20.8, max: 58.0) [2024-03-21 06:36:05,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 06:36:07,885][04017] Updated weights for policy 0, policy_version 40265 (0.0020) [2024-03-21 06:36:10,521][03784] Fps is (10 sec: 45875.4, 60 sec: 40960.0, 300 sec: 44875.5). 
Total num frames: 1319469056. Throughput: 0: 44184.5. Samples: 1320772000. Policy #0 lag: (min: 0.0, avg: 20.8, max: 58.0) [2024-03-21 06:36:10,522][03784] Avg episode reward: [(0, '1.080')] [2024-03-21 06:36:13,428][03995] Signal inference workers to stop experience collection... (26550 times) [2024-03-21 06:36:13,500][04017] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-03-21 06:36:13,515][03995] Signal inference workers to resume experience collection... (26550 times) [2024-03-21 06:36:13,539][04017] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-03-21 06:36:14,528][04017] Updated weights for policy 0, policy_version 40275 (0.0012) [2024-03-21 06:36:15,523][03784] Fps is (10 sec: 52420.4, 60 sec: 43689.5, 300 sec: 45208.5). Total num frames: 1319829504. Throughput: 0: 43969.6. Samples: 1320913200. Policy #0 lag: (min: 2.0, avg: 59.1, max: 112.0) [2024-03-21 06:36:15,523][03784] Avg episode reward: [(0, '0.773')] [2024-03-21 06:36:20,319][04017] Updated weights for policy 0, policy_version 40285 (0.0015) [2024-03-21 06:36:20,521][03784] Fps is (10 sec: 58982.3, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 1320058880. Throughput: 0: 43229.0. Samples: 1321159200. Policy #0 lag: (min: 2.0, avg: 59.1, max: 112.0) [2024-03-21 06:36:20,522][03784] Avg episode reward: [(0, '1.462')] [2024-03-21 06:36:25,521][03784] Fps is (10 sec: 36050.7, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1320189952. Throughput: 0: 43573.3. Samples: 1321441600. Policy #0 lag: (min: 2.0, avg: 59.1, max: 112.0) [2024-03-21 06:36:25,522][03784] Avg episode reward: [(0, '0.660')] [2024-03-21 06:36:30,521][03784] Fps is (10 sec: 22937.8, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1320288256. Throughput: 0: 43988.9. Samples: 1321587600. 
Policy #0 lag: (min: 2.0, avg: 59.1, max: 112.0) [2024-03-21 06:36:30,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 06:36:31,780][04017] Updated weights for policy 0, policy_version 40295 (0.0010) [2024-03-21 06:36:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.3, 300 sec: 44542.3). Total num frames: 1320583168. Throughput: 0: 44122.3. Samples: 1321870900. Policy #0 lag: (min: 2.0, avg: 59.1, max: 112.0) [2024-03-21 06:36:35,521][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 06:36:38,316][04017] Updated weights for policy 0, policy_version 40305 (0.0011) [2024-03-21 06:36:40,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48605.8, 300 sec: 45097.6). Total num frames: 1320845312. Throughput: 0: 44671.1. Samples: 1322124400. Policy #0 lag: (min: 1.0, avg: 45.3, max: 97.0) [2024-03-21 06:36:40,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 06:36:44,684][04017] Updated weights for policy 0, policy_version 40315 (0.0013) [2024-03-21 06:36:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 45097.7). Total num frames: 1321041920. Throughput: 0: 45511.2. Samples: 1322265800. Policy #0 lag: (min: 1.0, avg: 45.3, max: 97.0) [2024-03-21 06:36:45,522][03784] Avg episode reward: [(0, '1.279')] [2024-03-21 06:36:50,521][03784] Fps is (10 sec: 49151.7, 60 sec: 49152.0, 300 sec: 45430.9). Total num frames: 1321336832. Throughput: 0: 44993.3. Samples: 1322526600. Policy #0 lag: (min: 1.0, avg: 45.3, max: 97.0) [2024-03-21 06:36:50,522][03784] Avg episode reward: [(0, '1.455')] [2024-03-21 06:36:50,655][04017] Updated weights for policy 0, policy_version 40325 (0.0027) [2024-03-21 06:36:55,521][03784] Fps is (10 sec: 39321.7, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1321435136. Throughput: 0: 45060.1. Samples: 1322799700. 
Policy #0 lag: (min: 1.0, avg: 45.3, max: 97.0) [2024-03-21 06:36:55,521][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 06:37:00,184][04017] Updated weights for policy 0, policy_version 40335 (0.0016) [2024-03-21 06:37:00,521][03784] Fps is (10 sec: 36044.4, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1321697280. Throughput: 0: 44819.2. Samples: 1322930000. Policy #0 lag: (min: 1.0, avg: 45.3, max: 97.0) [2024-03-21 06:37:00,522][03784] Avg episode reward: [(0, '1.285')] [2024-03-21 06:37:05,521][03784] Fps is (10 sec: 49151.8, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1321926656. Throughput: 0: 45284.5. Samples: 1323197000. Policy #0 lag: (min: 1.0, avg: 46.3, max: 102.0) [2024-03-21 06:37:05,522][03784] Avg episode reward: [(0, '1.573')] [2024-03-21 06:37:09,611][03995] Signal inference workers to stop experience collection... (26600 times) [2024-03-21 06:37:09,666][04017] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-03-21 06:37:09,675][03995] Signal inference workers to resume experience collection... (26600 times) [2024-03-21 06:37:09,710][04017] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-03-21 06:37:09,712][04017] Updated weights for policy 0, policy_version 40345 (0.0021) [2024-03-21 06:37:10,521][03784] Fps is (10 sec: 36045.4, 60 sec: 43144.6, 300 sec: 44875.5). Total num frames: 1322057728. Throughput: 0: 45420.0. Samples: 1323485500. Policy #0 lag: (min: 1.0, avg: 46.3, max: 102.0) [2024-03-21 06:37:10,522][03784] Avg episode reward: [(0, '0.739')] [2024-03-21 06:37:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 41507.3, 300 sec: 44764.4). Total num frames: 1322319872. Throughput: 0: 44813.3. Samples: 1323604200. 
Policy #0 lag: (min: 1.0, avg: 46.3, max: 102.0) [2024-03-21 06:37:15,522][03784] Avg episode reward: [(0, '0.938')] [2024-03-21 06:37:19,467][04017] Updated weights for policy 0, policy_version 40355 (0.0011) [2024-03-21 06:37:20,521][03784] Fps is (10 sec: 32767.9, 60 sec: 38775.5, 300 sec: 43986.9). Total num frames: 1322385408. Throughput: 0: 44939.9. Samples: 1323893200. Policy #0 lag: (min: 1.0, avg: 46.3, max: 102.0) [2024-03-21 06:37:20,522][03784] Avg episode reward: [(0, '1.802')] [2024-03-21 06:37:24,573][04017] Updated weights for policy 0, policy_version 40365 (0.0014) [2024-03-21 06:37:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 1322745856. Throughput: 0: 44828.9. Samples: 1324141700. Policy #0 lag: (min: 1.0, avg: 46.3, max: 102.0) [2024-03-21 06:37:25,522][03784] Avg episode reward: [(0, '1.580')] [2024-03-21 06:37:30,521][03784] Fps is (10 sec: 55706.0, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1322942464. Throughput: 0: 44602.2. Samples: 1324272900. Policy #0 lag: (min: 1.0, avg: 46.3, max: 102.0) [2024-03-21 06:37:30,522][03784] Avg episode reward: [(0, '1.302')] [2024-03-21 06:37:32,366][04017] Updated weights for policy 0, policy_version 40375 (0.0011) [2024-03-21 06:37:35,521][03784] Fps is (10 sec: 36044.9, 60 sec: 42052.2, 300 sec: 45208.7). Total num frames: 1323106304. Throughput: 0: 44966.7. Samples: 1324550100. Policy #0 lag: (min: 0.0, avg: 34.7, max: 73.0) [2024-03-21 06:37:35,522][03784] Avg episode reward: [(0, '1.372')] [2024-03-21 06:37:39,080][04017] Updated weights for policy 0, policy_version 40385 (0.0014) [2024-03-21 06:37:40,521][03784] Fps is (10 sec: 49152.0, 60 sec: 43144.6, 300 sec: 44653.3). Total num frames: 1323433984. Throughput: 0: 44371.1. Samples: 1324796400. 
Policy #0 lag: (min: 0.0, avg: 34.7, max: 73.0) [2024-03-21 06:37:40,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 06:37:42,948][04017] Updated weights for policy 0, policy_version 40395 (0.0011) [2024-03-21 06:37:45,521][03784] Fps is (10 sec: 78643.2, 60 sec: 47513.6, 300 sec: 44764.4). Total num frames: 1323892736. Throughput: 0: 44104.6. Samples: 1324914700. Policy #0 lag: (min: 0.0, avg: 34.7, max: 73.0) [2024-03-21 06:37:45,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 06:37:46,175][04017] Updated weights for policy 0, policy_version 40405 (0.0020) [2024-03-21 06:37:50,521][03784] Fps is (10 sec: 72088.9, 60 sec: 46967.5, 300 sec: 45097.6). Total num frames: 1324154880. Throughput: 0: 43517.7. Samples: 1325155300. Policy #0 lag: (min: 0.0, avg: 34.7, max: 73.0) [2024-03-21 06:37:50,522][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 06:37:55,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45329.0, 300 sec: 44542.2). Total num frames: 1324154880. Throughput: 0: 43744.4. Samples: 1325454000. Policy #0 lag: (min: 0.0, avg: 34.7, max: 73.0) [2024-03-21 06:37:55,522][03784] Avg episode reward: [(0, '0.436')] [2024-03-21 06:37:55,916][03995] Signal inference workers to stop experience collection... (26650 times) [2024-03-21 06:37:55,956][04017] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-03-21 06:37:56,200][03995] Signal inference workers to resume experience collection... (26650 times) [2024-03-21 06:37:56,200][04017] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-03-21 06:37:57,512][04017] Updated weights for policy 0, policy_version 40415 (0.0013) [2024-03-21 06:38:00,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 1324417024. Throughput: 0: 43611.1. Samples: 1325566700. 
Policy #0 lag: (min: 3.0, avg: 43.2, max: 83.0) [2024-03-21 06:38:00,522][03784] Avg episode reward: [(0, '1.215')] [2024-03-21 06:38:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040418_1324417024.pth... [2024-03-21 06:38:00,662][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040090_1313669120.pth [2024-03-21 06:38:05,521][03784] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1324580864. Throughput: 0: 43984.5. Samples: 1325872500. Policy #0 lag: (min: 3.0, avg: 43.2, max: 83.0) [2024-03-21 06:38:05,522][03784] Avg episode reward: [(0, '0.739')] [2024-03-21 06:38:06,174][04017] Updated weights for policy 0, policy_version 40425 (0.0013) [2024-03-21 06:38:10,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1324679168. Throughput: 0: 44940.0. Samples: 1326164000. Policy #0 lag: (min: 3.0, avg: 43.2, max: 83.0) [2024-03-21 06:38:10,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 06:38:15,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43144.6, 300 sec: 44209.0). Total num frames: 1324908544. Throughput: 0: 44926.7. Samples: 1326294600. Policy #0 lag: (min: 3.0, avg: 43.2, max: 83.0) [2024-03-21 06:38:15,522][03784] Avg episode reward: [(0, '1.523')] [2024-03-21 06:38:16,143][04017] Updated weights for policy 0, policy_version 40435 (0.0015) [2024-03-21 06:38:20,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.3, 300 sec: 44209.0). Total num frames: 1325170688. Throughput: 0: 45062.2. Samples: 1326577900. Policy #0 lag: (min: 3.0, avg: 43.2, max: 83.0) [2024-03-21 06:38:20,522][03784] Avg episode reward: [(0, '1.430')] [2024-03-21 06:38:21,765][04017] Updated weights for policy 0, policy_version 40445 (0.0013) [2024-03-21 06:38:25,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1325400064. Throughput: 0: 45722.2. Samples: 1326853900. 
Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:38:25,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 06:38:28,313][04017] Updated weights for policy 0, policy_version 40455 (0.0013) [2024-03-21 06:38:30,521][03784] Fps is (10 sec: 58982.6, 60 sec: 46967.5, 300 sec: 44764.4). Total num frames: 1325760512. Throughput: 0: 45973.3. Samples: 1326983500. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:38:30,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 06:38:33,604][04017] Updated weights for policy 0, policy_version 40465 (0.0015) [2024-03-21 06:38:35,521][03784] Fps is (10 sec: 68811.9, 60 sec: 49698.0, 300 sec: 45430.9). Total num frames: 1326088192. Throughput: 0: 46231.1. Samples: 1327235700. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:38:35,522][03784] Avg episode reward: [(0, '1.311')] [2024-03-21 06:38:40,460][04017] Updated weights for policy 0, policy_version 40475 (0.0010) [2024-03-21 06:38:40,521][03784] Fps is (10 sec: 52429.4, 60 sec: 47513.7, 300 sec: 44875.5). Total num frames: 1326284800. Throughput: 0: 45917.9. Samples: 1327520300. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:38:40,521][03784] Avg episode reward: [(0, '0.761')] [2024-03-21 06:38:45,132][03995] Signal inference workers to stop experience collection... (26700 times) [2024-03-21 06:38:45,199][04017] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-03-21 06:38:45,376][03995] Signal inference workers to resume experience collection... (26700 times) [2024-03-21 06:38:45,376][04017] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-03-21 06:38:45,521][03784] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 44875.5). Total num frames: 1326448640. Throughput: 0: 46753.3. Samples: 1327670600. 
Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:38:45,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 06:38:47,884][04017] Updated weights for policy 0, policy_version 40485 (0.0017) [2024-03-21 06:38:50,521][03784] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 44986.6). Total num frames: 1326743552. Throughput: 0: 45775.5. Samples: 1327932400. Policy #0 lag: (min: 0.0, avg: 37.6, max: 75.0) [2024-03-21 06:38:50,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 06:38:54,615][04017] Updated weights for policy 0, policy_version 40495 (0.0011) [2024-03-21 06:38:55,521][03784] Fps is (10 sec: 52429.3, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1326972928. Throughput: 0: 45435.5. Samples: 1328208600. Policy #0 lag: (min: 0.0, avg: 47.6, max: 94.0) [2024-03-21 06:38:55,522][03784] Avg episode reward: [(0, '0.453')] [2024-03-21 06:39:00,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 1327104000. Throughput: 0: 45602.2. Samples: 1328346700. Policy #0 lag: (min: 0.0, avg: 47.6, max: 94.0) [2024-03-21 06:39:00,522][03784] Avg episode reward: [(0, '1.263')] [2024-03-21 06:39:04,802][04017] Updated weights for policy 0, policy_version 40505 (0.0016) [2024-03-21 06:39:05,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1327300608. Throughput: 0: 45257.8. Samples: 1328614500. Policy #0 lag: (min: 0.0, avg: 47.6, max: 94.0) [2024-03-21 06:39:05,522][03784] Avg episode reward: [(0, '1.406')] [2024-03-21 06:39:10,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.7, 300 sec: 44431.2). Total num frames: 1327562752. Throughput: 0: 44828.8. Samples: 1328871200. 
Policy #0 lag: (min: 0.0, avg: 47.6, max: 94.0) [2024-03-21 06:39:10,530][03784] Avg episode reward: [(0, '1.011')] [2024-03-21 06:39:10,794][04017] Updated weights for policy 0, policy_version 40515 (0.0012) [2024-03-21 06:39:15,521][03784] Fps is (10 sec: 55705.4, 60 sec: 49152.0, 300 sec: 44986.6). Total num frames: 1327857664. Throughput: 0: 44724.4. Samples: 1328996100. Policy #0 lag: (min: 0.0, avg: 47.6, max: 94.0) [2024-03-21 06:39:15,522][03784] Avg episode reward: [(0, '1.394')] [2024-03-21 06:39:19,681][04017] Updated weights for policy 0, policy_version 40525 (0.0011) [2024-03-21 06:39:20,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1327923200. Throughput: 0: 45349.0. Samples: 1329276400. Policy #0 lag: (min: 0.0, avg: 31.0, max: 69.0) [2024-03-21 06:39:20,530][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 06:39:25,500][04017] Updated weights for policy 0, policy_version 40535 (0.0021) [2024-03-21 06:39:25,521][03784] Fps is (10 sec: 39321.1, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 1328250880. Throughput: 0: 44977.5. Samples: 1329544300. Policy #0 lag: (min: 0.0, avg: 31.0, max: 69.0) [2024-03-21 06:39:25,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 06:39:30,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 44764.4). Total num frames: 1328316416. Throughput: 0: 45111.2. Samples: 1329700600. Policy #0 lag: (min: 0.0, avg: 31.0, max: 69.0) [2024-03-21 06:39:30,522][03784] Avg episode reward: [(0, '0.729')] [2024-03-21 06:39:35,521][03784] Fps is (10 sec: 22938.0, 60 sec: 39867.8, 300 sec: 44653.3). Total num frames: 1328480256. Throughput: 0: 45337.8. Samples: 1329972600. 
Policy #0 lag: (min: 0.0, avg: 31.0, max: 69.0) [2024-03-21 06:39:35,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 06:39:36,408][04017] Updated weights for policy 0, policy_version 40545 (0.0012) [2024-03-21 06:39:40,521][03784] Fps is (10 sec: 39321.3, 60 sec: 40413.7, 300 sec: 44542.3). Total num frames: 1328709632. Throughput: 0: 45513.3. Samples: 1330256700. Policy #0 lag: (min: 0.0, avg: 31.0, max: 69.0) [2024-03-21 06:39:40,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 06:39:44,924][04017] Updated weights for policy 0, policy_version 40555 (0.0015) [2024-03-21 06:39:45,521][03784] Fps is (10 sec: 45874.8, 60 sec: 41506.2, 300 sec: 44653.3). Total num frames: 1328939008. Throughput: 0: 45204.4. Samples: 1330380900. Policy #0 lag: (min: 1.0, avg: 34.8, max: 65.0) [2024-03-21 06:39:45,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 06:39:46,728][03995] Signal inference workers to stop experience collection... (26750 times) [2024-03-21 06:39:46,790][03995] Signal inference workers to resume experience collection... (26750 times) [2024-03-21 06:39:46,797][04017] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-03-21 06:39:46,858][04017] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-03-21 06:39:48,478][04017] Updated weights for policy 0, policy_version 40565 (0.0020) [2024-03-21 06:39:50,521][03784] Fps is (10 sec: 65536.1, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 1329364992. Throughput: 0: 44666.6. Samples: 1330624500. Policy #0 lag: (min: 1.0, avg: 34.8, max: 65.0) [2024-03-21 06:39:50,522][03784] Avg episode reward: [(0, '1.327')] [2024-03-21 06:39:54,078][04017] Updated weights for policy 0, policy_version 40575 (0.0011) [2024-03-21 06:39:55,521][03784] Fps is (10 sec: 72089.5, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 1329659904. Throughput: 0: 44955.5. Samples: 1330894200. 
Policy #0 lag: (min: 1.0, avg: 34.8, max: 65.0) [2024-03-21 06:39:55,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 06:39:59,376][04017] Updated weights for policy 0, policy_version 40585 (0.0011) [2024-03-21 06:40:00,521][03784] Fps is (10 sec: 62259.1, 60 sec: 48059.7, 300 sec: 45208.7). Total num frames: 1329987584. Throughput: 0: 45297.7. Samples: 1331034500. Policy #0 lag: (min: 1.0, avg: 34.8, max: 65.0) [2024-03-21 06:40:00,522][03784] Avg episode reward: [(0, '1.314')] [2024-03-21 06:40:00,800][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040589_1330020352.pth... [2024-03-21 06:40:00,944][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040253_1319010304.pth [2024-03-21 06:40:03,243][04017] Updated weights for policy 0, policy_version 40595 (0.0012) [2024-03-21 06:40:05,521][03784] Fps is (10 sec: 55706.3, 60 sec: 48605.9, 300 sec: 44764.4). Total num frames: 1330216960. Throughput: 0: 45126.7. Samples: 1331307100. Policy #0 lag: (min: 1.0, avg: 34.8, max: 65.0) [2024-03-21 06:40:05,522][03784] Avg episode reward: [(0, '1.314')] [2024-03-21 06:40:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 47513.6, 300 sec: 44764.4). Total num frames: 1330413568. Throughput: 0: 45488.9. Samples: 1331591300. Policy #0 lag: (min: 1.0, avg: 34.8, max: 65.0) [2024-03-21 06:40:10,522][03784] Avg episode reward: [(0, '1.449')] [2024-03-21 06:40:14,049][04017] Updated weights for policy 0, policy_version 40605 (0.0011) [2024-03-21 06:40:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1330577408. Throughput: 0: 45346.6. Samples: 1331741200. Policy #0 lag: (min: 0.0, avg: 42.6, max: 114.0) [2024-03-21 06:40:15,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 06:40:20,521][03784] Fps is (10 sec: 29491.2, 60 sec: 46421.3, 300 sec: 44542.3). Total num frames: 1330708480. Throughput: 0: 45693.2. Samples: 1332028800. 
Policy #0 lag: (min: 0.0, avg: 42.6, max: 114.0) [2024-03-21 06:40:20,522][03784] Avg episode reward: [(0, '1.275')] [2024-03-21 06:40:23,218][04017] Updated weights for policy 0, policy_version 40615 (0.0011) [2024-03-21 06:40:25,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.8, 300 sec: 44986.6). Total num frames: 1330872320. Throughput: 0: 45320.1. Samples: 1332296100. Policy #0 lag: (min: 0.0, avg: 42.6, max: 114.0) [2024-03-21 06:40:25,522][03784] Avg episode reward: [(0, '0.658')] [2024-03-21 06:40:30,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 1331134464. Throughput: 0: 45391.2. Samples: 1332423500. Policy #0 lag: (min: 0.0, avg: 42.6, max: 114.0) [2024-03-21 06:40:30,522][03784] Avg episode reward: [(0, '0.998')] [2024-03-21 06:40:32,595][04017] Updated weights for policy 0, policy_version 40625 (0.0011) [2024-03-21 06:40:35,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 1331200000. Throughput: 0: 45284.5. Samples: 1332662300. Policy #0 lag: (min: 0.0, avg: 42.6, max: 114.0) [2024-03-21 06:40:35,522][03784] Avg episode reward: [(0, '0.700')] [2024-03-21 06:40:39,415][03995] Signal inference workers to stop experience collection... (26800 times) [2024-03-21 06:40:39,416][03995] Signal inference workers to resume experience collection... (26800 times) [2024-03-21 06:40:39,469][04017] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-03-21 06:40:39,469][04017] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-03-21 06:40:40,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1331494912. Throughput: 0: 44602.2. Samples: 1332901300. 
Policy #0 lag: (min: 0.0, avg: 24.2, max: 65.0) [2024-03-21 06:40:40,522][03784] Avg episode reward: [(0, '1.394')] [2024-03-21 06:40:41,486][04017] Updated weights for policy 0, policy_version 40635 (0.0014) [2024-03-21 06:40:45,521][03784] Fps is (10 sec: 55705.8, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1331757056. Throughput: 0: 44288.9. Samples: 1333027500. Policy #0 lag: (min: 0.0, avg: 24.2, max: 65.0) [2024-03-21 06:40:45,522][03784] Avg episode reward: [(0, '0.825')] [2024-03-21 06:40:50,521][03784] Fps is (10 sec: 29491.5, 60 sec: 40413.9, 300 sec: 44431.2). Total num frames: 1331789824. Throughput: 0: 43906.7. Samples: 1333282900. Policy #0 lag: (min: 0.0, avg: 24.2, max: 65.0) [2024-03-21 06:40:50,522][03784] Avg episode reward: [(0, '1.490')] [2024-03-21 06:40:51,216][04017] Updated weights for policy 0, policy_version 40645 (0.0012) [2024-03-21 06:40:55,521][03784] Fps is (10 sec: 36045.3, 60 sec: 40960.1, 300 sec: 44431.2). Total num frames: 1332117504. Throughput: 0: 43589.1. Samples: 1333552800. Policy #0 lag: (min: 0.0, avg: 24.2, max: 65.0) [2024-03-21 06:40:55,521][03784] Avg episode reward: [(0, '1.448')] [2024-03-21 06:40:56,020][04017] Updated weights for policy 0, policy_version 40655 (0.0030) [2024-03-21 06:41:00,521][03784] Fps is (10 sec: 62259.1, 60 sec: 40413.9, 300 sec: 44431.2). Total num frames: 1332412416. Throughput: 0: 42744.5. Samples: 1333664700. Policy #0 lag: (min: 0.0, avg: 24.2, max: 65.0) [2024-03-21 06:41:00,522][03784] Avg episode reward: [(0, '0.538')] [2024-03-21 06:41:04,316][04017] Updated weights for policy 0, policy_version 40665 (0.0022) [2024-03-21 06:41:05,521][03784] Fps is (10 sec: 45874.4, 60 sec: 39321.5, 300 sec: 44431.2). Total num frames: 1332576256. Throughput: 0: 42657.8. Samples: 1333948400. 
Policy #0 lag: (min: 1.0, avg: 37.3, max: 68.0) [2024-03-21 06:41:05,522][03784] Avg episode reward: [(0, '0.877')] [2024-03-21 06:41:08,064][04017] Updated weights for policy 0, policy_version 40675 (0.0011) [2024-03-21 06:41:10,521][03784] Fps is (10 sec: 52428.9, 60 sec: 42052.3, 300 sec: 44431.4). Total num frames: 1332936704. Throughput: 0: 42393.4. Samples: 1334203800. Policy #0 lag: (min: 1.0, avg: 37.3, max: 68.0) [2024-03-21 06:41:10,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 06:41:14,484][04017] Updated weights for policy 0, policy_version 40685 (0.0014) [2024-03-21 06:41:15,521][03784] Fps is (10 sec: 62259.1, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1333198848. Throughput: 0: 42693.3. Samples: 1334344700. Policy #0 lag: (min: 1.0, avg: 37.3, max: 68.0) [2024-03-21 06:41:15,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 06:41:20,521][03784] Fps is (10 sec: 52427.8, 60 sec: 45875.1, 300 sec: 44986.5). Total num frames: 1333460992. Throughput: 0: 42971.0. Samples: 1334596000. Policy #0 lag: (min: 1.0, avg: 37.3, max: 68.0) [2024-03-21 06:41:20,522][03784] Avg episode reward: [(0, '1.371')] [2024-03-21 06:41:20,791][04017] Updated weights for policy 0, policy_version 40695 (0.0012) [2024-03-21 06:41:24,199][03995] Signal inference workers to stop experience collection... (26850 times) [2024-03-21 06:41:24,200][03995] Signal inference workers to resume experience collection... (26850 times) [2024-03-21 06:41:24,251][04017] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-03-21 06:41:24,251][04017] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-03-21 06:41:25,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1333657600. Throughput: 0: 43900.1. Samples: 1334876800. 
Policy #0 lag: (min: 1.0, avg: 37.3, max: 68.0) [2024-03-21 06:41:25,522][03784] Avg episode reward: [(0, '0.872')] [2024-03-21 06:41:28,091][04017] Updated weights for policy 0, policy_version 40705 (0.0011) [2024-03-21 06:41:30,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 1333952512. Throughput: 0: 44153.3. Samples: 1335014400. Policy #0 lag: (min: 1.0, avg: 37.3, max: 68.0) [2024-03-21 06:41:30,522][03784] Avg episode reward: [(0, '1.074')] [2024-03-21 06:41:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48059.8, 300 sec: 44875.5). Total num frames: 1334083584. Throughput: 0: 44606.7. Samples: 1335290200. Policy #0 lag: (min: 0.0, avg: 44.6, max: 89.0) [2024-03-21 06:41:35,522][03784] Avg episode reward: [(0, '1.395')] [2024-03-21 06:41:36,148][04017] Updated weights for policy 0, policy_version 40715 (0.0016) [2024-03-21 06:41:40,521][03784] Fps is (10 sec: 36045.2, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 1334312960. Throughput: 0: 44824.3. Samples: 1335569900. Policy #0 lag: (min: 0.0, avg: 44.6, max: 89.0) [2024-03-21 06:41:40,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 06:41:45,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1334444032. Throughput: 0: 45711.1. Samples: 1335721700. Policy #0 lag: (min: 0.0, avg: 44.6, max: 89.0) [2024-03-21 06:41:45,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 06:41:45,738][04017] Updated weights for policy 0, policy_version 40725 (0.0011) [2024-03-21 06:41:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48059.7, 300 sec: 44875.5). Total num frames: 1334673408. Throughput: 0: 45433.4. Samples: 1335992900. 
Policy #0 lag: (min: 0.0, avg: 44.6, max: 89.0) [2024-03-21 06:41:50,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 06:41:53,164][04017] Updated weights for policy 0, policy_version 40735 (0.0016) [2024-03-21 06:41:55,521][03784] Fps is (10 sec: 42597.7, 60 sec: 45875.0, 300 sec: 44653.3). Total num frames: 1334870016. Throughput: 0: 45913.1. Samples: 1336269900. Policy #0 lag: (min: 0.0, avg: 44.6, max: 89.0) [2024-03-21 06:41:55,522][03784] Avg episode reward: [(0, '1.365')] [2024-03-21 06:41:59,810][04017] Updated weights for policy 0, policy_version 40745 (0.0014) [2024-03-21 06:42:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1335164928. Throughput: 0: 45780.0. Samples: 1336404800. Policy #0 lag: (min: 0.0, avg: 35.9, max: 94.0) [2024-03-21 06:42:00,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 06:42:00,920][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040748_1335230464.pth... [2024-03-21 06:42:01,046][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040418_1324417024.pth [2024-03-21 06:42:05,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46421.3, 300 sec: 45097.6). Total num frames: 1335361536. Throughput: 0: 46133.4. Samples: 1336672000. Policy #0 lag: (min: 0.0, avg: 35.9, max: 94.0) [2024-03-21 06:42:05,522][03784] Avg episode reward: [(0, '0.999')] [2024-03-21 06:42:06,450][04017] Updated weights for policy 0, policy_version 40755 (0.0015) [2024-03-21 06:42:10,521][03784] Fps is (10 sec: 36044.4, 60 sec: 43144.4, 300 sec: 44764.4). Total num frames: 1335525376. Throughput: 0: 45459.9. Samples: 1336922500. Policy #0 lag: (min: 0.0, avg: 35.9, max: 94.0) [2024-03-21 06:42:10,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 06:42:15,521][03784] Fps is (10 sec: 26214.8, 60 sec: 40413.9, 300 sec: 44875.5). Total num frames: 1335623680. Throughput: 0: 45335.6. Samples: 1337054500. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 94.0) [2024-03-21 06:42:15,522][03784] Avg episode reward: [(0, '1.550')] [2024-03-21 06:42:16,974][04017] Updated weights for policy 0, policy_version 40765 (0.0028) [2024-03-21 06:42:20,372][03995] Signal inference workers to stop experience collection... (26900 times) [2024-03-21 06:42:20,442][03995] Signal inference workers to resume experience collection... (26900 times) [2024-03-21 06:42:20,472][04017] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-03-21 06:42:20,521][03784] Fps is (10 sec: 55705.1, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 1336082432. Throughput: 0: 45090.9. Samples: 1337319300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 94.0) [2024-03-21 06:42:20,522][03784] Avg episode reward: [(0, '1.682')] [2024-03-21 06:42:20,541][04017] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-03-21 06:42:20,802][04017] Updated weights for policy 0, policy_version 40775 (0.0013) [2024-03-21 06:42:25,521][03784] Fps is (10 sec: 62259.0, 60 sec: 43144.5, 300 sec: 45097.6). Total num frames: 1336246272. Throughput: 0: 45353.3. Samples: 1337610800. Policy #0 lag: (min: 0.0, avg: 48.6, max: 121.0) [2024-03-21 06:42:25,522][03784] Avg episode reward: [(0, '1.682')] [2024-03-21 06:42:29,437][04017] Updated weights for policy 0, policy_version 40785 (0.0012) [2024-03-21 06:42:30,521][03784] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 45542.0). Total num frames: 1336541184. Throughput: 0: 45288.8. Samples: 1337759700. Policy #0 lag: (min: 0.0, avg: 48.6, max: 121.0) [2024-03-21 06:42:30,522][03784] Avg episode reward: [(0, '1.737')] [2024-03-21 06:42:34,560][04017] Updated weights for policy 0, policy_version 40795 (0.0018) [2024-03-21 06:42:35,521][03784] Fps is (10 sec: 58981.6, 60 sec: 45875.1, 300 sec: 45430.9). Total num frames: 1336836096. Throughput: 0: 44926.5. Samples: 1338014600. 
Policy #0 lag: (min: 0.0, avg: 48.6, max: 121.0) [2024-03-21 06:42:35,522][03784] Avg episode reward: [(0, '1.629')] [2024-03-21 06:42:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1336999936. Throughput: 0: 45213.5. Samples: 1338304500. Policy #0 lag: (min: 0.0, avg: 48.6, max: 121.0) [2024-03-21 06:42:40,522][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 06:42:41,593][04017] Updated weights for policy 0, policy_version 40805 (0.0016) [2024-03-21 06:42:45,521][03784] Fps is (10 sec: 55706.3, 60 sec: 49152.0, 300 sec: 44875.5). Total num frames: 1337393152. Throughput: 0: 44944.4. Samples: 1338427300. Policy #0 lag: (min: 0.0, avg: 48.6, max: 121.0) [2024-03-21 06:42:45,522][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 06:42:45,671][04017] Updated weights for policy 0, policy_version 40815 (0.0013) [2024-03-21 06:42:50,521][03784] Fps is (10 sec: 62258.7, 60 sec: 49151.9, 300 sec: 45653.0). Total num frames: 1337622528. Throughput: 0: 45120.0. Samples: 1338702400. Policy #0 lag: (min: 0.0, avg: 48.6, max: 121.0) [2024-03-21 06:42:50,523][03784] Avg episode reward: [(0, '1.470')] [2024-03-21 06:42:51,813][04017] Updated weights for policy 0, policy_version 40825 (0.0017) [2024-03-21 06:42:55,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49698.3, 300 sec: 45542.0). Total num frames: 1337851904. Throughput: 0: 45644.6. Samples: 1338976500. Policy #0 lag: (min: 3.0, avg: 39.4, max: 71.0) [2024-03-21 06:42:55,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 06:43:00,521][03784] Fps is (10 sec: 32768.3, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1337950208. Throughput: 0: 46135.5. Samples: 1339130600. Policy #0 lag: (min: 3.0, avg: 39.4, max: 71.0) [2024-03-21 06:43:00,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 06:43:05,521][03784] Fps is (10 sec: 16383.8, 60 sec: 44236.9, 300 sec: 45208.7). Total num frames: 1338015744. Throughput: 0: 46711.3. Samples: 1339421300. 
Policy #0 lag: (min: 3.0, avg: 39.4, max: 71.0) [2024-03-21 06:43:05,522][03784] Avg episode reward: [(0, '0.805')] [2024-03-21 06:43:05,880][04017] Updated weights for policy 0, policy_version 40835 (0.0012) [2024-03-21 06:43:10,521][03784] Fps is (10 sec: 19660.9, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1338146816. Throughput: 0: 46626.7. Samples: 1339709000. Policy #0 lag: (min: 3.0, avg: 39.4, max: 71.0) [2024-03-21 06:43:10,522][03784] Avg episode reward: [(0, '1.463')] [2024-03-21 06:43:15,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1338343424. Throughput: 0: 46582.2. Samples: 1339855900. Policy #0 lag: (min: 3.0, avg: 39.4, max: 71.0) [2024-03-21 06:43:15,522][03784] Avg episode reward: [(0, '1.538')] [2024-03-21 06:43:15,634][03995] Signal inference workers to stop experience collection... (26950 times) [2024-03-21 06:43:15,703][03995] Signal inference workers to resume experience collection... (26950 times) [2024-03-21 06:43:15,728][04017] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-03-21 06:43:15,783][04017] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-03-21 06:43:16,119][04017] Updated weights for policy 0, policy_version 40845 (0.0012) [2024-03-21 06:43:19,588][04017] Updated weights for policy 0, policy_version 40855 (0.0014) [2024-03-21 06:43:20,521][03784] Fps is (10 sec: 65536.0, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 1338802176. Throughput: 0: 46273.5. Samples: 1340096900. Policy #0 lag: (min: 7.0, avg: 40.8, max: 94.0) [2024-03-21 06:43:20,522][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 06:43:25,521][03784] Fps is (10 sec: 68812.1, 60 sec: 46421.2, 300 sec: 44986.6). Total num frames: 1339031552. Throughput: 0: 45968.8. Samples: 1340373100. 
Policy #0 lag: (min: 7.0, avg: 40.8, max: 94.0) [2024-03-21 06:43:25,522][03784] Avg episode reward: [(0, '1.535')] [2024-03-21 06:43:25,744][04017] Updated weights for policy 0, policy_version 40865 (0.0022) [2024-03-21 06:43:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.1, 300 sec: 44653.4). Total num frames: 1339260928. Throughput: 0: 46326.7. Samples: 1340512000. Policy #0 lag: (min: 7.0, avg: 40.8, max: 94.0) [2024-03-21 06:43:30,522][03784] Avg episode reward: [(0, '1.671')] [2024-03-21 06:43:33,616][04017] Updated weights for policy 0, policy_version 40875 (0.0010) [2024-03-21 06:43:35,521][03784] Fps is (10 sec: 45875.6, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 1339490304. Throughput: 0: 46051.2. Samples: 1340774700. Policy #0 lag: (min: 7.0, avg: 40.8, max: 94.0) [2024-03-21 06:43:35,522][03784] Avg episode reward: [(0, '1.039')] [2024-03-21 06:43:39,418][04017] Updated weights for policy 0, policy_version 40885 (0.0009) [2024-03-21 06:43:40,521][03784] Fps is (10 sec: 52429.1, 60 sec: 46421.4, 300 sec: 45208.8). Total num frames: 1339785216. Throughput: 0: 46075.6. Samples: 1341049900. Policy #0 lag: (min: 7.0, avg: 40.8, max: 94.0) [2024-03-21 06:43:40,522][03784] Avg episode reward: [(0, '1.053')] [2024-03-21 06:43:43,809][04017] Updated weights for policy 0, policy_version 40895 (0.0016) [2024-03-21 06:43:45,521][03784] Fps is (10 sec: 65536.2, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1340145664. Throughput: 0: 45522.2. Samples: 1341179100. Policy #0 lag: (min: 0.0, avg: 41.6, max: 67.0) [2024-03-21 06:43:45,522][03784] Avg episode reward: [(0, '0.761')] [2024-03-21 06:43:48,747][04017] Updated weights for policy 0, policy_version 40905 (0.0018) [2024-03-21 06:43:50,521][03784] Fps is (10 sec: 62258.9, 60 sec: 46421.4, 300 sec: 45542.0). Total num frames: 1340407808. Throughput: 0: 44531.2. Samples: 1341425200. 
Policy #0 lag: (min: 0.0, avg: 41.6, max: 67.0) [2024-03-21 06:43:50,522][03784] Avg episode reward: [(0, '1.605')] [2024-03-21 06:43:55,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1340538880. Throughput: 0: 44324.5. Samples: 1341703600. Policy #0 lag: (min: 0.0, avg: 41.6, max: 67.0) [2024-03-21 06:43:55,522][03784] Avg episode reward: [(0, '0.690')] [2024-03-21 06:44:00,521][03784] Fps is (10 sec: 19660.8, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 1340604416. Throughput: 0: 44202.2. Samples: 1341845000. Policy #0 lag: (min: 0.0, avg: 41.6, max: 67.0) [2024-03-21 06:44:00,522][03784] Avg episode reward: [(0, '1.426')] [2024-03-21 06:44:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040912_1340604416.pth... [2024-03-21 06:44:00,664][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040589_1330020352.pth [2024-03-21 06:44:03,882][04017] Updated weights for policy 0, policy_version 40915 (0.0012) [2024-03-21 06:44:05,521][03784] Fps is (10 sec: 16383.8, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 1340702720. Throughput: 0: 44817.7. Samples: 1342113700. Policy #0 lag: (min: 0.0, avg: 41.6, max: 67.0) [2024-03-21 06:44:05,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 06:44:05,906][03995] Signal inference workers to stop experience collection... (27000 times) [2024-03-21 06:44:05,961][04017] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-03-21 06:44:06,201][03995] Signal inference workers to resume experience collection... (27000 times) [2024-03-21 06:44:06,201][04017] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-03-21 06:44:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 44431.2). Total num frames: 1340964864. Throughput: 0: 44166.8. Samples: 1342360600. 
Policy #0 lag: (min: 0.0, avg: 32.1, max: 84.0) [2024-03-21 06:44:10,531][03784] Avg episode reward: [(0, '1.674')] [2024-03-21 06:44:11,220][04017] Updated weights for policy 0, policy_version 40925 (0.0011) [2024-03-21 06:44:15,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.4, 300 sec: 44875.5). Total num frames: 1341161472. Throughput: 0: 44075.5. Samples: 1342495400. Policy #0 lag: (min: 0.0, avg: 32.1, max: 84.0) [2024-03-21 06:44:15,522][03784] Avg episode reward: [(0, '1.348')] [2024-03-21 06:44:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 42052.2, 300 sec: 44320.1). Total num frames: 1341325312. Throughput: 0: 44328.9. Samples: 1342769500. Policy #0 lag: (min: 0.0, avg: 32.1, max: 84.0) [2024-03-21 06:44:20,522][03784] Avg episode reward: [(0, '1.728')] [2024-03-21 06:44:21,978][04017] Updated weights for policy 0, policy_version 40935 (0.0020) [2024-03-21 06:44:25,521][03784] Fps is (10 sec: 29491.5, 60 sec: 40414.0, 300 sec: 44542.3). Total num frames: 1341456384. Throughput: 0: 44151.1. Samples: 1343036700. Policy #0 lag: (min: 0.0, avg: 32.1, max: 84.0) [2024-03-21 06:44:25,522][03784] Avg episode reward: [(0, '1.424')] [2024-03-21 06:44:28,774][04017] Updated weights for policy 0, policy_version 40945 (0.0014) [2024-03-21 06:44:30,521][03784] Fps is (10 sec: 52428.4, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1341849600. Throughput: 0: 44084.4. Samples: 1343162900. Policy #0 lag: (min: 0.0, avg: 32.1, max: 84.0) [2024-03-21 06:44:30,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 06:44:33,190][04017] Updated weights for policy 0, policy_version 40955 (0.0018) [2024-03-21 06:44:35,521][03784] Fps is (10 sec: 68812.7, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1342144512. Throughput: 0: 44271.1. Samples: 1343417400. 
Policy #0 lag: (min: 0.0, avg: 32.1, max: 84.0) [2024-03-21 06:44:35,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 06:44:38,795][04017] Updated weights for policy 0, policy_version 40965 (0.0012) [2024-03-21 06:44:40,521][03784] Fps is (10 sec: 58983.0, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1342439424. Throughput: 0: 44195.5. Samples: 1343692400. Policy #0 lag: (min: 0.0, avg: 35.0, max: 86.0) [2024-03-21 06:44:40,522][03784] Avg episode reward: [(0, '0.628')] [2024-03-21 06:44:44,463][04017] Updated weights for policy 0, policy_version 40975 (0.0018) [2024-03-21 06:44:45,521][03784] Fps is (10 sec: 52428.9, 60 sec: 42052.3, 300 sec: 45097.7). Total num frames: 1342668800. Throughput: 0: 44260.0. Samples: 1343836700. Policy #0 lag: (min: 0.0, avg: 35.0, max: 86.0) [2024-03-21 06:44:45,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 06:44:50,499][04017] Updated weights for policy 0, policy_version 40985 (0.0011) [2024-03-21 06:44:50,521][03784] Fps is (10 sec: 55705.3, 60 sec: 43144.5, 300 sec: 45208.7). Total num frames: 1342996480. Throughput: 0: 43840.0. Samples: 1344086500. Policy #0 lag: (min: 0.0, avg: 35.0, max: 86.0) [2024-03-21 06:44:50,522][03784] Avg episode reward: [(0, '0.554')] [2024-03-21 06:44:55,521][03784] Fps is (10 sec: 58982.4, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 1343258624. Throughput: 0: 44720.0. Samples: 1344373000. Policy #0 lag: (min: 0.0, avg: 35.0, max: 86.0) [2024-03-21 06:44:55,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 06:44:58,561][04017] Updated weights for policy 0, policy_version 40995 (0.0023) [2024-03-21 06:45:00,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 1343356928. Throughput: 0: 45231.2. Samples: 1344530800. 
Policy #0 lag: (min: 0.0, avg: 35.0, max: 86.0) [2024-03-21 06:45:00,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 06:45:01,231][03995] Signal inference workers to stop experience collection... (27050 times) [2024-03-21 06:45:01,232][03995] Signal inference workers to resume experience collection... (27050 times) [2024-03-21 06:45:01,295][04017] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-03-21 06:45:01,295][04017] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-03-21 06:45:05,521][03784] Fps is (10 sec: 26214.3, 60 sec: 46967.5, 300 sec: 44431.2). Total num frames: 1343520768. Throughput: 0: 45611.1. Samples: 1344822000. Policy #0 lag: (min: 0.0, avg: 36.2, max: 77.0) [2024-03-21 06:45:05,522][03784] Avg episode reward: [(0, '0.951')] [2024-03-21 06:45:07,866][04017] Updated weights for policy 0, policy_version 41005 (0.0013) [2024-03-21 06:45:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 1343717376. Throughput: 0: 45831.1. Samples: 1345099100. Policy #0 lag: (min: 0.0, avg: 36.2, max: 77.0) [2024-03-21 06:45:10,522][03784] Avg episode reward: [(0, '1.649')] [2024-03-21 06:45:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.2, 300 sec: 44653.4). Total num frames: 1343881216. Throughput: 0: 46169.0. Samples: 1345240500. Policy #0 lag: (min: 0.0, avg: 36.2, max: 77.0) [2024-03-21 06:45:15,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-21 06:45:17,372][04017] Updated weights for policy 0, policy_version 41015 (0.0012) [2024-03-21 06:45:20,521][03784] Fps is (10 sec: 45875.5, 60 sec: 47513.6, 300 sec: 45097.7). Total num frames: 1344176128. Throughput: 0: 45937.8. Samples: 1345484600. 
Policy #0 lag: (min: 0.0, avg: 36.2, max: 77.0) [2024-03-21 06:45:20,522][03784] Avg episode reward: [(0, '0.759')] [2024-03-21 06:45:22,683][04017] Updated weights for policy 0, policy_version 41025 (0.0011) [2024-03-21 06:45:25,521][03784] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 44986.6). Total num frames: 1344405504. Throughput: 0: 45584.5. Samples: 1345743700. Policy #0 lag: (min: 0.0, avg: 36.2, max: 77.0) [2024-03-21 06:45:25,522][03784] Avg episode reward: [(0, '1.372')] [2024-03-21 06:45:27,724][04017] Updated weights for policy 0, policy_version 41035 (0.0021) [2024-03-21 06:45:30,521][03784] Fps is (10 sec: 55705.0, 60 sec: 48059.7, 300 sec: 45875.2). Total num frames: 1344733184. Throughput: 0: 44815.4. Samples: 1345853400. Policy #0 lag: (min: 0.0, avg: 36.2, max: 77.0) [2024-03-21 06:45:30,522][03784] Avg episode reward: [(0, '0.743')] [2024-03-21 06:45:33,898][04017] Updated weights for policy 0, policy_version 41045 (0.0012) [2024-03-21 06:45:35,521][03784] Fps is (10 sec: 58982.0, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 1344995328. Throughput: 0: 45164.5. Samples: 1346118900. Policy #0 lag: (min: 1.0, avg: 38.2, max: 74.0) [2024-03-21 06:45:35,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 06:45:40,521][03784] Fps is (10 sec: 39322.0, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1345126400. Throughput: 0: 44604.4. Samples: 1346380200. Policy #0 lag: (min: 1.0, avg: 38.2, max: 74.0) [2024-03-21 06:45:40,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 06:45:45,521][03784] Fps is (10 sec: 16384.1, 60 sec: 41506.1, 300 sec: 45319.8). Total num frames: 1345159168. Throughput: 0: 44377.8. Samples: 1346527800. Policy #0 lag: (min: 1.0, avg: 38.2, max: 74.0) [2024-03-21 06:45:45,522][03784] Avg episode reward: [(0, '1.084')] [2024-03-21 06:45:50,521][03784] Fps is (10 sec: 13107.1, 60 sec: 37683.2, 300 sec: 44542.2). Total num frames: 1345257472. Throughput: 0: 44506.6. Samples: 1346824800. 
Policy #0 lag: (min: 1.0, avg: 38.2, max: 74.0) [2024-03-21 06:45:50,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-21 06:45:50,792][04017] Updated weights for policy 0, policy_version 41055 (0.0024) [2024-03-21 06:45:54,238][04017] Updated weights for policy 0, policy_version 41065 (0.0015) [2024-03-21 06:45:55,521][03784] Fps is (10 sec: 52428.4, 60 sec: 40413.8, 300 sec: 44986.6). Total num frames: 1345683456. Throughput: 0: 44277.8. Samples: 1347091600. Policy #0 lag: (min: 1.0, avg: 38.2, max: 74.0) [2024-03-21 06:45:55,522][03784] Avg episode reward: [(0, '0.911')] [2024-03-21 06:46:00,521][03784] Fps is (10 sec: 62259.4, 60 sec: 42052.3, 300 sec: 45097.7). Total num frames: 1345880064. Throughput: 0: 44295.5. Samples: 1347233800. Policy #0 lag: (min: 0.0, avg: 36.6, max: 111.0) [2024-03-21 06:46:00,522][03784] Avg episode reward: [(0, '0.592')] [2024-03-21 06:46:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041073_1345880064.pth... [2024-03-21 06:46:00,657][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040748_1335230464.pth [2024-03-21 06:46:02,246][04017] Updated weights for policy 0, policy_version 41075 (0.0011) [2024-03-21 06:46:03,077][03995] Signal inference workers to stop experience collection... (27100 times) [2024-03-21 06:46:03,149][03995] Signal inference workers to resume experience collection... (27100 times) [2024-03-21 06:46:03,154][04017] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-03-21 06:46:03,212][04017] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-03-21 06:46:05,521][03784] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 44653.3). Total num frames: 1346109440. Throughput: 0: 45175.6. Samples: 1347517500. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 111.0) [2024-03-21 06:46:05,522][03784] Avg episode reward: [(0, '1.402')] [2024-03-21 06:46:09,339][04017] Updated weights for policy 0, policy_version 41085 (0.0020) [2024-03-21 06:46:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1346371584. Throughput: 0: 45386.6. Samples: 1347786100. Policy #0 lag: (min: 0.0, avg: 36.6, max: 111.0) [2024-03-21 06:46:10,522][03784] Avg episode reward: [(0, '1.132')] [2024-03-21 06:46:12,770][04017] Updated weights for policy 0, policy_version 41095 (0.0027) [2024-03-21 06:46:15,521][03784] Fps is (10 sec: 72088.7, 60 sec: 49151.9, 300 sec: 45319.8). Total num frames: 1346830336. Throughput: 0: 45480.0. Samples: 1347900000. Policy #0 lag: (min: 0.0, avg: 36.6, max: 111.0) [2024-03-21 06:46:15,522][03784] Avg episode reward: [(0, '1.065')] [2024-03-21 06:46:18,069][04017] Updated weights for policy 0, policy_version 41105 (0.0015) [2024-03-21 06:46:20,521][03784] Fps is (10 sec: 75366.4, 60 sec: 49152.0, 300 sec: 45653.0). Total num frames: 1347125248. Throughput: 0: 45202.2. Samples: 1348153000. Policy #0 lag: (min: 0.0, avg: 36.6, max: 111.0) [2024-03-21 06:46:20,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 06:46:22,957][04017] Updated weights for policy 0, policy_version 41115 (0.0016) [2024-03-21 06:46:25,521][03784] Fps is (10 sec: 58983.2, 60 sec: 50244.3, 300 sec: 45653.1). Total num frames: 1347420160. Throughput: 0: 45671.1. Samples: 1348435400. Policy #0 lag: (min: 0.0, avg: 60.2, max: 121.0) [2024-03-21 06:46:25,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 06:46:28,456][04017] Updated weights for policy 0, policy_version 41125 (0.0016) [2024-03-21 06:46:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.7, 300 sec: 45764.1). Total num frames: 1347584000. Throughput: 0: 45506.6. Samples: 1348575600. 
Policy #0 lag: (min: 0.0, avg: 60.2, max: 121.0) [2024-03-21 06:46:30,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 06:46:35,521][03784] Fps is (10 sec: 19660.6, 60 sec: 43690.6, 300 sec: 45097.6). Total num frames: 1347616768. Throughput: 0: 45437.7. Samples: 1348869500. Policy #0 lag: (min: 0.0, avg: 60.2, max: 121.0) [2024-03-21 06:46:35,522][03784] Avg episode reward: [(0, '0.828')] [2024-03-21 06:46:40,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1347846144. Throughput: 0: 45317.8. Samples: 1349130900. Policy #0 lag: (min: 0.0, avg: 60.2, max: 121.0) [2024-03-21 06:46:40,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 06:46:42,628][04017] Updated weights for policy 0, policy_version 41135 (0.0011) [2024-03-21 06:46:45,521][03784] Fps is (10 sec: 39321.7, 60 sec: 47513.5, 300 sec: 45208.7). Total num frames: 1348009984. Throughput: 0: 45051.1. Samples: 1349261100. Policy #0 lag: (min: 0.0, avg: 60.2, max: 121.0) [2024-03-21 06:46:45,522][03784] Avg episode reward: [(0, '0.689')] [2024-03-21 06:46:50,521][03784] Fps is (10 sec: 22937.6, 60 sec: 46967.5, 300 sec: 44764.4). Total num frames: 1348075520. Throughput: 0: 44911.1. Samples: 1349538500. Policy #0 lag: (min: 0.0, avg: 60.2, max: 121.0) [2024-03-21 06:46:50,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 06:46:53,927][04017] Updated weights for policy 0, policy_version 41145 (0.0016) [2024-03-21 06:46:55,521][03784] Fps is (10 sec: 26213.8, 60 sec: 43144.4, 300 sec: 44431.1). Total num frames: 1348272128. Throughput: 0: 44270.9. Samples: 1349778300. Policy #0 lag: (min: 2.0, avg: 33.8, max: 85.0) [2024-03-21 06:46:55,523][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 06:46:58,066][03995] Signal inference workers to stop experience collection... (27150 times) [2024-03-21 06:46:58,137][03995] Signal inference workers to resume experience collection... 
(27150 times) [2024-03-21 06:46:58,148][04017] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-03-21 06:46:58,206][04017] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-03-21 06:47:00,264][04017] Updated weights for policy 0, policy_version 41155 (0.0028) [2024-03-21 06:47:00,521][03784] Fps is (10 sec: 49152.2, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1348567040. Throughput: 0: 44680.1. Samples: 1349910600. Policy #0 lag: (min: 2.0, avg: 33.8, max: 85.0) [2024-03-21 06:47:00,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 06:47:05,521][03784] Fps is (10 sec: 58984.3, 60 sec: 45875.2, 300 sec: 45208.8). Total num frames: 1348861952. Throughput: 0: 44546.7. Samples: 1350157600. Policy #0 lag: (min: 2.0, avg: 33.8, max: 85.0) [2024-03-21 06:47:05,522][03784] Avg episode reward: [(0, '1.623')] [2024-03-21 06:47:06,320][04017] Updated weights for policy 0, policy_version 41165 (0.0014) [2024-03-21 06:47:10,521][03784] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 1349091328. Throughput: 0: 44175.5. Samples: 1350423300. Policy #0 lag: (min: 2.0, avg: 33.8, max: 85.0) [2024-03-21 06:47:10,522][03784] Avg episode reward: [(0, '0.877')] [2024-03-21 06:47:15,051][04017] Updated weights for policy 0, policy_version 41175 (0.0015) [2024-03-21 06:47:15,521][03784] Fps is (10 sec: 36044.5, 60 sec: 39867.8, 300 sec: 44542.3). Total num frames: 1349222400. Throughput: 0: 44295.5. Samples: 1350568900. Policy #0 lag: (min: 2.0, avg: 33.8, max: 85.0) [2024-03-21 06:47:15,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 06:47:19,786][04017] Updated weights for policy 0, policy_version 41185 (0.0026) [2024-03-21 06:47:20,521][03784] Fps is (10 sec: 52428.6, 60 sec: 41506.1, 300 sec: 45319.8). Total num frames: 1349615616. Throughput: 0: 43513.4. Samples: 1350827600. 
Policy #0 lag: (min: 2.0, avg: 33.1, max: 65.0) [2024-03-21 06:47:20,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 06:47:25,409][04017] Updated weights for policy 0, policy_version 41195 (0.0011) [2024-03-21 06:47:25,521][03784] Fps is (10 sec: 65536.7, 60 sec: 40960.0, 300 sec: 45208.7). Total num frames: 1349877760. Throughput: 0: 43635.6. Samples: 1351094500. Policy #0 lag: (min: 2.0, avg: 33.1, max: 65.0) [2024-03-21 06:47:25,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 06:47:30,521][03784] Fps is (10 sec: 52428.9, 60 sec: 42598.4, 300 sec: 45097.7). Total num frames: 1350139904. Throughput: 0: 44157.8. Samples: 1351248200. Policy #0 lag: (min: 2.0, avg: 33.1, max: 65.0) [2024-03-21 06:47:30,522][03784] Avg episode reward: [(0, '0.637')] [2024-03-21 06:47:33,505][04017] Updated weights for policy 0, policy_version 41205 (0.0012) [2024-03-21 06:47:35,521][03784] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1350369280. Throughput: 0: 43855.5. Samples: 1351512000. Policy #0 lag: (min: 2.0, avg: 33.1, max: 65.0) [2024-03-21 06:47:35,522][03784] Avg episode reward: [(0, '1.632')] [2024-03-21 06:47:39,240][04017] Updated weights for policy 0, policy_version 41215 (0.0024) [2024-03-21 06:47:40,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 1350533120. Throughput: 0: 44426.9. Samples: 1351777500. Policy #0 lag: (min: 2.0, avg: 33.1, max: 65.0) [2024-03-21 06:47:40,522][03784] Avg episode reward: [(0, '1.435')] [2024-03-21 06:47:43,874][03995] Signal inference workers to stop experience collection... (27200 times) [2024-03-21 06:47:43,947][04017] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-03-21 06:47:43,948][03995] Signal inference workers to resume experience collection... 
(27200 times) [2024-03-21 06:47:43,993][04017] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-03-21 06:47:45,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46421.3, 300 sec: 44653.4). Total num frames: 1350795264. Throughput: 0: 44877.7. Samples: 1351930100. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 06:47:45,522][03784] Avg episode reward: [(0, '1.333')] [2024-03-21 06:47:46,088][04017] Updated weights for policy 0, policy_version 41225 (0.0012) [2024-03-21 06:47:50,521][03784] Fps is (10 sec: 39321.3, 60 sec: 47513.5, 300 sec: 44320.1). Total num frames: 1350926336. Throughput: 0: 45199.9. Samples: 1352191600. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 06:47:50,522][03784] Avg episode reward: [(0, '1.115')] [2024-03-21 06:47:55,521][03784] Fps is (10 sec: 32767.9, 60 sec: 47513.7, 300 sec: 44653.3). Total num frames: 1351122944. Throughput: 0: 45366.6. Samples: 1352464800. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 06:47:55,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 06:47:57,128][04017] Updated weights for policy 0, policy_version 41235 (0.0028) [2024-03-21 06:48:00,521][03784] Fps is (10 sec: 42598.7, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 1351352320. Throughput: 0: 45117.8. Samples: 1352599200. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 06:48:00,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 06:48:00,618][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041241_1351385088.pth... [2024-03-21 06:48:00,748][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000040912_1340604416.pth [2024-03-21 06:48:03,676][04017] Updated weights for policy 0, policy_version 41245 (0.0018) [2024-03-21 06:48:05,521][03784] Fps is (10 sec: 42598.8, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1351548928. Throughput: 0: 45255.6. Samples: 1352864100. 
Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 06:48:05,522][03784] Avg episode reward: [(0, '0.445')] [2024-03-21 06:48:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1351712768. Throughput: 0: 45308.8. Samples: 1353133400. Policy #0 lag: (min: 0.0, avg: 38.9, max: 79.0) [2024-03-21 06:48:10,522][03784] Avg episode reward: [(0, '1.630')] [2024-03-21 06:48:12,914][04017] Updated weights for policy 0, policy_version 41255 (0.0015) [2024-03-21 06:48:15,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46421.4, 300 sec: 44764.4). Total num frames: 1352007680. Throughput: 0: 45211.1. Samples: 1353282700. Policy #0 lag: (min: 0.0, avg: 32.1, max: 76.0) [2024-03-21 06:48:15,522][03784] Avg episode reward: [(0, '1.483')] [2024-03-21 06:48:18,134][04017] Updated weights for policy 0, policy_version 41265 (0.0010) [2024-03-21 06:48:20,521][03784] Fps is (10 sec: 62259.2, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1352335360. Throughput: 0: 45037.8. Samples: 1353538700. Policy #0 lag: (min: 0.0, avg: 32.1, max: 76.0) [2024-03-21 06:48:20,522][03784] Avg episode reward: [(0, '1.483')] [2024-03-21 06:48:23,473][04017] Updated weights for policy 0, policy_version 41275 (0.0010) [2024-03-21 06:48:25,521][03784] Fps is (10 sec: 62258.7, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1352630272. Throughput: 0: 45213.2. Samples: 1353812100. Policy #0 lag: (min: 0.0, avg: 32.1, max: 76.0) [2024-03-21 06:48:25,523][03784] Avg episode reward: [(0, '1.483')] [2024-03-21 06:48:30,521][03784] Fps is (10 sec: 45874.7, 60 sec: 44236.7, 300 sec: 45097.6). Total num frames: 1352794112. Throughput: 0: 45006.6. Samples: 1353955400. 
Policy #0 lag: (min: 0.0, avg: 32.1, max: 76.0) [2024-03-21 06:48:30,522][03784] Avg episode reward: [(0, '1.361')] [2024-03-21 06:48:30,748][04017] Updated weights for policy 0, policy_version 41285 (0.0016) [2024-03-21 06:48:35,521][03784] Fps is (10 sec: 32768.4, 60 sec: 43144.6, 300 sec: 44653.3). Total num frames: 1352957952. Throughput: 0: 45075.7. Samples: 1354220000. Policy #0 lag: (min: 0.0, avg: 32.1, max: 76.0) [2024-03-21 06:48:35,522][03784] Avg episode reward: [(0, '1.014')] [2024-03-21 06:48:38,959][04017] Updated weights for policy 0, policy_version 41295 (0.0015) [2024-03-21 06:48:40,521][03784] Fps is (10 sec: 39322.1, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1353187328. Throughput: 0: 45006.8. Samples: 1354490100. Policy #0 lag: (min: 0.0, avg: 40.9, max: 87.0) [2024-03-21 06:48:40,522][03784] Avg episode reward: [(0, '1.239')] [2024-03-21 06:48:45,521][03784] Fps is (10 sec: 49151.7, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1353449472. Throughput: 0: 45006.6. Samples: 1354624500. Policy #0 lag: (min: 0.0, avg: 40.9, max: 87.0) [2024-03-21 06:48:45,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 06:48:46,014][04017] Updated weights for policy 0, policy_version 41305 (0.0012) [2024-03-21 06:48:48,338][03995] Signal inference workers to stop experience collection... (27250 times) [2024-03-21 06:48:48,458][03995] Signal inference workers to resume experience collection... (27250 times) [2024-03-21 06:48:48,525][04017] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-03-21 06:48:48,572][04017] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-03-21 06:48:50,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1353646080. Throughput: 0: 45033.3. Samples: 1354890600. 
Policy #0 lag: (min: 0.0, avg: 40.9, max: 87.0) [2024-03-21 06:48:50,522][03784] Avg episode reward: [(0, '1.320')] [2024-03-21 06:48:52,093][04017] Updated weights for policy 0, policy_version 41315 (0.0012) [2024-03-21 06:48:55,521][03784] Fps is (10 sec: 52429.1, 60 sec: 47513.7, 300 sec: 45319.8). Total num frames: 1353973760. Throughput: 0: 44944.5. Samples: 1355155900. Policy #0 lag: (min: 0.0, avg: 40.9, max: 87.0) [2024-03-21 06:48:55,522][03784] Avg episode reward: [(0, '0.533')] [2024-03-21 06:49:00,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1354104832. Throughput: 0: 44755.5. Samples: 1355296700. Policy #0 lag: (min: 0.0, avg: 40.9, max: 87.0) [2024-03-21 06:49:00,522][03784] Avg episode reward: [(0, '0.882')] [2024-03-21 06:49:00,800][04017] Updated weights for policy 0, policy_version 41325 (0.0012) [2024-03-21 06:49:05,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1354334208. Throughput: 0: 44875.5. Samples: 1355558100. Policy #0 lag: (min: 1.0, avg: 30.5, max: 69.0) [2024-03-21 06:49:05,530][03784] Avg episode reward: [(0, '1.407')] [2024-03-21 06:49:09,125][04017] Updated weights for policy 0, policy_version 41335 (0.0012) [2024-03-21 06:49:10,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1354530816. Throughput: 0: 44684.5. Samples: 1355822900. Policy #0 lag: (min: 1.0, avg: 30.5, max: 69.0) [2024-03-21 06:49:10,530][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 06:49:14,725][04017] Updated weights for policy 0, policy_version 41345 (0.0011) [2024-03-21 06:49:15,521][03784] Fps is (10 sec: 52429.0, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 1354858496. Throughput: 0: 44500.2. Samples: 1355957900. 
Policy #0 lag: (min: 1.0, avg: 30.5, max: 69.0) [2024-03-21 06:49:15,522][03784] Avg episode reward: [(0, '0.794')] [2024-03-21 06:49:20,521][03784] Fps is (10 sec: 52429.0, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 1355055104. Throughput: 0: 44813.3. Samples: 1356236600. Policy #0 lag: (min: 1.0, avg: 30.5, max: 69.0) [2024-03-21 06:49:20,522][03784] Avg episode reward: [(0, '0.712')] [2024-03-21 06:49:25,521][03784] Fps is (10 sec: 22937.5, 60 sec: 40960.1, 300 sec: 44875.5). Total num frames: 1355087872. Throughput: 0: 45313.3. Samples: 1356529200. Policy #0 lag: (min: 1.0, avg: 30.5, max: 69.0) [2024-03-21 06:49:25,522][03784] Avg episode reward: [(0, '1.180')] [2024-03-21 06:49:27,709][04017] Updated weights for policy 0, policy_version 41355 (0.0012) [2024-03-21 06:49:30,521][03784] Fps is (10 sec: 19660.7, 60 sec: 40960.0, 300 sec: 44431.2). Total num frames: 1355251712. Throughput: 0: 45724.4. Samples: 1356682100. Policy #0 lag: (min: 1.0, avg: 30.5, max: 69.0) [2024-03-21 06:49:30,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 06:49:33,104][04017] Updated weights for policy 0, policy_version 41365 (0.0016) [2024-03-21 06:49:35,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1355612160. Throughput: 0: 45713.4. Samples: 1356947700. Policy #0 lag: (min: 1.0, avg: 25.5, max: 68.0) [2024-03-21 06:49:35,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 06:49:40,091][04017] Updated weights for policy 0, policy_version 41375 (0.0011) [2024-03-21 06:49:40,521][03784] Fps is (10 sec: 55705.4, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1355808768. Throughput: 0: 45619.9. Samples: 1357208800. Policy #0 lag: (min: 1.0, avg: 25.5, max: 68.0) [2024-03-21 06:49:40,522][03784] Avg episode reward: [(0, '0.919')] [2024-03-21 06:49:44,210][03995] Signal inference workers to stop experience collection... 
(27300 times) [2024-03-21 06:49:44,283][03995] Signal inference workers to resume experience collection... (27300 times) [2024-03-21 06:49:44,304][04017] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-03-21 06:49:44,459][04017] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-03-21 06:49:44,976][04017] Updated weights for policy 0, policy_version 41385 (0.0012) [2024-03-21 06:49:45,521][03784] Fps is (10 sec: 52428.6, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 1356136448. Throughput: 0: 45580.1. Samples: 1357347800. Policy #0 lag: (min: 1.0, avg: 25.5, max: 68.0) [2024-03-21 06:49:45,523][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 06:49:49,285][04017] Updated weights for policy 0, policy_version 41395 (0.0021) [2024-03-21 06:49:50,521][03784] Fps is (10 sec: 72090.0, 60 sec: 48059.7, 300 sec: 44986.6). Total num frames: 1356529664. Throughput: 0: 45680.0. Samples: 1357613700. Policy #0 lag: (min: 1.0, avg: 25.5, max: 68.0) [2024-03-21 06:49:50,522][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 06:49:55,521][03784] Fps is (10 sec: 55705.3, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 1356693504. Throughput: 0: 46099.9. Samples: 1357897400. Policy #0 lag: (min: 1.0, avg: 25.5, max: 68.0) [2024-03-21 06:49:55,522][03784] Avg episode reward: [(0, '0.488')] [2024-03-21 06:49:56,332][04017] Updated weights for policy 0, policy_version 41405 (0.0015) [2024-03-21 06:50:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 1357021184. Throughput: 0: 46099.9. Samples: 1358032400. Policy #0 lag: (min: 0.0, avg: 42.9, max: 80.0) [2024-03-21 06:50:00,522][03784] Avg episode reward: [(0, '1.214')] [2024-03-21 06:50:00,816][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041414_1357053952.pth... 
[2024-03-21 06:50:00,929][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041073_1345880064.pth [2024-03-21 06:50:02,571][04017] Updated weights for policy 0, policy_version 41415 (0.0015) [2024-03-21 06:50:05,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 1357152256. Throughput: 0: 46215.5. Samples: 1358316300. Policy #0 lag: (min: 0.0, avg: 42.9, max: 80.0) [2024-03-21 06:50:05,522][03784] Avg episode reward: [(0, '0.740')] [2024-03-21 06:50:10,078][04017] Updated weights for policy 0, policy_version 41425 (0.0011) [2024-03-21 06:50:10,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48059.7, 300 sec: 45875.2). Total num frames: 1357414400. Throughput: 0: 45406.6. Samples: 1358572500. Policy #0 lag: (min: 0.0, avg: 42.9, max: 80.0) [2024-03-21 06:50:10,522][03784] Avg episode reward: [(0, '1.332')] [2024-03-21 06:50:15,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.7, 300 sec: 45208.7). Total num frames: 1357512704. Throughput: 0: 45051.1. Samples: 1358709400. Policy #0 lag: (min: 0.0, avg: 42.9, max: 80.0) [2024-03-21 06:50:15,522][03784] Avg episode reward: [(0, '1.311')] [2024-03-21 06:50:20,521][03784] Fps is (10 sec: 29491.0, 60 sec: 44236.7, 300 sec: 45097.6). Total num frames: 1357709312. Throughput: 0: 44908.8. Samples: 1358968600. Policy #0 lag: (min: 0.0, avg: 42.9, max: 80.0) [2024-03-21 06:50:20,522][03784] Avg episode reward: [(0, '0.434')] [2024-03-21 06:50:21,738][04017] Updated weights for policy 0, policy_version 41435 (0.0012) [2024-03-21 06:50:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 46967.4, 300 sec: 44653.4). Total num frames: 1357905920. Throughput: 0: 43964.5. Samples: 1359187200. 
Policy #0 lag: (min: 1.0, avg: 29.9, max: 72.0) [2024-03-21 06:50:25,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 06:50:28,378][04017] Updated weights for policy 0, policy_version 41445 (0.0016) [2024-03-21 06:50:30,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46967.5, 300 sec: 44320.1). Total num frames: 1358069760. Throughput: 0: 43482.2. Samples: 1359304500. Policy #0 lag: (min: 1.0, avg: 29.9, max: 72.0) [2024-03-21 06:50:30,522][03784] Avg episode reward: [(0, '0.522')] [2024-03-21 06:50:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1358233600. Throughput: 0: 43240.0. Samples: 1359559500. Policy #0 lag: (min: 1.0, avg: 29.9, max: 72.0) [2024-03-21 06:50:35,522][03784] Avg episode reward: [(0, '1.345')] [2024-03-21 06:50:39,894][04017] Updated weights for policy 0, policy_version 41455 (0.0017) [2024-03-21 06:50:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1358430208. Throughput: 0: 43271.2. Samples: 1359844600. Policy #0 lag: (min: 1.0, avg: 29.9, max: 72.0) [2024-03-21 06:50:40,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 06:50:44,960][03995] Signal inference workers to stop experience collection... (27350 times) [2024-03-21 06:50:45,032][03995] Signal inference workers to resume experience collection... (27350 times) [2024-03-21 06:50:45,051][04017] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-03-21 06:50:45,110][04017] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-03-21 06:50:45,521][03784] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 1358659584. Throughput: 0: 43313.3. Samples: 1359981500. 
Policy #0 lag: (min: 1.0, avg: 29.9, max: 72.0) [2024-03-21 06:50:45,522][03784] Avg episode reward: [(0, '0.497')] [2024-03-21 06:50:48,345][04017] Updated weights for policy 0, policy_version 41465 (0.0012) [2024-03-21 06:50:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 38229.3, 300 sec: 44542.3). Total num frames: 1358823424. Throughput: 0: 43708.9. Samples: 1360283200. Policy #0 lag: (min: 1.0, avg: 29.9, max: 72.0) [2024-03-21 06:50:50,522][03784] Avg episode reward: [(0, '0.497')] [2024-03-21 06:50:54,359][04017] Updated weights for policy 0, policy_version 41475 (0.0016) [2024-03-21 06:50:55,521][03784] Fps is (10 sec: 45875.7, 60 sec: 40413.9, 300 sec: 44875.5). Total num frames: 1359118336. Throughput: 0: 43986.7. Samples: 1360551900. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 06:50:55,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 06:51:00,521][03784] Fps is (10 sec: 55706.3, 60 sec: 39321.7, 300 sec: 44986.6). Total num frames: 1359380480. Throughput: 0: 44120.1. Samples: 1360694800. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 06:51:00,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 06:51:00,528][04017] Updated weights for policy 0, policy_version 41485 (0.0012) [2024-03-21 06:51:03,957][04017] Updated weights for policy 0, policy_version 41495 (0.0012) [2024-03-21 06:51:05,521][03784] Fps is (10 sec: 68812.4, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1359806464. Throughput: 0: 44066.7. Samples: 1360951600. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 06:51:05,522][03784] Avg episode reward: [(0, '1.474')] [2024-03-21 06:51:10,521][03784] Fps is (10 sec: 62259.5, 60 sec: 43144.7, 300 sec: 44653.4). Total num frames: 1360003072. Throughput: 0: 45377.9. Samples: 1361229200. 
Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 06:51:10,521][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 06:51:10,545][04017] Updated weights for policy 0, policy_version 41505 (0.0010) [2024-03-21 06:51:15,501][04017] Updated weights for policy 0, policy_version 41515 (0.0010) [2024-03-21 06:51:15,521][03784] Fps is (10 sec: 55705.7, 60 sec: 47513.6, 300 sec: 44875.5). Total num frames: 1360363520. Throughput: 0: 45748.9. Samples: 1361363200. Policy #0 lag: (min: 0.0, avg: 48.7, max: 114.0) [2024-03-21 06:51:15,522][03784] Avg episode reward: [(0, '0.656')] [2024-03-21 06:51:20,521][03784] Fps is (10 sec: 45874.6, 60 sec: 45875.3, 300 sec: 44209.0). Total num frames: 1360461824. Throughput: 0: 46208.9. Samples: 1361638900. Policy #0 lag: (min: 0.0, avg: 42.3, max: 114.0) [2024-03-21 06:51:20,522][03784] Avg episode reward: [(0, '0.682')] [2024-03-21 06:51:25,521][03784] Fps is (10 sec: 29490.9, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 1360658432. Throughput: 0: 46199.9. Samples: 1361923600. Policy #0 lag: (min: 0.0, avg: 42.3, max: 114.0) [2024-03-21 06:51:25,522][03784] Avg episode reward: [(0, '0.991')] [2024-03-21 06:51:25,813][04017] Updated weights for policy 0, policy_version 41525 (0.0011) [2024-03-21 06:51:30,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48059.8, 300 sec: 45208.7). Total num frames: 1360953344. Throughput: 0: 46029.0. Samples: 1362052800. Policy #0 lag: (min: 0.0, avg: 42.3, max: 114.0) [2024-03-21 06:51:30,522][03784] Avg episode reward: [(0, '0.900')] [2024-03-21 06:51:33,287][04017] Updated weights for policy 0, policy_version 41535 (0.0012) [2024-03-21 06:51:33,331][03995] Signal inference workers to stop experience collection... (27400 times) [2024-03-21 06:51:33,394][04017] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-03-21 06:51:33,610][03995] Signal inference workers to resume experience collection... 
(27400 times) [2024-03-21 06:51:33,611][04017] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-03-21 06:51:35,521][03784] Fps is (10 sec: 49152.6, 60 sec: 48605.9, 300 sec: 45097.6). Total num frames: 1361149952. Throughput: 0: 44960.0. Samples: 1362306400. Policy #0 lag: (min: 0.0, avg: 42.3, max: 114.0) [2024-03-21 06:51:35,522][03784] Avg episode reward: [(0, '1.025')] [2024-03-21 06:51:40,521][03784] Fps is (10 sec: 32768.0, 60 sec: 47513.6, 300 sec: 44986.6). Total num frames: 1361281024. Throughput: 0: 45026.7. Samples: 1362578100. Policy #0 lag: (min: 0.0, avg: 42.3, max: 114.0) [2024-03-21 06:51:40,522][03784] Avg episode reward: [(0, '1.038')] [2024-03-21 06:51:42,765][04017] Updated weights for policy 0, policy_version 41545 (0.0011) [2024-03-21 06:51:45,521][03784] Fps is (10 sec: 36044.9, 60 sec: 47513.7, 300 sec: 45542.0). Total num frames: 1361510400. Throughput: 0: 44893.2. Samples: 1362715000. Policy #0 lag: (min: 0.0, avg: 42.3, max: 114.0) [2024-03-21 06:51:45,522][03784] Avg episode reward: [(0, '1.470')] [2024-03-21 06:51:50,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.4, 300 sec: 45208.8). Total num frames: 1361608704. Throughput: 0: 45006.7. Samples: 1362976900. Policy #0 lag: (min: 0.0, avg: 34.3, max: 75.0) [2024-03-21 06:51:50,522][03784] Avg episode reward: [(0, '1.662')] [2024-03-21 06:51:52,128][04017] Updated weights for policy 0, policy_version 41555 (0.0017) [2024-03-21 06:51:55,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1361838080. Throughput: 0: 44348.8. Samples: 1363224900. Policy #0 lag: (min: 0.0, avg: 34.3, max: 75.0) [2024-03-21 06:51:55,522][03784] Avg episode reward: [(0, '1.288')] [2024-03-21 06:52:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43144.4, 300 sec: 44431.2). Total num frames: 1361969152. Throughput: 0: 44522.2. Samples: 1363366700. 
Policy #0 lag: (min: 0.0, avg: 34.3, max: 75.0) [2024-03-21 06:52:00,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 06:52:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041564_1361969152.pth... [2024-03-21 06:52:00,677][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041241_1351385088.pth [2024-03-21 06:52:01,225][04017] Updated weights for policy 0, policy_version 41565 (0.0018) [2024-03-21 06:52:05,521][03784] Fps is (10 sec: 32767.9, 60 sec: 39321.6, 300 sec: 44320.1). Total num frames: 1362165760. Throughput: 0: 44595.5. Samples: 1363645700. Policy #0 lag: (min: 0.0, avg: 34.3, max: 75.0) [2024-03-21 06:52:05,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 06:52:07,353][04017] Updated weights for policy 0, policy_version 41575 (0.0019) [2024-03-21 06:52:10,521][03784] Fps is (10 sec: 52429.2, 60 sec: 41506.1, 300 sec: 44986.6). Total num frames: 1362493440. Throughput: 0: 44166.8. Samples: 1363911100. Policy #0 lag: (min: 0.0, avg: 34.3, max: 75.0) [2024-03-21 06:52:10,522][03784] Avg episode reward: [(0, '1.619')] [2024-03-21 06:52:12,054][04017] Updated weights for policy 0, policy_version 41585 (0.0011) [2024-03-21 06:52:15,521][03784] Fps is (10 sec: 68813.2, 60 sec: 41506.2, 300 sec: 44875.5). Total num frames: 1362853888. Throughput: 0: 44042.2. Samples: 1364034700. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 06:52:15,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 06:52:16,794][04017] Updated weights for policy 0, policy_version 41595 (0.0016) [2024-03-21 06:52:20,521][03784] Fps is (10 sec: 65536.0, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 1363148800. Throughput: 0: 44466.7. Samples: 1364307400. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 06:52:20,522][03784] Avg episode reward: [(0, '0.632')] [2024-03-21 06:52:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 44542.3). 
Total num frames: 1363279872. Throughput: 0: 44582.2. Samples: 1364584300. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 06:52:25,522][03784] Avg episode reward: [(0, '1.260')] [2024-03-21 06:52:25,958][04017] Updated weights for policy 0, policy_version 41605 (0.0011) [2024-03-21 06:52:28,968][03995] Signal inference workers to stop experience collection... (27450 times) [2024-03-21 06:52:29,036][03995] Signal inference workers to resume experience collection... (27450 times) [2024-03-21 06:52:29,071][04017] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-03-21 06:52:29,118][04017] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-03-21 06:52:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1363607552. Throughput: 0: 44342.2. Samples: 1364710400. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 06:52:30,522][03784] Avg episode reward: [(0, '1.813')] [2024-03-21 06:52:31,029][04017] Updated weights for policy 0, policy_version 41615 (0.0018) [2024-03-21 06:52:35,521][03784] Fps is (10 sec: 55706.1, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 1363836928. Throughput: 0: 45057.8. Samples: 1365004500. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 06:52:35,521][03784] Avg episode reward: [(0, '1.813')] [2024-03-21 06:52:37,380][04017] Updated weights for policy 0, policy_version 41625 (0.0009) [2024-03-21 06:52:40,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1364099072. Throughput: 0: 45368.9. Samples: 1365266500. Policy #0 lag: (min: 1.0, avg: 38.2, max: 72.0) [2024-03-21 06:52:40,522][03784] Avg episode reward: [(0, '0.855')] [2024-03-21 06:52:45,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1364262912. Throughput: 0: 45246.7. Samples: 1365402800. 
Policy #0 lag: (min: 0.0, avg: 43.8, max: 89.0) [2024-03-21 06:52:45,522][03784] Avg episode reward: [(0, '1.559')] [2024-03-21 06:52:45,901][04017] Updated weights for policy 0, policy_version 41635 (0.0016) [2024-03-21 06:52:50,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1364361216. Throughput: 0: 45077.8. Samples: 1365674200. Policy #0 lag: (min: 0.0, avg: 43.8, max: 89.0) [2024-03-21 06:52:50,522][03784] Avg episode reward: [(0, '1.253')] [2024-03-21 06:52:55,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1364590592. Throughput: 0: 45417.7. Samples: 1365954900. Policy #0 lag: (min: 0.0, avg: 43.8, max: 89.0) [2024-03-21 06:52:55,522][03784] Avg episode reward: [(0, '1.406')] [2024-03-21 06:52:55,806][04017] Updated weights for policy 0, policy_version 41645 (0.0010) [2024-03-21 06:53:00,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1364787200. Throughput: 0: 45853.3. Samples: 1366098100. Policy #0 lag: (min: 0.0, avg: 43.8, max: 89.0) [2024-03-21 06:53:00,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 06:53:05,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1364918272. Throughput: 0: 46373.3. Samples: 1366394200. Policy #0 lag: (min: 0.0, avg: 43.8, max: 89.0) [2024-03-21 06:53:05,522][03784] Avg episode reward: [(0, '0.994')] [2024-03-21 06:53:05,777][04017] Updated weights for policy 0, policy_version 41655 (0.0012) [2024-03-21 06:53:10,521][03784] Fps is (10 sec: 49152.7, 60 sec: 46421.4, 300 sec: 44986.6). Total num frames: 1365278720. Throughput: 0: 45569.0. Samples: 1366634900. 
Policy #0 lag: (min: 2.0, avg: 35.7, max: 78.0) [2024-03-21 06:53:10,521][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 06:53:10,526][04017] Updated weights for policy 0, policy_version 41665 (0.0014) [2024-03-21 06:53:15,521][03784] Fps is (10 sec: 58982.5, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1365508096. Throughput: 0: 45720.0. Samples: 1366767800. Policy #0 lag: (min: 2.0, avg: 35.7, max: 78.0) [2024-03-21 06:53:15,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 06:53:16,270][04017] Updated weights for policy 0, policy_version 41675 (0.0018) [2024-03-21 06:53:20,521][03784] Fps is (10 sec: 62258.5, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1365901312. Throughput: 0: 45162.2. Samples: 1367036800. Policy #0 lag: (min: 2.0, avg: 35.7, max: 78.0) [2024-03-21 06:53:20,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 06:53:20,589][04017] Updated weights for policy 0, policy_version 41685 (0.0012) [2024-03-21 06:53:21,200][03995] Signal inference workers to stop experience collection... (27500 times) [2024-03-21 06:53:21,280][04017] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-03-21 06:53:21,511][03995] Signal inference workers to resume experience collection... (27500 times) [2024-03-21 06:53:21,512][04017] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-03-21 06:53:25,521][03784] Fps is (10 sec: 68812.9, 60 sec: 48605.9, 300 sec: 45430.9). Total num frames: 1366196224. Throughput: 0: 45182.2. Samples: 1367299700. Policy #0 lag: (min: 2.0, avg: 35.7, max: 78.0) [2024-03-21 06:53:25,522][03784] Avg episode reward: [(0, '0.705')] [2024-03-21 06:53:26,719][04017] Updated weights for policy 0, policy_version 41695 (0.0017) [2024-03-21 06:53:30,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1366327296. Throughput: 0: 45342.3. Samples: 1367443200. 
Policy #0 lag: (min: 2.0, avg: 35.7, max: 78.0) [2024-03-21 06:53:30,521][03784] Avg episode reward: [(0, '0.924')] [2024-03-21 06:53:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1366523904. Throughput: 0: 45244.5. Samples: 1367710200. Policy #0 lag: (min: 2.0, avg: 35.7, max: 78.0) [2024-03-21 06:53:35,522][03784] Avg episode reward: [(0, '1.446')] [2024-03-21 06:53:39,154][04017] Updated weights for policy 0, policy_version 41705 (0.0015) [2024-03-21 06:53:40,521][03784] Fps is (10 sec: 32767.9, 60 sec: 42598.4, 300 sec: 44764.4). Total num frames: 1366654976. Throughput: 0: 45222.3. Samples: 1367989900. Policy #0 lag: (min: 0.0, avg: 34.3, max: 98.0) [2024-03-21 06:53:40,522][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 06:53:45,521][03784] Fps is (10 sec: 26214.5, 60 sec: 42052.3, 300 sec: 44542.3). Total num frames: 1366786048. Throughput: 0: 45006.8. Samples: 1368123400. Policy #0 lag: (min: 0.0, avg: 34.3, max: 98.0) [2024-03-21 06:53:45,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 06:53:48,038][04017] Updated weights for policy 0, policy_version 41715 (0.0011) [2024-03-21 06:53:50,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1367080960. Throughput: 0: 44302.2. Samples: 1368387800. Policy #0 lag: (min: 0.0, avg: 34.3, max: 98.0) [2024-03-21 06:53:50,522][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 06:53:54,379][04017] Updated weights for policy 0, policy_version 41725 (0.0020) [2024-03-21 06:53:55,521][03784] Fps is (10 sec: 52428.3, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1367310336. Throughput: 0: 44777.6. Samples: 1368649900. 
Policy #0 lag: (min: 0.0, avg: 34.3, max: 98.0) [2024-03-21 06:53:55,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 06:53:59,283][04017] Updated weights for policy 0, policy_version 41735 (0.0021) [2024-03-21 06:54:00,521][03784] Fps is (10 sec: 55705.2, 60 sec: 47513.5, 300 sec: 45097.6). Total num frames: 1367638016. Throughput: 0: 44817.7. Samples: 1368784600. Policy #0 lag: (min: 0.0, avg: 34.3, max: 98.0) [2024-03-21 06:54:00,522][03784] Avg episode reward: [(0, '0.763')] [2024-03-21 06:54:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041737_1367638016.pth... [2024-03-21 06:54:00,657][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041414_1357053952.pth [2024-03-21 06:54:05,266][04017] Updated weights for policy 0, policy_version 41745 (0.0016) [2024-03-21 06:54:05,521][03784] Fps is (10 sec: 58982.6, 60 sec: 49698.1, 300 sec: 45319.8). Total num frames: 1367900160. Throughput: 0: 44860.0. Samples: 1369055500. Policy #0 lag: (min: 2.0, avg: 34.1, max: 68.0) [2024-03-21 06:54:05,522][03784] Avg episode reward: [(0, '0.920')] [2024-03-21 06:54:10,521][03784] Fps is (10 sec: 52429.2, 60 sec: 48059.6, 300 sec: 45097.6). Total num frames: 1368162304. Throughput: 0: 44748.9. Samples: 1369313400. Policy #0 lag: (min: 2.0, avg: 34.1, max: 68.0) [2024-03-21 06:54:10,523][03784] Avg episode reward: [(0, '1.295')] [2024-03-21 06:54:13,151][04017] Updated weights for policy 0, policy_version 41755 (0.0011) [2024-03-21 06:54:15,521][03784] Fps is (10 sec: 45875.3, 60 sec: 47513.6, 300 sec: 45097.7). Total num frames: 1368358912. Throughput: 0: 44784.4. Samples: 1369458500. Policy #0 lag: (min: 2.0, avg: 34.1, max: 68.0) [2024-03-21 06:54:15,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 06:54:19,061][04017] Updated weights for policy 0, policy_version 41765 (0.0016) [2024-03-21 06:54:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 45653.0). 
Total num frames: 1368555520. Throughput: 0: 44733.3. Samples: 1369723200. Policy #0 lag: (min: 2.0, avg: 34.1, max: 68.0) [2024-03-21 06:54:20,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 06:54:25,521][03784] Fps is (10 sec: 29491.1, 60 sec: 40960.0, 300 sec: 45430.9). Total num frames: 1368653824. Throughput: 0: 45295.6. Samples: 1370028200. Policy #0 lag: (min: 2.0, avg: 34.1, max: 68.0) [2024-03-21 06:54:25,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 06:54:30,402][03995] Signal inference workers to stop experience collection... (27550 times) [2024-03-21 06:54:30,403][03995] Signal inference workers to resume experience collection... (27550 times) [2024-03-21 06:54:30,453][04017] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-03-21 06:54:30,454][04017] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-03-21 06:54:30,521][03784] Fps is (10 sec: 22937.5, 60 sec: 40959.9, 300 sec: 44653.3). Total num frames: 1368784896. Throughput: 0: 45444.3. Samples: 1370168400. Policy #0 lag: (min: 2.0, avg: 34.1, max: 68.0) [2024-03-21 06:54:30,522][03784] Avg episode reward: [(0, '0.663')] [2024-03-21 06:54:31,408][04017] Updated weights for policy 0, policy_version 41775 (0.0011) [2024-03-21 06:54:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 40960.0, 300 sec: 44653.4). Total num frames: 1368981504. Throughput: 0: 46028.9. Samples: 1370459100. Policy #0 lag: (min: 3.0, avg: 38.6, max: 74.0) [2024-03-21 06:54:35,522][03784] Avg episode reward: [(0, '0.639')] [2024-03-21 06:54:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 44209.0). Total num frames: 1369178112. Throughput: 0: 46500.0. Samples: 1370742400. 
Policy #0 lag: (min: 3.0, avg: 38.6, max: 74.0) [2024-03-21 06:54:40,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 06:54:40,724][04017] Updated weights for policy 0, policy_version 41785 (0.0016) [2024-03-21 06:54:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 1369473024. Throughput: 0: 46224.6. Samples: 1370864700. Policy #0 lag: (min: 3.0, avg: 38.6, max: 74.0) [2024-03-21 06:54:45,522][03784] Avg episode reward: [(0, '0.820')] [2024-03-21 06:54:46,420][04017] Updated weights for policy 0, policy_version 41795 (0.0017) [2024-03-21 06:54:50,521][03784] Fps is (10 sec: 65535.6, 60 sec: 45875.1, 300 sec: 44542.3). Total num frames: 1369833472. Throughput: 0: 46033.2. Samples: 1371127000. Policy #0 lag: (min: 3.0, avg: 38.6, max: 74.0) [2024-03-21 06:54:50,523][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 06:54:50,662][04017] Updated weights for policy 0, policy_version 41805 (0.0025) [2024-03-21 06:54:53,922][04017] Updated weights for policy 0, policy_version 41815 (0.0020) [2024-03-21 06:54:55,521][03784] Fps is (10 sec: 81920.0, 60 sec: 49698.2, 300 sec: 44986.6). Total num frames: 1370292224. Throughput: 0: 45404.5. Samples: 1371356600. Policy #0 lag: (min: 3.0, avg: 38.6, max: 74.0) [2024-03-21 06:54:55,522][03784] Avg episode reward: [(0, '0.936')] [2024-03-21 06:54:57,993][04017] Updated weights for policy 0, policy_version 41825 (0.0012) [2024-03-21 06:55:00,521][03784] Fps is (10 sec: 78643.8, 60 sec: 49698.2, 300 sec: 45653.0). Total num frames: 1370619904. Throughput: 0: 44991.0. Samples: 1371483100. Policy #0 lag: (min: 3.0, avg: 48.6, max: 85.0) [2024-03-21 06:55:00,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 06:55:05,521][03784] Fps is (10 sec: 42598.1, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 1370718208. Throughput: 0: 45353.3. Samples: 1371764100. 
Policy #0 lag: (min: 3.0, avg: 48.6, max: 85.0) [2024-03-21 06:55:05,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 06:55:08,979][04017] Updated weights for policy 0, policy_version 41835 (0.0011) [2024-03-21 06:55:10,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1370914816. Throughput: 0: 44115.5. Samples: 1372013400. Policy #0 lag: (min: 3.0, avg: 48.6, max: 85.0) [2024-03-21 06:55:10,522][03784] Avg episode reward: [(0, '1.407')] [2024-03-21 06:55:15,527][03784] Fps is (10 sec: 36024.6, 60 sec: 45324.8, 300 sec: 45319.0). Total num frames: 1371078656. Throughput: 0: 43759.0. Samples: 1372137800. Policy #0 lag: (min: 3.0, avg: 48.6, max: 85.0) [2024-03-21 06:55:15,527][03784] Avg episode reward: [(0, '0.464')] [2024-03-21 06:55:18,615][04017] Updated weights for policy 0, policy_version 41845 (0.0021) [2024-03-21 06:55:20,521][03784] Fps is (10 sec: 26214.2, 60 sec: 43690.6, 300 sec: 44986.6). Total num frames: 1371176960. Throughput: 0: 42862.1. Samples: 1372387900. Policy #0 lag: (min: 3.0, avg: 48.6, max: 85.0) [2024-03-21 06:55:20,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 06:55:25,521][03784] Fps is (10 sec: 9836.0, 60 sec: 42052.3, 300 sec: 44431.2). Total num frames: 1371176960. Throughput: 0: 42524.5. Samples: 1372656000. Policy #0 lag: (min: 3.0, avg: 48.6, max: 85.0) [2024-03-21 06:55:25,522][03784] Avg episode reward: [(0, '1.328')] [2024-03-21 06:55:27,888][03995] Signal inference workers to stop experience collection... (27600 times) [2024-03-21 06:55:27,888][03995] Signal inference workers to resume experience collection... (27600 times) [2024-03-21 06:55:27,972][04017] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-03-21 06:55:27,972][04017] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-03-21 06:55:30,521][03784] Fps is (10 sec: 19661.0, 60 sec: 43144.6, 300 sec: 44542.3). Total num frames: 1371373568. 
Throughput: 0: 42733.3. Samples: 1372787700. Policy #0 lag: (min: 0.0, avg: 26.9, max: 131.0) [2024-03-21 06:55:30,522][03784] Avg episode reward: [(0, '1.871')] [2024-03-21 06:55:32,672][04017] Updated weights for policy 0, policy_version 41855 (0.0016) [2024-03-21 06:55:35,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 1371570176. Throughput: 0: 43160.1. Samples: 1373069200. Policy #0 lag: (min: 0.0, avg: 26.9, max: 131.0) [2024-03-21 06:55:35,522][03784] Avg episode reward: [(0, '1.153')] [2024-03-21 06:55:39,315][04017] Updated weights for policy 0, policy_version 41865 (0.0015) [2024-03-21 06:55:40,521][03784] Fps is (10 sec: 55704.5, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 1371930624. Throughput: 0: 44046.5. Samples: 1373338700. Policy #0 lag: (min: 0.0, avg: 26.9, max: 131.0) [2024-03-21 06:55:40,522][03784] Avg episode reward: [(0, '0.859')] [2024-03-21 06:55:43,543][04017] Updated weights for policy 0, policy_version 41875 (0.0011) [2024-03-21 06:55:45,521][03784] Fps is (10 sec: 68812.7, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1372258304. Throughput: 0: 44144.5. Samples: 1373469600. Policy #0 lag: (min: 0.0, avg: 26.9, max: 131.0) [2024-03-21 06:55:45,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 06:55:50,521][03784] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1372454912. Throughput: 0: 44106.6. Samples: 1373748900. Policy #0 lag: (min: 0.0, avg: 26.9, max: 131.0) [2024-03-21 06:55:50,522][03784] Avg episode reward: [(0, '0.577')] [2024-03-21 06:55:50,952][04017] Updated weights for policy 0, policy_version 41885 (0.0016) [2024-03-21 06:55:54,932][04017] Updated weights for policy 0, policy_version 41895 (0.0012) [2024-03-21 06:55:55,521][03784] Fps is (10 sec: 58982.2, 60 sec: 42598.3, 300 sec: 45653.0). Total num frames: 1372848128. Throughput: 0: 44757.7. Samples: 1374027500. 
Policy #0 lag: (min: 5.0, avg: 52.6, max: 123.0) [2024-03-21 06:55:55,522][03784] Avg episode reward: [(0, '0.936')] [2024-03-21 06:56:00,521][03784] Fps is (10 sec: 62259.4, 60 sec: 40960.0, 300 sec: 44986.6). Total num frames: 1373077504. Throughput: 0: 45210.1. Samples: 1374172000. Policy #0 lag: (min: 5.0, avg: 52.6, max: 123.0) [2024-03-21 06:56:00,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 06:56:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041903_1373077504.pth... [2024-03-21 06:56:00,663][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041564_1361969152.pth [2024-03-21 06:56:01,461][04017] Updated weights for policy 0, policy_version 41905 (0.0012) [2024-03-21 06:56:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 44986.5). Total num frames: 1373274112. Throughput: 0: 45551.1. Samples: 1374437700. Policy #0 lag: (min: 5.0, avg: 52.6, max: 123.0) [2024-03-21 06:56:05,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 06:56:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 42052.2, 300 sec: 44320.1). Total num frames: 1373437952. Throughput: 0: 45395.5. Samples: 1374698800. Policy #0 lag: (min: 5.0, avg: 52.6, max: 123.0) [2024-03-21 06:56:10,522][03784] Avg episode reward: [(0, '0.679')] [2024-03-21 06:56:13,695][04017] Updated weights for policy 0, policy_version 41915 (0.0012) [2024-03-21 06:56:13,836][03995] Signal inference workers to stop experience collection... (27650 times) [2024-03-21 06:56:13,889][04017] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-03-21 06:56:14,143][03995] Signal inference workers to resume experience collection... (27650 times) [2024-03-21 06:56:14,143][04017] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-03-21 06:56:15,521][03784] Fps is (10 sec: 29491.6, 60 sec: 41510.1, 300 sec: 44431.2). Total num frames: 1373569024. Throughput: 0: 45824.5. 
Samples: 1374849800. Policy #0 lag: (min: 5.0, avg: 52.6, max: 123.0) [2024-03-21 06:56:15,522][03784] Avg episode reward: [(0, '1.364')] [2024-03-21 06:56:19,564][04017] Updated weights for policy 0, policy_version 41925 (0.0019) [2024-03-21 06:56:20,521][03784] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1373863936. Throughput: 0: 45444.5. Samples: 1375114200. Policy #0 lag: (min: 5.0, avg: 52.6, max: 123.0) [2024-03-21 06:56:20,522][03784] Avg episode reward: [(0, '1.517')] [2024-03-21 06:56:23,824][04017] Updated weights for policy 0, policy_version 41935 (0.0012) [2024-03-21 06:56:25,521][03784] Fps is (10 sec: 62259.0, 60 sec: 50244.3, 300 sec: 44875.5). Total num frames: 1374191616. Throughput: 0: 45418.0. Samples: 1375382500. Policy #0 lag: (min: 5.0, avg: 54.4, max: 112.0) [2024-03-21 06:56:25,522][03784] Avg episode reward: [(0, '0.809')] [2024-03-21 06:56:30,521][03784] Fps is (10 sec: 42598.1, 60 sec: 48605.8, 300 sec: 44542.3). Total num frames: 1374289920. Throughput: 0: 45920.0. Samples: 1375536000. Policy #0 lag: (min: 5.0, avg: 54.4, max: 112.0) [2024-03-21 06:56:30,522][03784] Avg episode reward: [(0, '0.809')] [2024-03-21 06:56:31,945][04017] Updated weights for policy 0, policy_version 41945 (0.0011) [2024-03-21 06:56:35,521][03784] Fps is (10 sec: 42597.7, 60 sec: 50790.3, 300 sec: 45208.7). Total num frames: 1374617600. Throughput: 0: 45453.2. Samples: 1375794300. Policy #0 lag: (min: 5.0, avg: 54.4, max: 112.0) [2024-03-21 06:56:35,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 06:56:39,128][04017] Updated weights for policy 0, policy_version 41955 (0.0011) [2024-03-21 06:56:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 48059.9, 300 sec: 45097.7). Total num frames: 1374814208. Throughput: 0: 45528.9. Samples: 1376076300. 
Policy #0 lag: (min: 5.0, avg: 54.4, max: 112.0) [2024-03-21 06:56:40,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 06:56:45,521][03784] Fps is (10 sec: 39322.4, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1375010816. Throughput: 0: 45540.0. Samples: 1376221300. Policy #0 lag: (min: 5.0, avg: 54.4, max: 112.0) [2024-03-21 06:56:45,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 06:56:47,796][04017] Updated weights for policy 0, policy_version 41965 (0.0016) [2024-03-21 06:56:50,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 1375141888. Throughput: 0: 46106.8. Samples: 1376512500. Policy #0 lag: (min: 1.0, avg: 32.0, max: 73.0) [2024-03-21 06:56:50,522][03784] Avg episode reward: [(0, '1.353')] [2024-03-21 06:56:54,039][04017] Updated weights for policy 0, policy_version 41975 (0.0013) [2024-03-21 06:56:55,521][03784] Fps is (10 sec: 45875.4, 60 sec: 43690.8, 300 sec: 45764.1). Total num frames: 1375469568. Throughput: 0: 45880.1. Samples: 1376763400. Policy #0 lag: (min: 1.0, avg: 32.0, max: 73.0) [2024-03-21 06:56:55,522][03784] Avg episode reward: [(0, '1.165')] [2024-03-21 06:57:00,521][03784] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 45653.0). Total num frames: 1375633408. Throughput: 0: 45826.6. Samples: 1376912000. Policy #0 lag: (min: 1.0, avg: 32.0, max: 73.0) [2024-03-21 06:57:00,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 06:57:02,911][04017] Updated weights for policy 0, policy_version 41985 (0.0022) [2024-03-21 06:57:03,734][03995] Signal inference workers to stop experience collection... (27700 times) [2024-03-21 06:57:03,755][04017] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-03-21 06:57:03,967][03995] Signal inference workers to resume experience collection... 
(27700 times) [2024-03-21 06:57:03,968][04017] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-03-21 06:57:05,521][03784] Fps is (10 sec: 52428.2, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1375993856. Throughput: 0: 45759.9. Samples: 1377173400. Policy #0 lag: (min: 1.0, avg: 32.0, max: 73.0) [2024-03-21 06:57:05,522][03784] Avg episode reward: [(0, '1.253')] [2024-03-21 06:57:07,796][04017] Updated weights for policy 0, policy_version 41995 (0.0013) [2024-03-21 06:57:10,521][03784] Fps is (10 sec: 55705.6, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1376190464. Throughput: 0: 46191.1. Samples: 1377461100. Policy #0 lag: (min: 1.0, avg: 32.0, max: 73.0) [2024-03-21 06:57:10,522][03784] Avg episode reward: [(0, '1.253')] [2024-03-21 06:57:15,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 1376288768. Throughput: 0: 46188.9. Samples: 1377614500. Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 06:57:15,522][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 06:57:19,951][04017] Updated weights for policy 0, policy_version 42005 (0.0011) [2024-03-21 06:57:20,521][03784] Fps is (10 sec: 26214.1, 60 sec: 43144.4, 300 sec: 44653.3). Total num frames: 1376452608. Throughput: 0: 46617.8. Samples: 1377892100. Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 06:57:20,522][03784] Avg episode reward: [(0, '0.926')] [2024-03-21 06:57:24,838][04017] Updated weights for policy 0, policy_version 42015 (0.0012) [2024-03-21 06:57:25,521][03784] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1376813056. Throughput: 0: 46391.1. Samples: 1378163900. 
Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 06:57:25,522][03784] Avg episode reward: [(0, '1.340')] [2024-03-21 06:57:28,038][04017] Updated weights for policy 0, policy_version 42025 (0.0011) [2024-03-21 06:57:30,521][03784] Fps is (10 sec: 68813.8, 60 sec: 47513.7, 300 sec: 45097.6). Total num frames: 1377140736. Throughput: 0: 45891.1. Samples: 1378286400. Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 06:57:30,522][03784] Avg episode reward: [(0, '1.340')] [2024-03-21 06:57:33,943][04017] Updated weights for policy 0, policy_version 42035 (0.0012) [2024-03-21 06:57:35,521][03784] Fps is (10 sec: 65536.3, 60 sec: 47513.8, 300 sec: 45319.8). Total num frames: 1377468416. Throughput: 0: 45177.8. Samples: 1378545500. Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 06:57:35,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 06:57:40,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1377533952. Throughput: 0: 46613.3. Samples: 1378861000. Policy #0 lag: (min: 0.0, avg: 45.8, max: 114.0) [2024-03-21 06:57:40,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 06:57:43,031][04017] Updated weights for policy 0, policy_version 42045 (0.0011) [2024-03-21 06:57:45,521][03784] Fps is (10 sec: 45874.7, 60 sec: 48605.8, 300 sec: 45986.3). Total num frames: 1377927168. Throughput: 0: 46264.4. Samples: 1378993900. Policy #0 lag: (min: 1.0, avg: 49.4, max: 125.0) [2024-03-21 06:57:45,522][03784] Avg episode reward: [(0, '0.501')] [2024-03-21 06:57:47,707][04017] Updated weights for policy 0, policy_version 42055 (0.0012) [2024-03-21 06:57:50,521][03784] Fps is (10 sec: 62258.9, 60 sec: 50244.3, 300 sec: 45986.3). Total num frames: 1378156544. Throughput: 0: 46406.7. Samples: 1379261700. 
Policy #0 lag: (min: 1.0, avg: 49.4, max: 125.0) [2024-03-21 06:57:50,522][03784] Avg episode reward: [(0, '0.501')] [2024-03-21 06:57:55,521][03784] Fps is (10 sec: 29491.3, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 1378222080. Throughput: 0: 46608.9. Samples: 1379558500. Policy #0 lag: (min: 1.0, avg: 49.4, max: 125.0) [2024-03-21 06:57:55,522][03784] Avg episode reward: [(0, '1.463')] [2024-03-21 06:57:56,772][03995] Signal inference workers to stop experience collection... (27750 times) [2024-03-21 06:57:56,773][03995] Signal inference workers to resume experience collection... (27750 times) [2024-03-21 06:57:56,843][04017] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-03-21 06:57:56,844][04017] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-03-21 06:58:00,521][03784] Fps is (10 sec: 19660.9, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 1378353152. Throughput: 0: 46284.5. Samples: 1379697300. Policy #0 lag: (min: 1.0, avg: 49.4, max: 125.0) [2024-03-21 06:58:00,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 06:58:00,565][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042065_1378385920.pth... [2024-03-21 06:58:00,588][04017] Updated weights for policy 0, policy_version 42065 (0.0012) [2024-03-21 06:58:00,699][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041737_1367638016.pth [2024-03-21 06:58:05,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43144.5, 300 sec: 45097.6). Total num frames: 1378582528. Throughput: 0: 45788.9. Samples: 1379952600. Policy #0 lag: (min: 1.0, avg: 49.4, max: 125.0) [2024-03-21 06:58:05,522][03784] Avg episode reward: [(0, '1.756')] [2024-03-21 06:58:07,479][04017] Updated weights for policy 0, policy_version 42075 (0.0014) [2024-03-21 06:58:10,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1378844672. Throughput: 0: 46060.0. 
Samples: 1380236600. Policy #0 lag: (min: 0.0, avg: 31.6, max: 74.0) [2024-03-21 06:58:10,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 06:58:14,267][04017] Updated weights for policy 0, policy_version 42085 (0.0011) [2024-03-21 06:58:15,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 44764.4). Total num frames: 1379106816. Throughput: 0: 46540.0. Samples: 1380380700. Policy #0 lag: (min: 0.0, avg: 31.6, max: 74.0) [2024-03-21 06:58:15,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 06:58:20,521][03784] Fps is (10 sec: 45874.8, 60 sec: 47513.6, 300 sec: 44431.2). Total num frames: 1379303424. Throughput: 0: 46708.7. Samples: 1380647400. Policy #0 lag: (min: 0.0, avg: 31.6, max: 74.0) [2024-03-21 06:58:20,522][03784] Avg episode reward: [(0, '1.406')] [2024-03-21 06:58:22,010][04017] Updated weights for policy 0, policy_version 42095 (0.0015) [2024-03-21 06:58:25,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1379565568. Throughput: 0: 45260.0. Samples: 1380897700. Policy #0 lag: (min: 0.0, avg: 31.6, max: 74.0) [2024-03-21 06:58:25,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 06:58:27,710][04017] Updated weights for policy 0, policy_version 42105 (0.0015) [2024-03-21 06:58:30,521][03784] Fps is (10 sec: 65535.6, 60 sec: 46967.3, 300 sec: 45541.9). Total num frames: 1379958784. Throughput: 0: 45286.6. Samples: 1381031800. Policy #0 lag: (min: 0.0, avg: 31.6, max: 74.0) [2024-03-21 06:58:30,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 06:58:31,187][04017] Updated weights for policy 0, policy_version 42115 (0.0012) [2024-03-21 06:58:35,521][03784] Fps is (10 sec: 55705.2, 60 sec: 44236.7, 300 sec: 45653.0). Total num frames: 1380122624. Throughput: 0: 45348.9. Samples: 1381302400. 
Policy #0 lag: (min: 0.0, avg: 31.6, max: 74.0) [2024-03-21 06:58:35,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 06:58:39,342][04017] Updated weights for policy 0, policy_version 42125 (0.0010) [2024-03-21 06:58:40,521][03784] Fps is (10 sec: 42599.1, 60 sec: 47513.6, 300 sec: 46097.3). Total num frames: 1380384768. Throughput: 0: 44911.1. Samples: 1381579500. Policy #0 lag: (min: 0.0, avg: 39.5, max: 75.0) [2024-03-21 06:58:40,522][03784] Avg episode reward: [(0, '1.770')] [2024-03-21 06:58:45,521][03784] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 45653.1). Total num frames: 1380548608. Throughput: 0: 45088.9. Samples: 1381726300. Policy #0 lag: (min: 0.0, avg: 39.5, max: 75.0) [2024-03-21 06:58:45,522][03784] Avg episode reward: [(0, '1.568')] [2024-03-21 06:58:48,801][04017] Updated weights for policy 0, policy_version 42135 (0.0011) [2024-03-21 06:58:49,487][03995] Signal inference workers to stop experience collection... (27800 times) [2024-03-21 06:58:49,545][04017] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-03-21 06:58:49,553][03995] Signal inference workers to resume experience collection... (27800 times) [2024-03-21 06:58:49,607][04017] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-03-21 06:58:50,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1380810752. Throughput: 0: 45491.2. Samples: 1381999700. Policy #0 lag: (min: 0.0, avg: 39.5, max: 75.0) [2024-03-21 06:58:50,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 06:58:55,389][04017] Updated weights for policy 0, policy_version 42145 (0.0013) [2024-03-21 06:58:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1381007360. Throughput: 0: 45117.8. Samples: 1382266900. 
Policy #0 lag: (min: 0.0, avg: 39.5, max: 75.0) [2024-03-21 06:58:55,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 06:59:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46967.4, 300 sec: 44986.6). Total num frames: 1381171200. Throughput: 0: 44757.7. Samples: 1382394800. Policy #0 lag: (min: 0.0, avg: 39.5, max: 75.0) [2024-03-21 06:59:00,522][03784] Avg episode reward: [(0, '1.359')] [2024-03-21 06:59:02,856][04017] Updated weights for policy 0, policy_version 42155 (0.0013) [2024-03-21 06:59:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 48059.7, 300 sec: 45097.6). Total num frames: 1381466112. Throughput: 0: 44873.4. Samples: 1382666700. Policy #0 lag: (min: 2.0, avg: 37.9, max: 76.0) [2024-03-21 06:59:05,522][03784] Avg episode reward: [(0, '1.707')] [2024-03-21 06:59:10,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1381564416. Throughput: 0: 45806.6. Samples: 1382959000. Policy #0 lag: (min: 2.0, avg: 37.9, max: 76.0) [2024-03-21 06:59:10,522][03784] Avg episode reward: [(0, '0.798')] [2024-03-21 06:59:13,982][04017] Updated weights for policy 0, policy_version 42165 (0.0017) [2024-03-21 06:59:15,521][03784] Fps is (10 sec: 26214.8, 60 sec: 43690.8, 300 sec: 44653.4). Total num frames: 1381728256. Throughput: 0: 46220.3. Samples: 1383111700. Policy #0 lag: (min: 2.0, avg: 37.9, max: 76.0) [2024-03-21 06:59:15,522][03784] Avg episode reward: [(0, '1.315')] [2024-03-21 06:59:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 45097.7). Total num frames: 1381957632. Throughput: 0: 45917.8. Samples: 1383368700. 
Policy #0 lag: (min: 2.0, avg: 37.9, max: 76.0) [2024-03-21 06:59:20,522][03784] Avg episode reward: [(0, '1.020')] [2024-03-21 06:59:20,687][04017] Updated weights for policy 0, policy_version 42175 (0.0013) [2024-03-21 06:59:24,054][04017] Updated weights for policy 0, policy_version 42185 (0.0013) [2024-03-21 06:59:25,521][03784] Fps is (10 sec: 62258.7, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1382350848. Throughput: 0: 45588.9. Samples: 1383631000. Policy #0 lag: (min: 2.0, avg: 37.9, max: 76.0) [2024-03-21 06:59:25,522][03784] Avg episode reward: [(0, '0.818')] [2024-03-21 06:59:28,363][04017] Updated weights for policy 0, policy_version 42195 (0.0020) [2024-03-21 06:59:30,521][03784] Fps is (10 sec: 72089.9, 60 sec: 45329.2, 300 sec: 46430.6). Total num frames: 1382678528. Throughput: 0: 44897.7. Samples: 1383746700. Policy #0 lag: (min: 2.0, avg: 37.9, max: 76.0) [2024-03-21 06:59:30,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 06:59:35,521][03784] Fps is (10 sec: 58981.9, 60 sec: 46967.5, 300 sec: 46652.7). Total num frames: 1382940672. Throughput: 0: 44982.2. Samples: 1384023900. Policy #0 lag: (min: 0.0, avg: 38.7, max: 68.0) [2024-03-21 06:59:35,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 06:59:37,722][04017] Updated weights for policy 0, policy_version 42205 (0.0018) [2024-03-21 06:59:37,762][03995] Signal inference workers to stop experience collection... (27850 times) [2024-03-21 06:59:37,841][04017] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-03-21 06:59:38,054][03995] Signal inference workers to resume experience collection... (27850 times) [2024-03-21 06:59:38,055][04017] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-03-21 06:59:40,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 1383104512. Throughput: 0: 44571.1. Samples: 1384272600. 
Policy #0 lag: (min: 0.0, avg: 38.7, max: 68.0) [2024-03-21 06:59:40,522][03784] Avg episode reward: [(0, '0.960')] [2024-03-21 06:59:45,521][03784] Fps is (10 sec: 22937.9, 60 sec: 43690.7, 300 sec: 45208.8). Total num frames: 1383170048. Throughput: 0: 45102.4. Samples: 1384424400. Policy #0 lag: (min: 0.0, avg: 38.7, max: 68.0) [2024-03-21 06:59:45,521][03784] Avg episode reward: [(0, '1.042')] [2024-03-21 06:59:48,008][04017] Updated weights for policy 0, policy_version 42215 (0.0015) [2024-03-21 06:59:50,521][03784] Fps is (10 sec: 32767.7, 60 sec: 43690.6, 300 sec: 44542.2). Total num frames: 1383432192. Throughput: 0: 45115.5. Samples: 1384696900. Policy #0 lag: (min: 0.0, avg: 38.7, max: 68.0) [2024-03-21 06:59:50,522][03784] Avg episode reward: [(0, '0.955')] [2024-03-21 06:59:54,109][04017] Updated weights for policy 0, policy_version 42225 (0.0016) [2024-03-21 06:59:55,521][03784] Fps is (10 sec: 58981.9, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 1383759872. Throughput: 0: 44231.1. Samples: 1384949400. Policy #0 lag: (min: 0.0, avg: 38.7, max: 68.0) [2024-03-21 06:59:55,522][03784] Avg episode reward: [(0, '1.082')] [2024-03-21 07:00:00,521][03784] Fps is (10 sec: 49152.3, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1383923712. Throughput: 0: 44157.6. Samples: 1385098800. Policy #0 lag: (min: 0.0, avg: 42.4, max: 84.0) [2024-03-21 07:00:00,522][03784] Avg episode reward: [(0, '0.655')] [2024-03-21 07:00:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042234_1383923712.pth... [2024-03-21 07:00:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000041903_1373077504.pth [2024-03-21 07:00:01,317][04017] Updated weights for policy 0, policy_version 42235 (0.0019) [2024-03-21 07:00:05,521][03784] Fps is (10 sec: 29491.0, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 1384054784. Throughput: 0: 45193.3. Samples: 1385402400. 
Policy #0 lag: (min: 0.0, avg: 42.4, max: 84.0) [2024-03-21 07:00:05,522][03784] Avg episode reward: [(0, '0.679')] [2024-03-21 07:00:10,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44782.9, 300 sec: 44654.2). Total num frames: 1384251392. Throughput: 0: 45637.7. Samples: 1385684700. Policy #0 lag: (min: 0.0, avg: 42.4, max: 84.0) [2024-03-21 07:00:10,522][03784] Avg episode reward: [(0, '1.536')] [2024-03-21 07:00:10,814][04017] Updated weights for policy 0, policy_version 42245 (0.0012) [2024-03-21 07:00:14,263][04017] Updated weights for policy 0, policy_version 42255 (0.0012) [2024-03-21 07:00:15,521][03784] Fps is (10 sec: 62259.7, 60 sec: 49151.9, 300 sec: 45764.1). Total num frames: 1384677376. Throughput: 0: 45520.0. Samples: 1385795100. Policy #0 lag: (min: 0.0, avg: 42.4, max: 84.0) [2024-03-21 07:00:15,522][03784] Avg episode reward: [(0, '1.169')] [2024-03-21 07:00:20,521][03784] Fps is (10 sec: 62259.2, 60 sec: 48605.8, 300 sec: 46430.6). Total num frames: 1384873984. Throughput: 0: 44997.8. Samples: 1386048800. Policy #0 lag: (min: 0.0, avg: 42.4, max: 84.0) [2024-03-21 07:00:20,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 07:00:21,311][04017] Updated weights for policy 0, policy_version 42265 (0.0011) [2024-03-21 07:00:25,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45875.1, 300 sec: 46541.7). Total num frames: 1385103360. Throughput: 0: 45073.2. Samples: 1386300900. Policy #0 lag: (min: 0.0, avg: 42.4, max: 84.0) [2024-03-21 07:00:25,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 07:00:30,225][04017] Updated weights for policy 0, policy_version 42275 (0.0011) [2024-03-21 07:00:30,527][03784] Fps is (10 sec: 39299.7, 60 sec: 43140.5, 300 sec: 46429.7). Total num frames: 1385267200. Throughput: 0: 44643.3. Samples: 1386433600. 
Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 07:00:30,527][03784] Avg episode reward: [(0, '0.537')] [2024-03-21 07:00:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 45986.3). Total num frames: 1385496576. Throughput: 0: 44737.8. Samples: 1386710100. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 07:00:35,522][03784] Avg episode reward: [(0, '1.392')] [2024-03-21 07:00:36,103][03995] Signal inference workers to stop experience collection... (27900 times) [2024-03-21 07:00:36,104][03995] Signal inference workers to resume experience collection... (27900 times) [2024-03-21 07:00:36,176][04017] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-03-21 07:00:36,176][04017] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-03-21 07:00:36,471][04017] Updated weights for policy 0, policy_version 42285 (0.0017) [2024-03-21 07:00:40,521][03784] Fps is (10 sec: 36065.0, 60 sec: 42052.3, 300 sec: 45319.8). Total num frames: 1385627648. Throughput: 0: 45964.5. Samples: 1387017800. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 07:00:40,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 07:00:45,339][04017] Updated weights for policy 0, policy_version 42295 (0.0010) [2024-03-21 07:00:45,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 1385922560. Throughput: 0: 45604.5. Samples: 1387151000. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 07:00:45,522][03784] Avg episode reward: [(0, '1.554')] [2024-03-21 07:00:50,521][03784] Fps is (10 sec: 49151.7, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1386119168. Throughput: 0: 45275.6. Samples: 1387439800. 
Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 07:00:50,522][03784] Avg episode reward: [(0, '1.554')] [2024-03-21 07:00:51,900][04017] Updated weights for policy 0, policy_version 42305 (0.0012) [2024-03-21 07:00:55,521][03784] Fps is (10 sec: 36044.7, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 1386283008. Throughput: 0: 45124.5. Samples: 1387715300. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 07:00:55,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 07:01:00,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 44875.5). Total num frames: 1386512384. Throughput: 0: 45806.6. Samples: 1387856400. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 07:01:00,522][03784] Avg episode reward: [(0, '0.645')] [2024-03-21 07:01:01,446][04017] Updated weights for policy 0, policy_version 42315 (0.0012) [2024-03-21 07:01:05,522][03784] Fps is (10 sec: 49150.4, 60 sec: 45328.9, 300 sec: 45208.7). Total num frames: 1386774528. Throughput: 0: 45875.3. Samples: 1388113200. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 07:01:05,523][03784] Avg episode reward: [(0, '1.386')] [2024-03-21 07:01:06,910][04017] Updated weights for policy 0, policy_version 42325 (0.0011) [2024-03-21 07:01:10,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1387003904. Throughput: 0: 46173.4. Samples: 1388378700. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 07:01:10,523][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 07:01:12,923][04017] Updated weights for policy 0, policy_version 42335 (0.0014) [2024-03-21 07:01:15,521][03784] Fps is (10 sec: 68814.8, 60 sec: 46421.3, 300 sec: 46097.3). Total num frames: 1387462656. Throughput: 0: 46123.5. Samples: 1388508900. 
Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 07:01:15,522][03784] Avg episode reward: [(0, '0.647')] [2024-03-21 07:01:16,733][04017] Updated weights for policy 0, policy_version 42345 (0.0010) [2024-03-21 07:01:20,521][03784] Fps is (10 sec: 81919.8, 60 sec: 49152.0, 300 sec: 46208.4). Total num frames: 1387823104. Throughput: 0: 45393.3. Samples: 1388752800. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 07:01:20,522][03784] Avg episode reward: [(0, '1.193')] [2024-03-21 07:01:21,328][04017] Updated weights for policy 0, policy_version 42355 (0.0013) [2024-03-21 07:01:25,521][03784] Fps is (10 sec: 55705.6, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 1388019712. Throughput: 0: 44902.2. Samples: 1389038400. Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 07:01:25,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 07:01:30,521][03784] Fps is (10 sec: 22937.8, 60 sec: 46425.7, 300 sec: 45542.0). Total num frames: 1388052480. Throughput: 0: 45253.3. Samples: 1389187400. Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 07:01:30,522][03784] Avg episode reward: [(0, '1.461')] [2024-03-21 07:01:35,521][03784] Fps is (10 sec: 3276.8, 60 sec: 42598.4, 300 sec: 44875.5). Total num frames: 1388052480. Throughput: 0: 45233.4. Samples: 1389475300. Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 07:01:35,522][03784] Avg episode reward: [(0, '1.609')] [2024-03-21 07:01:35,986][03995] Signal inference workers to stop experience collection... (27950 times) [2024-03-21 07:01:36,036][04017] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-03-21 07:01:36,271][03995] Signal inference workers to resume experience collection... 
(27950 times) [2024-03-21 07:01:36,272][04017] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-03-21 07:01:40,456][04017] Updated weights for policy 0, policy_version 42365 (0.0011) [2024-03-21 07:01:40,521][03784] Fps is (10 sec: 16383.8, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 1388216320. Throughput: 0: 45028.8. Samples: 1389741600. Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 07:01:40,522][03784] Avg episode reward: [(0, '1.707')] [2024-03-21 07:01:45,521][03784] Fps is (10 sec: 49151.7, 60 sec: 43690.6, 300 sec: 45430.9). Total num frames: 1388544000. Throughput: 0: 44435.6. Samples: 1389856000. Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 07:01:45,522][03784] Avg episode reward: [(0, '0.665')] [2024-03-21 07:01:45,565][04017] Updated weights for policy 0, policy_version 42375 (0.0012) [2024-03-21 07:01:50,521][03784] Fps is (10 sec: 49152.3, 60 sec: 43144.5, 300 sec: 44875.5). Total num frames: 1388707840. Throughput: 0: 44989.2. Samples: 1390137700. Policy #0 lag: (min: 0.0, avg: 46.4, max: 107.0) [2024-03-21 07:01:50,522][03784] Avg episode reward: [(0, '1.140')] [2024-03-21 07:01:54,336][04017] Updated weights for policy 0, policy_version 42385 (0.0019) [2024-03-21 07:01:55,521][03784] Fps is (10 sec: 42598.7, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1388969984. Throughput: 0: 45015.6. Samples: 1390404400. Policy #0 lag: (min: 0.0, avg: 30.7, max: 68.0) [2024-03-21 07:01:55,522][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 07:01:57,588][04017] Updated weights for policy 0, policy_version 42395 (0.0017) [2024-03-21 07:02:00,521][03784] Fps is (10 sec: 62259.0, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1389330432. Throughput: 0: 44702.2. Samples: 1390520500. 
Policy #0 lag: (min: 0.0, avg: 30.7, max: 68.0) [2024-03-21 07:02:00,522][03784] Avg episode reward: [(0, '0.872')] [2024-03-21 07:02:00,720][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042400_1389363200.pth... [2024-03-21 07:02:00,831][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042065_1378385920.pth [2024-03-21 07:02:02,859][04017] Updated weights for policy 0, policy_version 42405 (0.0021) [2024-03-21 07:02:05,521][03784] Fps is (10 sec: 78643.4, 60 sec: 49698.4, 300 sec: 45986.3). Total num frames: 1389756416. Throughput: 0: 45360.1. Samples: 1390794000. Policy #0 lag: (min: 0.0, avg: 30.7, max: 68.0) [2024-03-21 07:02:05,522][03784] Avg episode reward: [(0, '1.314')] [2024-03-21 07:02:08,793][04017] Updated weights for policy 0, policy_version 42415 (0.0024) [2024-03-21 07:02:10,521][03784] Fps is (10 sec: 58982.7, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 1389920256. Throughput: 0: 45206.7. Samples: 1391072700. Policy #0 lag: (min: 0.0, avg: 30.7, max: 68.0) [2024-03-21 07:02:10,522][03784] Avg episode reward: [(0, '1.420')] [2024-03-21 07:02:15,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 46208.5). Total num frames: 1390084096. Throughput: 0: 45115.6. Samples: 1391217600. Policy #0 lag: (min: 0.0, avg: 30.7, max: 68.0) [2024-03-21 07:02:15,522][03784] Avg episode reward: [(0, '1.433')] [2024-03-21 07:02:17,907][04017] Updated weights for policy 0, policy_version 42425 (0.0014) [2024-03-21 07:02:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 40960.0, 300 sec: 45653.0). Total num frames: 1390280704. Throughput: 0: 44795.5. Samples: 1391491100. Policy #0 lag: (min: 0.0, avg: 30.7, max: 68.0) [2024-03-21 07:02:20,522][03784] Avg episode reward: [(0, '1.392')] [2024-03-21 07:02:25,521][03784] Fps is (10 sec: 29490.9, 60 sec: 39321.6, 300 sec: 44875.5). Total num frames: 1390379008. Throughput: 0: 45493.4. Samples: 1391788800. 
Policy #0 lag: (min: 0.0, avg: 31.6, max: 78.0) [2024-03-21 07:02:25,522][03784] Avg episode reward: [(0, '1.272')] [2024-03-21 07:02:28,536][03995] Signal inference workers to stop experience collection... (28000 times) [2024-03-21 07:02:28,536][03995] Signal inference workers to resume experience collection... (28000 times) [2024-03-21 07:02:28,626][04017] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-03-21 07:02:28,627][04017] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-03-21 07:02:29,284][04017] Updated weights for policy 0, policy_version 42435 (0.0011) [2024-03-21 07:02:30,521][03784] Fps is (10 sec: 32768.1, 60 sec: 42598.4, 300 sec: 44542.3). Total num frames: 1390608384. Throughput: 0: 46280.1. Samples: 1391938600. Policy #0 lag: (min: 0.0, avg: 31.6, max: 78.0) [2024-03-21 07:02:30,522][03784] Avg episode reward: [(0, '0.681')] [2024-03-21 07:02:33,355][04017] Updated weights for policy 0, policy_version 42445 (0.0010) [2024-03-21 07:02:35,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1390870528. Throughput: 0: 45740.0. Samples: 1392196000. Policy #0 lag: (min: 0.0, avg: 31.6, max: 78.0) [2024-03-21 07:02:35,522][03784] Avg episode reward: [(0, '1.177')] [2024-03-21 07:02:40,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48606.0, 300 sec: 44764.4). Total num frames: 1391132672. Throughput: 0: 46106.7. Samples: 1392479200. Policy #0 lag: (min: 0.0, avg: 31.6, max: 78.0) [2024-03-21 07:02:40,522][03784] Avg episode reward: [(0, '0.974')] [2024-03-21 07:02:41,493][04017] Updated weights for policy 0, policy_version 42455 (0.0011) [2024-03-21 07:02:45,521][03784] Fps is (10 sec: 52428.4, 60 sec: 47513.6, 300 sec: 44875.5). Total num frames: 1391394816. Throughput: 0: 46664.4. Samples: 1392620400. 
Policy #0 lag: (min: 0.0, avg: 31.6, max: 78.0) [2024-03-21 07:02:45,522][03784] Avg episode reward: [(0, '0.974')] [2024-03-21 07:02:46,451][04017] Updated weights for policy 0, policy_version 42465 (0.0013) [2024-03-21 07:02:50,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.8, 300 sec: 45319.8). Total num frames: 1391591424. Throughput: 0: 46440.0. Samples: 1392883800. Policy #0 lag: (min: 0.0, avg: 31.6, max: 78.0) [2024-03-21 07:02:50,522][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 07:02:54,664][04017] Updated weights for policy 0, policy_version 42475 (0.0012) [2024-03-21 07:02:55,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48605.8, 300 sec: 45875.2). Total num frames: 1391886336. Throughput: 0: 46297.7. Samples: 1393156100. Policy #0 lag: (min: 0.0, avg: 30.5, max: 68.0) [2024-03-21 07:02:55,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 07:03:00,480][04017] Updated weights for policy 0, policy_version 42485 (0.0016) [2024-03-21 07:03:00,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 1392148480. Throughput: 0: 46062.1. Samples: 1393290400. Policy #0 lag: (min: 0.0, avg: 30.5, max: 68.0) [2024-03-21 07:03:00,522][03784] Avg episode reward: [(0, '0.756')] [2024-03-21 07:03:05,521][03784] Fps is (10 sec: 55705.2, 60 sec: 44782.8, 300 sec: 46097.3). Total num frames: 1392443392. Throughput: 0: 45837.6. Samples: 1393553800. Policy #0 lag: (min: 0.0, avg: 30.5, max: 68.0) [2024-03-21 07:03:05,522][03784] Avg episode reward: [(0, '1.394')] [2024-03-21 07:03:05,822][04017] Updated weights for policy 0, policy_version 42495 (0.0011) [2024-03-21 07:03:08,136][03995] Signal inference workers to stop experience collection... (28050 times) [2024-03-21 07:03:08,195][03995] Signal inference workers to resume experience collection... 
(28050 times) [2024-03-21 07:03:08,230][04017] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-03-21 07:03:08,312][04017] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-03-21 07:03:10,521][03784] Fps is (10 sec: 58982.3, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 1392738304. Throughput: 0: 45204.5. Samples: 1393823000. Policy #0 lag: (min: 0.0, avg: 30.5, max: 68.0) [2024-03-21 07:03:10,522][03784] Avg episode reward: [(0, '1.194')] [2024-03-21 07:03:15,521][03784] Fps is (10 sec: 29491.8, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1392738304. Throughput: 0: 45324.5. Samples: 1393978200. Policy #0 lag: (min: 0.0, avg: 30.5, max: 68.0) [2024-03-21 07:03:15,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 07:03:18,705][04017] Updated weights for policy 0, policy_version 42505 (0.0015) [2024-03-21 07:03:20,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1392967680. Throughput: 0: 46162.2. Samples: 1394273300. Policy #0 lag: (min: 0.0, avg: 42.6, max: 90.0) [2024-03-21 07:03:20,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 07:03:25,521][03784] Fps is (10 sec: 36044.1, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 1393098752. Throughput: 0: 45930.9. Samples: 1394546100. Policy #0 lag: (min: 0.0, avg: 42.6, max: 90.0) [2024-03-21 07:03:25,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 07:03:25,887][04017] Updated weights for policy 0, policy_version 42515 (0.0012) [2024-03-21 07:03:29,312][04017] Updated weights for policy 0, policy_version 42525 (0.0020) [2024-03-21 07:03:30,521][03784] Fps is (10 sec: 58982.0, 60 sec: 49151.9, 300 sec: 45542.0). Total num frames: 1393557504. Throughput: 0: 45617.8. Samples: 1394673200. 
Policy #0 lag: (min: 0.0, avg: 42.6, max: 90.0) [2024-03-21 07:03:30,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 07:03:35,521][03784] Fps is (10 sec: 65537.4, 60 sec: 48059.8, 300 sec: 45319.8). Total num frames: 1393754112. Throughput: 0: 45891.1. Samples: 1394948900. Policy #0 lag: (min: 0.0, avg: 42.6, max: 90.0) [2024-03-21 07:03:35,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 07:03:35,596][04017] Updated weights for policy 0, policy_version 42535 (0.0013) [2024-03-21 07:03:40,521][03784] Fps is (10 sec: 42598.8, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1393983488. Throughput: 0: 45409.0. Samples: 1395199500. Policy #0 lag: (min: 0.0, avg: 42.6, max: 90.0) [2024-03-21 07:03:40,522][03784] Avg episode reward: [(0, '0.986')] [2024-03-21 07:03:42,676][04017] Updated weights for policy 0, policy_version 42545 (0.0012) [2024-03-21 07:03:45,521][03784] Fps is (10 sec: 45874.1, 60 sec: 46967.4, 300 sec: 45430.9). Total num frames: 1394212864. Throughput: 0: 45235.4. Samples: 1395326000. Policy #0 lag: (min: 0.0, avg: 42.6, max: 90.0) [2024-03-21 07:03:45,522][03784] Avg episode reward: [(0, '0.430')] [2024-03-21 07:03:50,480][04017] Updated weights for policy 0, policy_version 42555 (0.0018) [2024-03-21 07:03:50,521][03784] Fps is (10 sec: 45874.6, 60 sec: 47513.5, 300 sec: 45541.9). Total num frames: 1394442240. Throughput: 0: 45444.5. Samples: 1395598800. Policy #0 lag: (min: 1.0, avg: 35.0, max: 73.0) [2024-03-21 07:03:50,522][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 07:03:55,521][03784] Fps is (10 sec: 39322.4, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 1394606080. Throughput: 0: 46113.4. Samples: 1395898100. 
Policy #0 lag: (min: 1.0, avg: 35.0, max: 73.0) [2024-03-21 07:03:55,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 07:04:00,348][04017] Updated weights for policy 0, policy_version 42565 (0.0015) [2024-03-21 07:04:00,521][03784] Fps is (10 sec: 32768.4, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 1394769920. Throughput: 0: 46022.2. Samples: 1396049200. Policy #0 lag: (min: 1.0, avg: 35.0, max: 73.0) [2024-03-21 07:04:00,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 07:04:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042565_1394769920.pth... [2024-03-21 07:04:00,656][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042234_1383923712.pth [2024-03-21 07:04:05,521][03784] Fps is (10 sec: 36044.7, 60 sec: 42052.4, 300 sec: 45430.9). Total num frames: 1394966528. Throughput: 0: 45708.9. Samples: 1396330200. Policy #0 lag: (min: 1.0, avg: 35.0, max: 73.0) [2024-03-21 07:04:05,522][03784] Avg episode reward: [(0, '1.375')] [2024-03-21 07:04:07,040][04017] Updated weights for policy 0, policy_version 42575 (0.0016) [2024-03-21 07:04:09,287][03995] Signal inference workers to stop experience collection... (28100 times) [2024-03-21 07:04:09,287][03995] Signal inference workers to resume experience collection... (28100 times) [2024-03-21 07:04:09,360][04017] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-03-21 07:04:09,361][04017] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-03-21 07:04:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 45653.0). Total num frames: 1395195904. Throughput: 0: 45929.1. Samples: 1396612900. Policy #0 lag: (min: 1.0, avg: 35.0, max: 73.0) [2024-03-21 07:04:10,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 07:04:15,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 45542.0). Total num frames: 1395392512. Throughput: 0: 46400.0. 
Samples: 1396761200. Policy #0 lag: (min: 0.0, avg: 34.6, max: 87.0) [2024-03-21 07:04:15,522][03784] Avg episode reward: [(0, '1.149')] [2024-03-21 07:04:15,692][04017] Updated weights for policy 0, policy_version 42585 (0.0015) [2024-03-21 07:04:19,711][04017] Updated weights for policy 0, policy_version 42595 (0.0016) [2024-03-21 07:04:20,521][03784] Fps is (10 sec: 58982.7, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 1395785728. Throughput: 0: 45922.2. Samples: 1397015400. Policy #0 lag: (min: 0.0, avg: 34.6, max: 87.0) [2024-03-21 07:04:20,522][03784] Avg episode reward: [(0, '0.677')] [2024-03-21 07:04:24,343][04017] Updated weights for policy 0, policy_version 42605 (0.0022) [2024-03-21 07:04:25,521][03784] Fps is (10 sec: 72089.4, 60 sec: 50244.3, 300 sec: 45541.9). Total num frames: 1396113408. Throughput: 0: 45799.9. Samples: 1397260500. Policy #0 lag: (min: 0.0, avg: 34.6, max: 87.0) [2024-03-21 07:04:25,522][03784] Avg episode reward: [(0, '0.665')] [2024-03-21 07:04:30,521][03784] Fps is (10 sec: 52428.6, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1396310016. Throughput: 0: 46100.2. Samples: 1397400500. Policy #0 lag: (min: 0.0, avg: 34.6, max: 87.0) [2024-03-21 07:04:30,522][03784] Avg episode reward: [(0, '0.665')] [2024-03-21 07:04:31,336][04017] Updated weights for policy 0, policy_version 42615 (0.0011) [2024-03-21 07:04:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1396539392. Throughput: 0: 45586.7. Samples: 1397650200. Policy #0 lag: (min: 0.0, avg: 34.6, max: 87.0) [2024-03-21 07:04:35,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 07:04:39,177][04017] Updated weights for policy 0, policy_version 42625 (0.0022) [2024-03-21 07:04:40,521][03784] Fps is (10 sec: 52428.2, 60 sec: 47513.5, 300 sec: 46319.5). Total num frames: 1396834304. Throughput: 0: 44082.1. Samples: 1397881800. 
Policy #0 lag: (min: 0.0, avg: 34.6, max: 87.0) [2024-03-21 07:04:40,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 07:04:45,219][04017] Updated weights for policy 0, policy_version 42635 (0.0013) [2024-03-21 07:04:45,521][03784] Fps is (10 sec: 52429.4, 60 sec: 47513.8, 300 sec: 46208.5). Total num frames: 1397063680. Throughput: 0: 43986.7. Samples: 1398028600. Policy #0 lag: (min: 5.0, avg: 46.8, max: 123.0) [2024-03-21 07:04:45,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 07:04:50,521][03784] Fps is (10 sec: 32768.3, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 1397161984. Throughput: 0: 44435.5. Samples: 1398329800. Policy #0 lag: (min: 5.0, avg: 46.8, max: 123.0) [2024-03-21 07:04:50,522][03784] Avg episode reward: [(0, '1.106')] [2024-03-21 07:04:55,521][03784] Fps is (10 sec: 13107.2, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 1397194752. Throughput: 0: 44766.7. Samples: 1398627400. Policy #0 lag: (min: 5.0, avg: 46.8, max: 123.0) [2024-03-21 07:04:55,522][03784] Avg episode reward: [(0, '1.800')] [2024-03-21 07:05:00,348][04017] Updated weights for policy 0, policy_version 42645 (0.0016) [2024-03-21 07:05:00,521][03784] Fps is (10 sec: 22937.6, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 1397391360. Throughput: 0: 44888.9. Samples: 1398781200. Policy #0 lag: (min: 5.0, avg: 46.8, max: 123.0) [2024-03-21 07:05:00,522][03784] Avg episode reward: [(0, '1.800')] [2024-03-21 07:05:03,157][03995] Signal inference workers to stop experience collection... (28150 times) [2024-03-21 07:05:03,212][04017] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-03-21 07:05:03,401][03995] Signal inference workers to resume experience collection... 
(28150 times) [2024-03-21 07:05:03,402][04017] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-03-21 07:05:04,650][04017] Updated weights for policy 0, policy_version 42655 (0.0012) [2024-03-21 07:05:05,521][03784] Fps is (10 sec: 55705.0, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 1397751808. Throughput: 0: 44315.4. Samples: 1399009600. Policy #0 lag: (min: 5.0, avg: 46.8, max: 123.0) [2024-03-21 07:05:05,522][03784] Avg episode reward: [(0, '0.810')] [2024-03-21 07:05:10,432][04017] Updated weights for policy 0, policy_version 42665 (0.0015) [2024-03-21 07:05:10,521][03784] Fps is (10 sec: 65536.0, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 1398046720. Throughput: 0: 44300.0. Samples: 1399254000. Policy #0 lag: (min: 0.0, avg: 45.0, max: 94.0) [2024-03-21 07:05:10,522][03784] Avg episode reward: [(0, '0.616')] [2024-03-21 07:05:15,521][03784] Fps is (10 sec: 45875.8, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1398210560. Throughput: 0: 44126.7. Samples: 1399386200. Policy #0 lag: (min: 0.0, avg: 45.0, max: 94.0) [2024-03-21 07:05:15,522][03784] Avg episode reward: [(0, '1.582')] [2024-03-21 07:05:18,226][04017] Updated weights for policy 0, policy_version 42675 (0.0011) [2024-03-21 07:05:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1398505472. Throughput: 0: 44922.3. Samples: 1399671700. Policy #0 lag: (min: 0.0, avg: 45.0, max: 94.0) [2024-03-21 07:05:20,522][03784] Avg episode reward: [(0, '1.405')] [2024-03-21 07:05:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 45431.7). Total num frames: 1398669312. Throughput: 0: 45789.0. Samples: 1399942300. 
Policy #0 lag: (min: 0.0, avg: 45.0, max: 94.0) [2024-03-21 07:05:25,522][03784] Avg episode reward: [(0, '1.405')] [2024-03-21 07:05:25,917][04017] Updated weights for policy 0, policy_version 42685 (0.0016) [2024-03-21 07:05:30,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 45430.9). Total num frames: 1398898688. Throughput: 0: 45593.2. Samples: 1400080300. Policy #0 lag: (min: 0.0, avg: 45.0, max: 94.0) [2024-03-21 07:05:30,522][03784] Avg episode reward: [(0, '1.489')] [2024-03-21 07:05:32,246][04017] Updated weights for policy 0, policy_version 42695 (0.0011) [2024-03-21 07:05:35,521][03784] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 45542.0). Total num frames: 1399062528. Throughput: 0: 45035.6. Samples: 1400356400. Policy #0 lag: (min: 0.0, avg: 45.0, max: 94.0) [2024-03-21 07:05:35,522][03784] Avg episode reward: [(0, '1.569')] [2024-03-21 07:05:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 45319.8). Total num frames: 1399291904. Throughput: 0: 44839.9. Samples: 1400645200. Policy #0 lag: (min: 0.0, avg: 51.9, max: 117.0) [2024-03-21 07:05:40,522][03784] Avg episode reward: [(0, '0.583')] [2024-03-21 07:05:41,481][04017] Updated weights for policy 0, policy_version 42705 (0.0016) [2024-03-21 07:05:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 40413.8, 300 sec: 45319.8). Total num frames: 1399488512. Throughput: 0: 44460.1. Samples: 1400781900. Policy #0 lag: (min: 0.0, avg: 51.9, max: 117.0) [2024-03-21 07:05:45,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 07:05:49,829][04017] Updated weights for policy 0, policy_version 42715 (0.0012) [2024-03-21 07:05:50,521][03784] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 45430.9). Total num frames: 1399685120. Throughput: 0: 45257.8. Samples: 1401046200. 
Policy #0 lag: (min: 0.0, avg: 51.9, max: 117.0) [2024-03-21 07:05:50,522][03784] Avg episode reward: [(0, '1.512')] [2024-03-21 07:05:55,290][04017] Updated weights for policy 0, policy_version 42725 (0.0016) [2024-03-21 07:05:55,521][03784] Fps is (10 sec: 52428.1, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1400012800. Throughput: 0: 45453.3. Samples: 1401299400. Policy #0 lag: (min: 0.0, avg: 51.9, max: 117.0) [2024-03-21 07:05:55,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 07:05:57,626][03995] Signal inference workers to stop experience collection... (28200 times) [2024-03-21 07:05:57,627][03995] Signal inference workers to resume experience collection... (28200 times) [2024-03-21 07:05:57,728][04017] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-03-21 07:05:57,728][04017] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-03-21 07:05:59,308][04017] Updated weights for policy 0, policy_version 42735 (0.0013) [2024-03-21 07:06:00,521][03784] Fps is (10 sec: 75365.8, 60 sec: 50790.4, 300 sec: 46319.5). Total num frames: 1400438784. Throughput: 0: 45517.6. Samples: 1401434500. Policy #0 lag: (min: 0.0, avg: 51.9, max: 117.0) [2024-03-21 07:06:00,522][03784] Avg episode reward: [(0, '0.612')] [2024-03-21 07:06:00,935][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042740_1400504320.pth... [2024-03-21 07:06:01,021][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042400_1389363200.pth [2024-03-21 07:06:05,521][03784] Fps is (10 sec: 58983.1, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 1400602624. Throughput: 0: 44706.7. Samples: 1401683500. 
Policy #0 lag: (min: 0.0, avg: 51.9, max: 117.0) [2024-03-21 07:06:05,522][03784] Avg episode reward: [(0, '0.834')] [2024-03-21 07:06:07,128][04017] Updated weights for policy 0, policy_version 42745 (0.0016) [2024-03-21 07:06:10,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1400832000. Throughput: 0: 44862.2. Samples: 1401961100. Policy #0 lag: (min: 0.0, avg: 41.2, max: 76.0) [2024-03-21 07:06:10,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 07:06:15,266][04017] Updated weights for policy 0, policy_version 42755 (0.0021) [2024-03-21 07:06:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46421.3, 300 sec: 44653.4). Total num frames: 1400995840. Throughput: 0: 45091.2. Samples: 1402109400. Policy #0 lag: (min: 0.0, avg: 41.2, max: 76.0) [2024-03-21 07:06:15,522][03784] Avg episode reward: [(0, '1.597')] [2024-03-21 07:06:20,521][03784] Fps is (10 sec: 22937.5, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1401061376. Throughput: 0: 45228.8. Samples: 1402391700. Policy #0 lag: (min: 0.0, avg: 41.2, max: 76.0) [2024-03-21 07:06:20,522][03784] Avg episode reward: [(0, '0.639')] [2024-03-21 07:06:25,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1401290752. Throughput: 0: 44557.9. Samples: 1402650300. Policy #0 lag: (min: 0.0, avg: 41.2, max: 76.0) [2024-03-21 07:06:25,522][03784] Avg episode reward: [(0, '1.560')] [2024-03-21 07:06:25,746][04017] Updated weights for policy 0, policy_version 42765 (0.0019) [2024-03-21 07:06:30,521][03784] Fps is (10 sec: 52429.2, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 1401585664. Throughput: 0: 44484.4. Samples: 1402783700. 
Policy #0 lag: (min: 0.0, avg: 41.2, max: 76.0) [2024-03-21 07:06:30,522][03784] Avg episode reward: [(0, '0.820')] [2024-03-21 07:06:34,844][04017] Updated weights for policy 0, policy_version 42775 (0.0014) [2024-03-21 07:06:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 45542.0). Total num frames: 1401651200. Throughput: 0: 45095.6. Samples: 1403075500. Policy #0 lag: (min: 0.0, avg: 35.5, max: 86.0) [2024-03-21 07:06:35,522][03784] Avg episode reward: [(0, '1.414')] [2024-03-21 07:06:40,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1401913344. Throughput: 0: 45624.6. Samples: 1403352500. Policy #0 lag: (min: 0.0, avg: 35.5, max: 86.0) [2024-03-21 07:06:40,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 07:06:40,901][04017] Updated weights for policy 0, policy_version 42785 (0.0014) [2024-03-21 07:06:45,521][03784] Fps is (10 sec: 58981.9, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 1402241024. Throughput: 0: 45704.5. Samples: 1403491200. Policy #0 lag: (min: 0.0, avg: 35.5, max: 86.0) [2024-03-21 07:06:45,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 07:06:46,166][04017] Updated weights for policy 0, policy_version 42795 (0.0010) [2024-03-21 07:06:47,650][03995] Signal inference workers to stop experience collection... (28250 times) [2024-03-21 07:06:47,738][04017] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-03-21 07:06:47,887][03995] Signal inference workers to resume experience collection... (28250 times) [2024-03-21 07:06:47,888][04017] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-03-21 07:06:50,521][03784] Fps is (10 sec: 65535.8, 60 sec: 48059.7, 300 sec: 46097.4). Total num frames: 1402568704. Throughput: 0: 46228.9. Samples: 1403763800. 
Policy #0 lag: (min: 0.0, avg: 35.5, max: 86.0) [2024-03-21 07:06:50,522][03784] Avg episode reward: [(0, '1.360')] [2024-03-21 07:06:51,977][04017] Updated weights for policy 0, policy_version 42805 (0.0027) [2024-03-21 07:06:55,521][03784] Fps is (10 sec: 62259.6, 60 sec: 47513.7, 300 sec: 45875.2). Total num frames: 1402863616. Throughput: 0: 45673.4. Samples: 1404016400. Policy #0 lag: (min: 0.0, avg: 35.5, max: 86.0) [2024-03-21 07:06:55,522][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 07:06:57,420][04017] Updated weights for policy 0, policy_version 42815 (0.0015) [2024-03-21 07:07:00,521][03784] Fps is (10 sec: 55705.3, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1403125760. Throughput: 0: 45442.2. Samples: 1404154300. Policy #0 lag: (min: 0.0, avg: 35.5, max: 86.0) [2024-03-21 07:07:00,522][03784] Avg episode reward: [(0, '1.205')] [2024-03-21 07:07:05,521][03784] Fps is (10 sec: 32767.9, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 1403191296. Throughput: 0: 45804.5. Samples: 1404452900. Policy #0 lag: (min: 0.0, avg: 54.7, max: 118.0) [2024-03-21 07:07:05,522][03784] Avg episode reward: [(0, '1.518')] [2024-03-21 07:07:06,317][04017] Updated weights for policy 0, policy_version 42825 (0.0011) [2024-03-21 07:07:10,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42598.4, 300 sec: 45097.6). Total num frames: 1403387904. Throughput: 0: 46095.5. Samples: 1404724600. Policy #0 lag: (min: 0.0, avg: 54.7, max: 118.0) [2024-03-21 07:07:10,522][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 07:07:14,320][04017] Updated weights for policy 0, policy_version 42835 (0.0017) [2024-03-21 07:07:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1403650048. Throughput: 0: 46237.7. Samples: 1404864400. 
Policy #0 lag: (min: 0.0, avg: 54.7, max: 118.0) [2024-03-21 07:07:15,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 07:07:19,956][04017] Updated weights for policy 0, policy_version 42845 (0.0016) [2024-03-21 07:07:20,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48059.8, 300 sec: 45986.3). Total num frames: 1403944960. Throughput: 0: 45837.7. Samples: 1405138200. Policy #0 lag: (min: 0.0, avg: 54.7, max: 118.0) [2024-03-21 07:07:20,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 07:07:25,521][03784] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1404043264. Throughput: 0: 46688.8. Samples: 1405453500. Policy #0 lag: (min: 0.0, avg: 54.7, max: 118.0) [2024-03-21 07:07:25,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 07:07:30,013][04017] Updated weights for policy 0, policy_version 42855 (0.0018) [2024-03-21 07:07:30,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 1404305408. Throughput: 0: 46977.8. Samples: 1405605200. Policy #0 lag: (min: 0.0, avg: 54.7, max: 118.0) [2024-03-21 07:07:30,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 07:07:34,695][04017] Updated weights for policy 0, policy_version 42865 (0.0026) [2024-03-21 07:07:35,521][03784] Fps is (10 sec: 55705.4, 60 sec: 49151.9, 300 sec: 45653.0). Total num frames: 1404600320. Throughput: 0: 46628.8. Samples: 1405862100. Policy #0 lag: (min: 1.0, avg: 38.8, max: 79.0) [2024-03-21 07:07:35,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 07:07:40,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48605.8, 300 sec: 45542.0). Total num frames: 1404829696. Throughput: 0: 47186.6. Samples: 1406139800. 
Policy #0 lag: (min: 1.0, avg: 38.8, max: 79.0) [2024-03-21 07:07:40,522][03784] Avg episode reward: [(0, '1.626')] [2024-03-21 07:07:42,173][04017] Updated weights for policy 0, policy_version 42875 (0.0020) [2024-03-21 07:07:45,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1404960768. Throughput: 0: 46942.1. Samples: 1406266700. Policy #0 lag: (min: 1.0, avg: 38.8, max: 79.0) [2024-03-21 07:07:45,522][03784] Avg episode reward: [(0, '1.485')] [2024-03-21 07:07:49,419][03995] Signal inference workers to stop experience collection... (28300 times) [2024-03-21 07:07:49,479][03995] Signal inference workers to resume experience collection... (28300 times) [2024-03-21 07:07:49,486][04017] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-03-21 07:07:49,530][04017] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-03-21 07:07:50,508][04017] Updated weights for policy 0, policy_version 42885 (0.0017) [2024-03-21 07:07:50,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1405255680. Throughput: 0: 45973.3. Samples: 1406521700. Policy #0 lag: (min: 1.0, avg: 38.8, max: 79.0) [2024-03-21 07:07:50,522][03784] Avg episode reward: [(0, '1.498')] [2024-03-21 07:07:55,521][03784] Fps is (10 sec: 55706.5, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1405517824. Throughput: 0: 45480.1. Samples: 1406771200. Policy #0 lag: (min: 1.0, avg: 38.8, max: 79.0) [2024-03-21 07:07:55,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 07:07:57,328][04017] Updated weights for policy 0, policy_version 42895 (0.0010) [2024-03-21 07:08:00,521][03784] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 1405747200. Throughput: 0: 45860.0. Samples: 1406928100. 
Policy #0 lag: (min: 2.0, avg: 33.4, max: 75.0) [2024-03-21 07:08:00,522][03784] Avg episode reward: [(0, '0.792')] [2024-03-21 07:08:00,708][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042901_1405779968.pth... [2024-03-21 07:08:00,871][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042565_1394769920.pth [2024-03-21 07:08:02,122][04017] Updated weights for policy 0, policy_version 42905 (0.0021) [2024-03-21 07:08:05,521][03784] Fps is (10 sec: 55705.7, 60 sec: 48059.8, 300 sec: 45208.7). Total num frames: 1406074880. Throughput: 0: 45180.1. Samples: 1407171300. Policy #0 lag: (min: 2.0, avg: 33.4, max: 75.0) [2024-03-21 07:08:05,522][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 07:08:09,717][04017] Updated weights for policy 0, policy_version 42915 (0.0010) [2024-03-21 07:08:10,521][03784] Fps is (10 sec: 55705.4, 60 sec: 48605.8, 300 sec: 45986.3). Total num frames: 1406304256. Throughput: 0: 44619.9. Samples: 1407461400. Policy #0 lag: (min: 2.0, avg: 33.4, max: 75.0) [2024-03-21 07:08:10,522][03784] Avg episode reward: [(0, '1.124')] [2024-03-21 07:08:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1406435328. Throughput: 0: 44351.1. Samples: 1407601000. Policy #0 lag: (min: 2.0, avg: 33.4, max: 75.0) [2024-03-21 07:08:15,522][03784] Avg episode reward: [(0, '1.008')] [2024-03-21 07:08:18,849][04017] Updated weights for policy 0, policy_version 42925 (0.0013) [2024-03-21 07:08:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 1406664704. Throughput: 0: 44906.6. Samples: 1407882900. Policy #0 lag: (min: 2.0, avg: 33.4, max: 75.0) [2024-03-21 07:08:20,522][03784] Avg episode reward: [(0, '0.958')] [2024-03-21 07:08:23,929][04017] Updated weights for policy 0, policy_version 42935 (0.0011) [2024-03-21 07:08:25,521][03784] Fps is (10 sec: 58982.6, 60 sec: 49698.2, 300 sec: 45653.1). 
Total num frames: 1407025152. Throughput: 0: 43986.7. Samples: 1408119200. Policy #0 lag: (min: 2.0, avg: 33.4, max: 75.0) [2024-03-21 07:08:25,522][03784] Avg episode reward: [(0, '0.495')] [2024-03-21 07:08:30,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1407123456. Throughput: 0: 44542.3. Samples: 1408271100. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:08:30,522][03784] Avg episode reward: [(0, '1.485')] [2024-03-21 07:08:34,062][04017] Updated weights for policy 0, policy_version 42945 (0.0010) [2024-03-21 07:08:35,521][03784] Fps is (10 sec: 22937.8, 60 sec: 44236.9, 300 sec: 44986.6). Total num frames: 1407254528. Throughput: 0: 45291.3. Samples: 1408559800. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:08:35,522][03784] Avg episode reward: [(0, '1.314')] [2024-03-21 07:08:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 1407516672. Throughput: 0: 45359.9. Samples: 1408812400. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:08:40,522][03784] Avg episode reward: [(0, '1.592')] [2024-03-21 07:08:44,319][04017] Updated weights for policy 0, policy_version 42955 (0.0016) [2024-03-21 07:08:45,282][03995] Signal inference workers to stop experience collection... (28350 times) [2024-03-21 07:08:45,393][04017] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-03-21 07:08:45,483][03995] Signal inference workers to resume experience collection... (28350 times) [2024-03-21 07:08:45,484][04017] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-03-21 07:08:45,521][03784] Fps is (10 sec: 42597.1, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 1407680512. Throughput: 0: 45031.0. Samples: 1408954500. 
Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:08:45,523][03784] Avg episode reward: [(0, '0.642')] [2024-03-21 07:08:50,521][03784] Fps is (10 sec: 26214.3, 60 sec: 42052.3, 300 sec: 44653.3). Total num frames: 1407778816. Throughput: 0: 45779.9. Samples: 1409231400. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:08:50,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 07:08:51,716][04017] Updated weights for policy 0, policy_version 42965 (0.0011) [2024-03-21 07:08:55,521][03784] Fps is (10 sec: 45876.3, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1408139264. Throughput: 0: 45309.0. Samples: 1409500300. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:08:55,522][03784] Avg episode reward: [(0, '1.075')] [2024-03-21 07:08:56,648][04017] Updated weights for policy 0, policy_version 42975 (0.0022) [2024-03-21 07:09:00,521][03784] Fps is (10 sec: 72089.9, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 1408499712. Throughput: 0: 45146.6. Samples: 1409632600. Policy #0 lag: (min: 1.0, avg: 31.5, max: 61.0) [2024-03-21 07:09:00,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 07:09:02,805][04017] Updated weights for policy 0, policy_version 42985 (0.0015) [2024-03-21 07:09:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 1408598016. Throughput: 0: 45366.7. Samples: 1409924400. Policy #0 lag: (min: 1.0, avg: 31.5, max: 61.0) [2024-03-21 07:09:05,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 07:09:09,447][04017] Updated weights for policy 0, policy_version 42995 (0.0016) [2024-03-21 07:09:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 1408925696. Throughput: 0: 45900.0. Samples: 1410184700. 
Policy #0 lag: (min: 1.0, avg: 31.5, max: 61.0) [2024-03-21 07:09:10,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 07:09:15,521][03784] Fps is (10 sec: 49151.7, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 1409089536. Throughput: 0: 45300.0. Samples: 1410309600. Policy #0 lag: (min: 1.0, avg: 31.5, max: 61.0) [2024-03-21 07:09:15,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 07:09:16,480][04017] Updated weights for policy 0, policy_version 43005 (0.0017) [2024-03-21 07:09:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 1409318912. Throughput: 0: 44453.2. Samples: 1410560200. Policy #0 lag: (min: 1.0, avg: 31.5, max: 61.0) [2024-03-21 07:09:20,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 07:09:24,547][04017] Updated weights for policy 0, policy_version 43015 (0.0011) [2024-03-21 07:09:25,521][03784] Fps is (10 sec: 49152.4, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1409581056. Throughput: 0: 44677.9. Samples: 1410822900. Policy #0 lag: (min: 3.0, avg: 34.1, max: 68.0) [2024-03-21 07:09:25,522][03784] Avg episode reward: [(0, '0.523')] [2024-03-21 07:09:27,822][03995] Signal inference workers to stop experience collection... (28400 times) [2024-03-21 07:09:27,964][04017] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-03-21 07:09:28,047][03995] Signal inference workers to resume experience collection... (28400 times) [2024-03-21 07:09:28,047][04017] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-03-21 07:09:28,448][04017] Updated weights for policy 0, policy_version 43025 (0.0016) [2024-03-21 07:09:30,521][03784] Fps is (10 sec: 65535.5, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1409974272. Throughput: 0: 43966.8. Samples: 1410933000. 
Policy #0 lag: (min: 3.0, avg: 34.1, max: 68.0) [2024-03-21 07:09:30,522][03784] Avg episode reward: [(0, '1.599')] [2024-03-21 07:09:35,521][03784] Fps is (10 sec: 55705.5, 60 sec: 48059.6, 300 sec: 45097.7). Total num frames: 1410138112. Throughput: 0: 43626.8. Samples: 1411194600. Policy #0 lag: (min: 3.0, avg: 34.1, max: 68.0) [2024-03-21 07:09:35,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 07:09:37,200][04017] Updated weights for policy 0, policy_version 43035 (0.0010) [2024-03-21 07:09:40,521][03784] Fps is (10 sec: 36045.2, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 1410334720. Throughput: 0: 43546.7. Samples: 1411459900. Policy #0 lag: (min: 3.0, avg: 34.1, max: 68.0) [2024-03-21 07:09:40,522][03784] Avg episode reward: [(0, '1.439')] [2024-03-21 07:09:43,625][04017] Updated weights for policy 0, policy_version 43045 (0.0011) [2024-03-21 07:09:45,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.7, 300 sec: 45319.8). Total num frames: 1410531328. Throughput: 0: 43546.7. Samples: 1411592200. Policy #0 lag: (min: 3.0, avg: 34.1, max: 68.0) [2024-03-21 07:09:45,522][03784] Avg episode reward: [(0, '0.988')] [2024-03-21 07:09:50,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 1410695168. Throughput: 0: 43540.0. Samples: 1411883700. Policy #0 lag: (min: 3.0, avg: 34.1, max: 68.0) [2024-03-21 07:09:50,522][03784] Avg episode reward: [(0, '0.988')] [2024-03-21 07:09:55,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1410793472. Throughput: 0: 44484.5. Samples: 1412186500. 
Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 07:09:55,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 07:09:56,229][04017] Updated weights for policy 0, policy_version 43055 (0.0016) [2024-03-21 07:10:00,250][04017] Updated weights for policy 0, policy_version 43065 (0.0012) [2024-03-21 07:10:00,521][03784] Fps is (10 sec: 45874.6, 60 sec: 44236.7, 300 sec: 45430.9). Total num frames: 1411153920. Throughput: 0: 44586.6. Samples: 1412316000. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 07:10:00,522][03784] Avg episode reward: [(0, '0.627')] [2024-03-21 07:10:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043065_1411153920.pth... [2024-03-21 07:10:00,656][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042740_1400504320.pth [2024-03-21 07:10:05,521][03784] Fps is (10 sec: 58982.6, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1411383296. Throughput: 0: 45077.8. Samples: 1412588700. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 07:10:05,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-21 07:10:10,521][03784] Fps is (10 sec: 29491.5, 60 sec: 42052.2, 300 sec: 44875.5). Total num frames: 1411448832. Throughput: 0: 45828.8. Samples: 1412885200. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 07:10:10,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-21 07:10:10,735][04017] Updated weights for policy 0, policy_version 43075 (0.0023) [2024-03-21 07:10:15,521][03784] Fps is (10 sec: 26213.9, 60 sec: 42598.3, 300 sec: 44542.2). Total num frames: 1411645440. Throughput: 0: 46357.7. Samples: 1413019100. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 07:10:15,522][03784] Avg episode reward: [(0, '0.891')] [2024-03-21 07:10:17,546][04017] Updated weights for policy 0, policy_version 43085 (0.0012) [2024-03-21 07:10:20,521][03784] Fps is (10 sec: 49152.9, 60 sec: 43690.7, 300 sec: 44986.6). 
Total num frames: 1411940352. Throughput: 0: 46242.3. Samples: 1413275500. Policy #0 lag: (min: 0.0, avg: 30.1, max: 70.0) [2024-03-21 07:10:20,521][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 07:10:24,561][04017] Updated weights for policy 0, policy_version 43095 (0.0015) [2024-03-21 07:10:25,521][03784] Fps is (10 sec: 52429.6, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 1412169728. Throughput: 0: 46740.0. Samples: 1413563200. Policy #0 lag: (min: 0.0, avg: 44.0, max: 110.0) [2024-03-21 07:10:25,522][03784] Avg episode reward: [(0, '1.558')] [2024-03-21 07:10:27,926][03995] Signal inference workers to stop experience collection... (28450 times) [2024-03-21 07:10:28,007][04017] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-03-21 07:10:28,146][03995] Signal inference workers to resume experience collection... (28450 times) [2024-03-21 07:10:28,147][04017] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-03-21 07:10:28,503][04017] Updated weights for policy 0, policy_version 43105 (0.0015) [2024-03-21 07:10:30,521][03784] Fps is (10 sec: 65535.6, 60 sec: 43690.8, 300 sec: 45875.2). Total num frames: 1412595712. Throughput: 0: 46151.2. Samples: 1413669000. Policy #0 lag: (min: 0.0, avg: 44.0, max: 110.0) [2024-03-21 07:10:30,522][03784] Avg episode reward: [(0, '0.680')] [2024-03-21 07:10:35,120][04017] Updated weights for policy 0, policy_version 43115 (0.0010) [2024-03-21 07:10:35,521][03784] Fps is (10 sec: 62259.4, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1412792320. Throughput: 0: 45840.1. Samples: 1413946500. Policy #0 lag: (min: 0.0, avg: 44.0, max: 110.0) [2024-03-21 07:10:35,522][03784] Avg episode reward: [(0, '1.154')] [2024-03-21 07:10:40,521][03784] Fps is (10 sec: 29491.1, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 1412890624. Throughput: 0: 45604.5. Samples: 1414238700. 
Policy #0 lag: (min: 0.0, avg: 44.0, max: 110.0) [2024-03-21 07:10:40,522][03784] Avg episode reward: [(0, '0.627')] [2024-03-21 07:10:43,222][04017] Updated weights for policy 0, policy_version 43125 (0.0017) [2024-03-21 07:10:45,521][03784] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 1413251072. Throughput: 0: 45549.0. Samples: 1414365700. Policy #0 lag: (min: 0.0, avg: 44.0, max: 110.0) [2024-03-21 07:10:45,522][03784] Avg episode reward: [(0, '0.627')] [2024-03-21 07:10:49,663][04017] Updated weights for policy 0, policy_version 43135 (0.0016) [2024-03-21 07:10:50,521][03784] Fps is (10 sec: 62258.6, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1413513216. Throughput: 0: 45599.9. Samples: 1414640700. Policy #0 lag: (min: 0.0, avg: 38.3, max: 76.0) [2024-03-21 07:10:50,522][03784] Avg episode reward: [(0, '0.675')] [2024-03-21 07:10:55,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48605.9, 300 sec: 44986.6). Total num frames: 1413709824. Throughput: 0: 44884.5. Samples: 1414905000. Policy #0 lag: (min: 0.0, avg: 38.3, max: 76.0) [2024-03-21 07:10:55,522][03784] Avg episode reward: [(0, '1.509')] [2024-03-21 07:10:56,255][04017] Updated weights for policy 0, policy_version 43145 (0.0023) [2024-03-21 07:11:00,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 1413873664. Throughput: 0: 44955.7. Samples: 1415042100. Policy #0 lag: (min: 0.0, avg: 38.3, max: 76.0) [2024-03-21 07:11:00,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 07:11:05,079][04017] Updated weights for policy 0, policy_version 43155 (0.0011) [2024-03-21 07:11:05,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1414135808. Throughput: 0: 45437.7. Samples: 1415320200. 
Policy #0 lag: (min: 0.0, avg: 38.3, max: 76.0) [2024-03-21 07:11:05,522][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 07:11:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46421.4, 300 sec: 44875.5). Total num frames: 1414234112. Throughput: 0: 45033.4. Samples: 1415589700. Policy #0 lag: (min: 0.0, avg: 38.3, max: 76.0) [2024-03-21 07:11:10,522][03784] Avg episode reward: [(0, '1.341')] [2024-03-21 07:11:14,786][04017] Updated weights for policy 0, policy_version 43165 (0.0010) [2024-03-21 07:11:15,521][03784] Fps is (10 sec: 29491.0, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1414430720. Throughput: 0: 45768.8. Samples: 1415728600. Policy #0 lag: (min: 0.0, avg: 38.3, max: 76.0) [2024-03-21 07:11:15,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 07:11:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44236.7, 300 sec: 45097.6). Total num frames: 1414594560. Throughput: 0: 45306.6. Samples: 1415985300. Policy #0 lag: (min: 0.0, avg: 35.2, max: 87.0) [2024-03-21 07:11:20,522][03784] Avg episode reward: [(0, '0.856')] [2024-03-21 07:11:22,155][04017] Updated weights for policy 0, policy_version 43175 (0.0017) [2024-03-21 07:11:23,178][03995] Signal inference workers to stop experience collection... (28500 times) [2024-03-21 07:11:23,254][03995] Signal inference workers to resume experience collection... (28500 times) [2024-03-21 07:11:23,294][04017] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-03-21 07:11:23,339][04017] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-03-21 07:11:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 45097.6). Total num frames: 1414889472. Throughput: 0: 45111.1. Samples: 1416268700. 
Policy #0 lag: (min: 0.0, avg: 35.2, max: 87.0) [2024-03-21 07:11:25,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 07:11:29,246][04017] Updated weights for policy 0, policy_version 43185 (0.0016) [2024-03-21 07:11:30,521][03784] Fps is (10 sec: 55705.7, 60 sec: 42598.3, 300 sec: 45764.1). Total num frames: 1415151616. Throughput: 0: 45520.1. Samples: 1416414100. Policy #0 lag: (min: 0.0, avg: 35.2, max: 87.0) [2024-03-21 07:11:30,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 07:11:34,398][04017] Updated weights for policy 0, policy_version 43195 (0.0017) [2024-03-21 07:11:35,521][03784] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 45764.1). Total num frames: 1415413760. Throughput: 0: 45682.3. Samples: 1416696400. Policy #0 lag: (min: 0.0, avg: 35.2, max: 87.0) [2024-03-21 07:11:35,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 07:11:40,189][04017] Updated weights for policy 0, policy_version 43205 (0.0026) [2024-03-21 07:11:40,521][03784] Fps is (10 sec: 62258.9, 60 sec: 48059.7, 300 sec: 45875.2). Total num frames: 1415774208. Throughput: 0: 45491.0. Samples: 1416952100. Policy #0 lag: (min: 0.0, avg: 35.2, max: 87.0) [2024-03-21 07:11:40,522][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 07:11:45,521][03784] Fps is (10 sec: 62259.1, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 1416036352. Throughput: 0: 45459.9. Samples: 1417087800. Policy #0 lag: (min: 4.0, avg: 45.7, max: 93.0) [2024-03-21 07:11:45,522][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 07:11:47,314][04017] Updated weights for policy 0, policy_version 43215 (0.0011) [2024-03-21 07:11:50,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 1416167424. Throughput: 0: 45213.2. Samples: 1417354800. 
Policy #0 lag: (min: 4.0, avg: 45.7, max: 93.0) [2024-03-21 07:11:50,522][03784] Avg episode reward: [(0, '1.685')] [2024-03-21 07:11:55,521][03784] Fps is (10 sec: 32767.8, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 1416364032. Throughput: 0: 45911.0. Samples: 1417655700. Policy #0 lag: (min: 4.0, avg: 45.7, max: 93.0) [2024-03-21 07:11:55,522][03784] Avg episode reward: [(0, '1.320')] [2024-03-21 07:11:55,755][04017] Updated weights for policy 0, policy_version 43225 (0.0012) [2024-03-21 07:12:00,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1416593408. Throughput: 0: 45926.6. Samples: 1417795300. Policy #0 lag: (min: 4.0, avg: 45.7, max: 93.0) [2024-03-21 07:12:00,522][03784] Avg episode reward: [(0, '1.020')] [2024-03-21 07:12:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043231_1416593408.pth... [2024-03-21 07:12:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000042901_1405779968.pth [2024-03-21 07:12:04,442][04017] Updated weights for policy 0, policy_version 43235 (0.0024) [2024-03-21 07:12:05,521][03784] Fps is (10 sec: 36045.2, 60 sec: 43144.5, 300 sec: 45208.7). Total num frames: 1416724480. Throughput: 0: 46388.9. Samples: 1418072800. Policy #0 lag: (min: 4.0, avg: 45.7, max: 93.0) [2024-03-21 07:12:05,522][03784] Avg episode reward: [(0, '1.765')] [2024-03-21 07:12:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1417019392. Throughput: 0: 46002.2. Samples: 1418338800. Policy #0 lag: (min: 4.0, avg: 45.7, max: 93.0) [2024-03-21 07:12:10,522][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 07:12:10,755][04017] Updated weights for policy 0, policy_version 43245 (0.0011) [2024-03-21 07:12:14,495][03995] Signal inference workers to stop experience collection... (28550 times) [2024-03-21 07:12:14,503][03995] Signal inference workers to resume experience collection... 
(28550 times) [2024-03-21 07:12:14,581][04017] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-03-21 07:12:14,581][04017] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-03-21 07:12:15,521][03784] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1417248768. Throughput: 0: 46240.1. Samples: 1418494900. Policy #0 lag: (min: 2.0, avg: 50.1, max: 117.0) [2024-03-21 07:12:15,522][03784] Avg episode reward: [(0, '0.808')] [2024-03-21 07:12:18,510][04017] Updated weights for policy 0, policy_version 43255 (0.0010) [2024-03-21 07:12:20,521][03784] Fps is (10 sec: 42598.6, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 1417445376. Throughput: 0: 46291.1. Samples: 1418779500. Policy #0 lag: (min: 2.0, avg: 50.1, max: 117.0) [2024-03-21 07:12:20,522][03784] Avg episode reward: [(0, '1.582')] [2024-03-21 07:12:25,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1417609216. Throughput: 0: 47331.2. Samples: 1419082000. Policy #0 lag: (min: 2.0, avg: 50.1, max: 117.0) [2024-03-21 07:12:25,522][03784] Avg episode reward: [(0, '1.271')] [2024-03-21 07:12:27,203][04017] Updated weights for policy 0, policy_version 43265 (0.0020) [2024-03-21 07:12:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1417871360. Throughput: 0: 47295.6. Samples: 1419216100. Policy #0 lag: (min: 2.0, avg: 50.1, max: 117.0) [2024-03-21 07:12:30,522][03784] Avg episode reward: [(0, '1.271')] [2024-03-21 07:12:33,178][04017] Updated weights for policy 0, policy_version 43275 (0.0020) [2024-03-21 07:12:35,521][03784] Fps is (10 sec: 55705.5, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1418166272. Throughput: 0: 46815.6. Samples: 1419461500. 
Policy #0 lag: (min: 2.0, avg: 50.1, max: 117.0) [2024-03-21 07:12:35,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 07:12:37,628][04017] Updated weights for policy 0, policy_version 43285 (0.0015) [2024-03-21 07:12:40,521][03784] Fps is (10 sec: 68813.2, 60 sec: 46421.4, 300 sec: 46097.4). Total num frames: 1418559488. Throughput: 0: 45462.4. Samples: 1419701500. Policy #0 lag: (min: 1.0, avg: 50.0, max: 95.0) [2024-03-21 07:12:40,522][03784] Avg episode reward: [(0, '0.854')] [2024-03-21 07:12:42,876][04017] Updated weights for policy 0, policy_version 43295 (0.0011) [2024-03-21 07:12:45,521][03784] Fps is (10 sec: 55705.7, 60 sec: 44783.0, 300 sec: 45653.1). Total num frames: 1418723328. Throughput: 0: 45502.3. Samples: 1419842900. Policy #0 lag: (min: 1.0, avg: 50.0, max: 95.0) [2024-03-21 07:12:45,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 07:12:50,521][03784] Fps is (10 sec: 32767.8, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1418887168. Throughput: 0: 45655.5. Samples: 1420127300. Policy #0 lag: (min: 1.0, avg: 50.0, max: 95.0) [2024-03-21 07:12:50,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 07:12:52,200][04017] Updated weights for policy 0, policy_version 43305 (0.0014) [2024-03-21 07:12:55,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 1419182080. Throughput: 0: 45413.3. Samples: 1420382400. Policy #0 lag: (min: 1.0, avg: 50.0, max: 95.0) [2024-03-21 07:12:55,522][03784] Avg episode reward: [(0, '0.989')] [2024-03-21 07:12:58,938][04017] Updated weights for policy 0, policy_version 43315 (0.0013) [2024-03-21 07:13:00,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1419411456. Throughput: 0: 44886.6. Samples: 1420514800. 
Policy #0 lag: (min: 1.0, avg: 50.0, max: 95.0) [2024-03-21 07:13:00,522][03784] Avg episode reward: [(0, '1.266')] [2024-03-21 07:13:04,834][03995] Signal inference workers to stop experience collection... (28600 times) [2024-03-21 07:13:04,903][03995] Signal inference workers to resume experience collection... (28600 times) [2024-03-21 07:13:04,918][04017] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-03-21 07:13:04,972][04017] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-03-21 07:13:05,302][04017] Updated weights for policy 0, policy_version 43325 (0.0011) [2024-03-21 07:13:05,521][03784] Fps is (10 sec: 49152.2, 60 sec: 49152.0, 300 sec: 45319.8). Total num frames: 1419673600. Throughput: 0: 44906.6. Samples: 1420800300. Policy #0 lag: (min: 1.0, avg: 50.0, max: 95.0) [2024-03-21 07:13:05,522][03784] Avg episode reward: [(0, '1.266')] [2024-03-21 07:13:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1419837440. Throughput: 0: 44693.3. Samples: 1421093200. Policy #0 lag: (min: 4.0, avg: 55.8, max: 126.0) [2024-03-21 07:13:10,522][03784] Avg episode reward: [(0, '1.603')] [2024-03-21 07:13:15,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 1419935744. Throughput: 0: 44977.7. Samples: 1421240100. Policy #0 lag: (min: 4.0, avg: 55.8, max: 126.0) [2024-03-21 07:13:15,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 07:13:16,762][04017] Updated weights for policy 0, policy_version 43335 (0.0028) [2024-03-21 07:13:20,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46967.4, 300 sec: 44875.5). Total num frames: 1420263424. Throughput: 0: 45622.2. Samples: 1421514500. 
Policy #0 lag: (min: 4.0, avg: 55.8, max: 126.0) [2024-03-21 07:13:20,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 07:13:22,078][04017] Updated weights for policy 0, policy_version 43345 (0.0016) [2024-03-21 07:13:25,521][03784] Fps is (10 sec: 49151.8, 60 sec: 46967.4, 300 sec: 45097.6). Total num frames: 1420427264. Throughput: 0: 46197.6. Samples: 1421780400. Policy #0 lag: (min: 4.0, avg: 55.8, max: 126.0) [2024-03-21 07:13:25,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 07:13:28,960][04017] Updated weights for policy 0, policy_version 43355 (0.0016) [2024-03-21 07:13:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 1420722176. Throughput: 0: 45613.3. Samples: 1421895500. Policy #0 lag: (min: 4.0, avg: 55.8, max: 126.0) [2024-03-21 07:13:30,522][03784] Avg episode reward: [(0, '0.952')] [2024-03-21 07:13:35,521][03784] Fps is (10 sec: 45875.8, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1420886016. Throughput: 0: 45655.6. Samples: 1422181800. Policy #0 lag: (min: 0.0, avg: 43.3, max: 87.0) [2024-03-21 07:13:35,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 07:13:38,191][04017] Updated weights for policy 0, policy_version 43365 (0.0015) [2024-03-21 07:13:40,521][03784] Fps is (10 sec: 36044.7, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 1421082624. Throughput: 0: 46657.8. Samples: 1422482000. Policy #0 lag: (min: 0.0, avg: 43.3, max: 87.0) [2024-03-21 07:13:40,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 07:13:43,843][04017] Updated weights for policy 0, policy_version 43375 (0.0015) [2024-03-21 07:13:45,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44236.8, 300 sec: 46097.4). Total num frames: 1421377536. Throughput: 0: 46922.3. Samples: 1422626300. 
Policy #0 lag: (min: 0.0, avg: 43.3, max: 87.0) [2024-03-21 07:13:45,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 07:13:49,364][04017] Updated weights for policy 0, policy_version 43385 (0.0027) [2024-03-21 07:13:50,521][03784] Fps is (10 sec: 55705.9, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1421639680. Throughput: 0: 46104.4. Samples: 1422875000. Policy #0 lag: (min: 0.0, avg: 43.3, max: 87.0) [2024-03-21 07:13:50,522][03784] Avg episode reward: [(0, '1.435')] [2024-03-21 07:13:55,521][03784] Fps is (10 sec: 49151.8, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1421869056. Throughput: 0: 45555.6. Samples: 1423143200. Policy #0 lag: (min: 0.0, avg: 43.3, max: 87.0) [2024-03-21 07:13:55,522][03784] Avg episode reward: [(0, '1.580')] [2024-03-21 07:13:57,789][04017] Updated weights for policy 0, policy_version 43395 (0.0010) [2024-03-21 07:13:59,572][03995] Signal inference workers to stop experience collection... (28650 times) [2024-03-21 07:13:59,641][04017] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-03-21 07:13:59,696][03995] Signal inference workers to resume experience collection... (28650 times) [2024-03-21 07:13:59,696][04017] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-03-21 07:14:00,521][03784] Fps is (10 sec: 42598.0, 60 sec: 44236.7, 300 sec: 45653.0). Total num frames: 1422065664. Throughput: 0: 45364.4. Samples: 1423281500. Policy #0 lag: (min: 0.0, avg: 43.3, max: 87.0) [2024-03-21 07:14:00,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 07:14:00,830][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043399_1422098432.pth... 
[2024-03-21 07:14:00,959][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043065_1411153920.pth [2024-03-21 07:14:05,521][04017] Updated weights for policy 0, policy_version 43405 (0.0022) [2024-03-21 07:14:05,521][03784] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1422295040. Throughput: 0: 45266.8. Samples: 1423551500. Policy #0 lag: (min: 2.0, avg: 46.6, max: 110.0) [2024-03-21 07:14:05,521][03784] Avg episode reward: [(0, '0.777')] [2024-03-21 07:14:10,521][03784] Fps is (10 sec: 52429.3, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1422589952. Throughput: 0: 44655.6. Samples: 1423789900. Policy #0 lag: (min: 2.0, avg: 46.6, max: 110.0) [2024-03-21 07:14:10,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 07:14:12,101][04017] Updated weights for policy 0, policy_version 43415 (0.0015) [2024-03-21 07:14:15,521][03784] Fps is (10 sec: 52428.4, 60 sec: 48059.8, 300 sec: 45764.1). Total num frames: 1422819328. Throughput: 0: 45209.0. Samples: 1423929900. Policy #0 lag: (min: 2.0, avg: 46.6, max: 110.0) [2024-03-21 07:14:15,522][03784] Avg episode reward: [(0, '1.556')] [2024-03-21 07:14:18,413][04017] Updated weights for policy 0, policy_version 43425 (0.0012) [2024-03-21 07:14:20,521][03784] Fps is (10 sec: 39321.9, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1422983168. Throughput: 0: 44562.2. Samples: 1424187100. Policy #0 lag: (min: 2.0, avg: 46.6, max: 110.0) [2024-03-21 07:14:20,522][03784] Avg episode reward: [(0, '1.655')] [2024-03-21 07:14:24,369][04017] Updated weights for policy 0, policy_version 43435 (0.0011) [2024-03-21 07:14:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.7, 300 sec: 45097.7). Total num frames: 1423278080. Throughput: 0: 44104.5. Samples: 1424466700. 
Policy #0 lag: (min: 2.0, avg: 46.6, max: 110.0) [2024-03-21 07:14:25,522][03784] Avg episode reward: [(0, '1.771')] [2024-03-21 07:14:30,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1423507456. Throughput: 0: 43811.0. Samples: 1424597800. Policy #0 lag: (min: 0.0, avg: 39.2, max: 124.0) [2024-03-21 07:14:30,522][03784] Avg episode reward: [(0, '0.916')] [2024-03-21 07:14:32,708][04017] Updated weights for policy 0, policy_version 43445 (0.0020) [2024-03-21 07:14:35,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48059.7, 300 sec: 45542.0). Total num frames: 1423769600. Throughput: 0: 44091.2. Samples: 1424859100. Policy #0 lag: (min: 0.0, avg: 39.2, max: 124.0) [2024-03-21 07:14:35,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 07:14:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1423900672. Throughput: 0: 44491.1. Samples: 1425145300. Policy #0 lag: (min: 0.0, avg: 39.2, max: 124.0) [2024-03-21 07:14:40,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 07:14:41,716][04017] Updated weights for policy 0, policy_version 43455 (0.0021) [2024-03-21 07:14:45,521][03784] Fps is (10 sec: 22937.6, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 1423998976. Throughput: 0: 44753.5. Samples: 1425295400. Policy #0 lag: (min: 0.0, avg: 39.2, max: 124.0) [2024-03-21 07:14:45,522][03784] Avg episode reward: [(0, '1.078')] [2024-03-21 07:14:49,365][04017] Updated weights for policy 0, policy_version 43465 (0.0015) [2024-03-21 07:14:50,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1424359424. Throughput: 0: 45019.9. Samples: 1425577400. Policy #0 lag: (min: 0.0, avg: 39.2, max: 124.0) [2024-03-21 07:14:50,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 07:14:55,521][03784] Fps is (10 sec: 36044.9, 60 sec: 41506.2, 300 sec: 44764.4). Total num frames: 1424359424. Throughput: 0: 46011.2. Samples: 1425860400. 
Policy #0 lag: (min: 0.0, avg: 39.2, max: 124.0) [2024-03-21 07:14:55,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 07:14:56,983][03995] Signal inference workers to stop experience collection... (28700 times) [2024-03-21 07:14:56,984][03995] Signal inference workers to resume experience collection... (28700 times) [2024-03-21 07:14:57,066][04017] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-03-21 07:14:57,066][04017] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-03-21 07:14:57,686][04017] Updated weights for policy 0, policy_version 43475 (0.0012) [2024-03-21 07:15:00,521][03784] Fps is (10 sec: 26214.6, 60 sec: 42598.5, 300 sec: 44875.5). Total num frames: 1424621568. Throughput: 0: 45800.0. Samples: 1425990900. Policy #0 lag: (min: 4.0, avg: 42.4, max: 111.0) [2024-03-21 07:15:00,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 07:15:04,237][04017] Updated weights for policy 0, policy_version 43485 (0.0014) [2024-03-21 07:15:05,521][03784] Fps is (10 sec: 55705.5, 60 sec: 43690.6, 300 sec: 45653.1). Total num frames: 1424916480. Throughput: 0: 46180.0. Samples: 1426265200. Policy #0 lag: (min: 4.0, avg: 42.4, max: 111.0) [2024-03-21 07:15:05,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 07:15:10,484][04017] Updated weights for policy 0, policy_version 43495 (0.0011) [2024-03-21 07:15:10,521][03784] Fps is (10 sec: 62259.1, 60 sec: 44236.8, 300 sec: 46097.4). Total num frames: 1425244160. Throughput: 0: 45795.6. Samples: 1426527500. Policy #0 lag: (min: 4.0, avg: 42.4, max: 111.0) [2024-03-21 07:15:10,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 07:15:15,521][03784] Fps is (10 sec: 58981.9, 60 sec: 44782.9, 300 sec: 45986.2). Total num frames: 1425506304. Throughput: 0: 45584.4. Samples: 1426649100. 
Policy #0 lag: (min: 4.0, avg: 42.4, max: 111.0) [2024-03-21 07:15:15,522][03784] Avg episode reward: [(0, '1.429')] [2024-03-21 07:15:17,016][04017] Updated weights for policy 0, policy_version 43505 (0.0014) [2024-03-21 07:15:20,521][03784] Fps is (10 sec: 62258.5, 60 sec: 48059.6, 300 sec: 46430.6). Total num frames: 1425866752. Throughput: 0: 45557.7. Samples: 1426909200. Policy #0 lag: (min: 4.0, avg: 42.4, max: 111.0) [2024-03-21 07:15:20,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 07:15:20,692][04017] Updated weights for policy 0, policy_version 43515 (0.0023) [2024-03-21 07:15:25,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45328.9, 300 sec: 45430.9). Total num frames: 1425997824. Throughput: 0: 45224.3. Samples: 1427180400. Policy #0 lag: (min: 4.0, avg: 42.4, max: 111.0) [2024-03-21 07:15:25,523][03784] Avg episode reward: [(0, '1.336')] [2024-03-21 07:15:30,521][03784] Fps is (10 sec: 29491.5, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1426161664. Throughput: 0: 44826.7. Samples: 1427312600. Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 07:15:30,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 07:15:32,055][04017] Updated weights for policy 0, policy_version 43525 (0.0011) [2024-03-21 07:15:35,521][03784] Fps is (10 sec: 26215.0, 60 sec: 41506.2, 300 sec: 45319.8). Total num frames: 1426259968. Throughput: 0: 45295.6. Samples: 1427615700. Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 07:15:35,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 07:15:40,521][03784] Fps is (10 sec: 26214.3, 60 sec: 42052.3, 300 sec: 44653.4). Total num frames: 1426423808. Throughput: 0: 45642.2. Samples: 1427914300. 
Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 07:15:40,522][03784] Avg episode reward: [(0, '0.715')] [2024-03-21 07:15:42,798][04017] Updated weights for policy 0, policy_version 43535 (0.0016) [2024-03-21 07:15:45,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1426718720. Throughput: 0: 45702.2. Samples: 1428047500. Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 07:15:45,522][03784] Avg episode reward: [(0, '1.453')] [2024-03-21 07:15:49,814][04017] Updated weights for policy 0, policy_version 43545 (0.0021) [2024-03-21 07:15:49,870][03995] Signal inference workers to stop experience collection... (28750 times) [2024-03-21 07:15:49,990][04017] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-03-21 07:15:50,105][03995] Signal inference workers to resume experience collection... (28750 times) [2024-03-21 07:15:50,105][04017] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-03-21 07:15:50,521][03784] Fps is (10 sec: 52428.1, 60 sec: 43144.4, 300 sec: 44875.5). Total num frames: 1426948096. Throughput: 0: 45375.4. Samples: 1428307100. Policy #0 lag: (min: 0.0, avg: 47.9, max: 112.0) [2024-03-21 07:15:50,522][03784] Avg episode reward: [(0, '1.308')] [2024-03-21 07:15:55,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1427177472. Throughput: 0: 45291.1. Samples: 1428565600. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 07:15:55,522][03784] Avg episode reward: [(0, '1.445')] [2024-03-21 07:15:55,588][04017] Updated weights for policy 0, policy_version 43555 (0.0015) [2024-03-21 07:16:00,521][03784] Fps is (10 sec: 55706.2, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1427505152. Throughput: 0: 45557.8. Samples: 1428699200. 
Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 07:16:00,522][03784] Avg episode reward: [(0, '1.504')] [2024-03-21 07:16:00,538][04017] Updated weights for policy 0, policy_version 43565 (0.0009) [2024-03-21 07:16:00,898][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043566_1427570688.pth... [2024-03-21 07:16:00,965][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043231_1416593408.pth [2024-03-21 07:16:05,521][03784] Fps is (10 sec: 65536.0, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 1427832832. Throughput: 0: 45122.4. Samples: 1428939700. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 07:16:05,522][03784] Avg episode reward: [(0, '1.200')] [2024-03-21 07:16:05,584][04017] Updated weights for policy 0, policy_version 43575 (0.0010) [2024-03-21 07:16:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 1427996672. Throughput: 0: 45733.4. Samples: 1429238400. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 07:16:10,522][03784] Avg episode reward: [(0, '1.512')] [2024-03-21 07:16:15,124][04017] Updated weights for policy 0, policy_version 43585 (0.0012) [2024-03-21 07:16:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44783.0, 300 sec: 46097.4). Total num frames: 1428193280. Throughput: 0: 46160.0. Samples: 1429389800. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 07:16:15,522][03784] Avg episode reward: [(0, '1.161')] [2024-03-21 07:16:20,521][03784] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 45875.2). Total num frames: 1428422656. Throughput: 0: 45646.5. Samples: 1429669800. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 07:16:20,522][03784] Avg episode reward: [(0, '1.939')] [2024-03-21 07:16:20,535][03995] Saving new best policy, reward=1.939! 
[2024-03-21 07:16:23,635][04017] Updated weights for policy 0, policy_version 43595 (0.0016) [2024-03-21 07:16:25,521][03784] Fps is (10 sec: 45874.8, 60 sec: 44236.9, 300 sec: 45764.1). Total num frames: 1428652032. Throughput: 0: 45477.7. Samples: 1429960800. Policy #0 lag: (min: 0.0, avg: 45.2, max: 88.0) [2024-03-21 07:16:25,522][03784] Avg episode reward: [(0, '1.939')] [2024-03-21 07:16:29,112][04017] Updated weights for policy 0, policy_version 43605 (0.0012) [2024-03-21 07:16:30,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1428881408. Throughput: 0: 45548.8. Samples: 1430097200. Policy #0 lag: (min: 0.0, avg: 45.2, max: 88.0) [2024-03-21 07:16:30,522][03784] Avg episode reward: [(0, '1.092')] [2024-03-21 07:16:35,103][04017] Updated weights for policy 0, policy_version 43615 (0.0022) [2024-03-21 07:16:35,521][03784] Fps is (10 sec: 52429.1, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 1429176320. Throughput: 0: 46204.6. Samples: 1430386300. Policy #0 lag: (min: 0.0, avg: 45.2, max: 88.0) [2024-03-21 07:16:35,522][03784] Avg episode reward: [(0, '1.574')] [2024-03-21 07:16:40,521][03784] Fps is (10 sec: 39321.5, 60 sec: 47513.6, 300 sec: 44875.5). Total num frames: 1429274624. Throughput: 0: 46553.2. Samples: 1430660500. Policy #0 lag: (min: 0.0, avg: 45.2, max: 88.0) [2024-03-21 07:16:40,522][03784] Avg episode reward: [(0, '1.517')] [2024-03-21 07:16:44,016][03995] Signal inference workers to stop experience collection... (28800 times) [2024-03-21 07:16:44,016][03995] Signal inference workers to resume experience collection... 
(28800 times) [2024-03-21 07:16:44,105][04017] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-03-21 07:16:44,105][04017] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-03-21 07:16:44,367][04017] Updated weights for policy 0, policy_version 43625 (0.0012) [2024-03-21 07:16:45,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 1429504000. Throughput: 0: 46689.0. Samples: 1430800200. Policy #0 lag: (min: 0.0, avg: 45.2, max: 88.0) [2024-03-21 07:16:45,522][03784] Avg episode reward: [(0, '1.318')] [2024-03-21 07:16:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1429766144. Throughput: 0: 47804.3. Samples: 1431090900. Policy #0 lag: (min: 0.0, avg: 45.2, max: 88.0) [2024-03-21 07:16:50,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 07:16:51,657][04017] Updated weights for policy 0, policy_version 43635 (0.0011) [2024-03-21 07:16:55,521][03784] Fps is (10 sec: 45874.7, 60 sec: 46421.2, 300 sec: 45319.8). Total num frames: 1429962752. Throughput: 0: 46964.5. Samples: 1431351800. Policy #0 lag: (min: 0.0, avg: 36.1, max: 84.0) [2024-03-21 07:16:55,522][03784] Avg episode reward: [(0, '0.858')] [2024-03-21 07:16:58,844][04017] Updated weights for policy 0, policy_version 43645 (0.0012) [2024-03-21 07:17:00,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1430290432. Throughput: 0: 46619.9. Samples: 1431487700. Policy #0 lag: (min: 0.0, avg: 36.1, max: 84.0) [2024-03-21 07:17:00,522][03784] Avg episode reward: [(0, '1.020')] [2024-03-21 07:17:04,736][04017] Updated weights for policy 0, policy_version 43655 (0.0015) [2024-03-21 07:17:05,521][03784] Fps is (10 sec: 58982.1, 60 sec: 45328.9, 300 sec: 45875.2). Total num frames: 1430552576. Throughput: 0: 46735.5. Samples: 1431772900. 
Policy #0 lag: (min: 0.0, avg: 36.1, max: 84.0) [2024-03-21 07:17:05,522][03784] Avg episode reward: [(0, '1.376')] [2024-03-21 07:17:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 1430716416. Throughput: 0: 46568.9. Samples: 1432056400. Policy #0 lag: (min: 0.0, avg: 36.1, max: 84.0) [2024-03-21 07:17:10,522][03784] Avg episode reward: [(0, '1.376')] [2024-03-21 07:17:13,133][04017] Updated weights for policy 0, policy_version 43665 (0.0009) [2024-03-21 07:17:15,521][03784] Fps is (10 sec: 32768.5, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1430880256. Throughput: 0: 46791.2. Samples: 1432202800. Policy #0 lag: (min: 0.0, avg: 36.1, max: 84.0) [2024-03-21 07:17:15,522][03784] Avg episode reward: [(0, '0.957')] [2024-03-21 07:17:19,462][04017] Updated weights for policy 0, policy_version 43675 (0.0044) [2024-03-21 07:17:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 1431175168. Throughput: 0: 46431.1. Samples: 1432475700. Policy #0 lag: (min: 4.0, avg: 47.8, max: 114.0) [2024-03-21 07:17:20,522][03784] Avg episode reward: [(0, '1.455')] [2024-03-21 07:17:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 45430.9). Total num frames: 1431273472. Throughput: 0: 46764.6. Samples: 1432764900. Policy #0 lag: (min: 4.0, avg: 47.8, max: 114.0) [2024-03-21 07:17:25,521][03784] Avg episode reward: [(0, '0.553')] [2024-03-21 07:17:30,521][03784] Fps is (10 sec: 26214.5, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1431437312. Throughput: 0: 46448.8. Samples: 1432890400. Policy #0 lag: (min: 4.0, avg: 47.8, max: 114.0) [2024-03-21 07:17:30,522][03784] Avg episode reward: [(0, '1.606')] [2024-03-21 07:17:32,470][04017] Updated weights for policy 0, policy_version 43685 (0.0018) [2024-03-21 07:17:35,521][03784] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 44653.3). Total num frames: 1431732224. Throughput: 0: 46100.0. Samples: 1433165400. 
Policy #0 lag: (min: 4.0, avg: 47.8, max: 114.0) [2024-03-21 07:17:35,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 07:17:36,056][04017] Updated weights for policy 0, policy_version 43695 (0.0014) [2024-03-21 07:17:40,521][03784] Fps is (10 sec: 55704.9, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 1431994368. Throughput: 0: 45864.4. Samples: 1433415700. Policy #0 lag: (min: 4.0, avg: 47.8, max: 114.0) [2024-03-21 07:17:40,522][03784] Avg episode reward: [(0, '1.300')] [2024-03-21 07:17:41,777][03995] Signal inference workers to stop experience collection... (28850 times) [2024-03-21 07:17:41,847][03995] Signal inference workers to resume experience collection... (28850 times) [2024-03-21 07:17:41,851][04017] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-03-21 07:17:41,919][04017] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-03-21 07:17:42,541][04017] Updated weights for policy 0, policy_version 43705 (0.0013) [2024-03-21 07:17:45,521][03784] Fps is (10 sec: 68812.7, 60 sec: 48605.8, 300 sec: 45875.2). Total num frames: 1432420352. Throughput: 0: 45688.9. Samples: 1433543700. Policy #0 lag: (min: 4.0, avg: 47.8, max: 114.0) [2024-03-21 07:17:45,522][03784] Avg episode reward: [(0, '1.002')] [2024-03-21 07:17:45,955][04017] Updated weights for policy 0, policy_version 43715 (0.0011) [2024-03-21 07:17:50,521][03784] Fps is (10 sec: 65536.5, 60 sec: 48059.7, 300 sec: 45653.0). Total num frames: 1432649728. Throughput: 0: 44911.2. Samples: 1433793900. Policy #0 lag: (min: 0.0, avg: 44.4, max: 98.0) [2024-03-21 07:17:50,522][03784] Avg episode reward: [(0, '1.555')] [2024-03-21 07:17:52,911][04017] Updated weights for policy 0, policy_version 43725 (0.0015) [2024-03-21 07:17:55,521][03784] Fps is (10 sec: 45875.3, 60 sec: 48605.9, 300 sec: 45653.0). Total num frames: 1432879104. Throughput: 0: 44817.8. Samples: 1434073200. 
Policy #0 lag: (min: 0.0, avg: 44.4, max: 98.0) [2024-03-21 07:17:55,522][03784] Avg episode reward: [(0, '1.689')] [2024-03-21 07:18:00,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1433010176. Throughput: 0: 44897.8. Samples: 1434223200. Policy #0 lag: (min: 0.0, avg: 44.4, max: 98.0) [2024-03-21 07:18:00,522][03784] Avg episode reward: [(0, '1.689')] [2024-03-21 07:18:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043732_1433010176.pth... [2024-03-21 07:18:00,706][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043399_1422098432.pth [2024-03-21 07:18:02,764][04017] Updated weights for policy 0, policy_version 43735 (0.0021) [2024-03-21 07:18:05,521][03784] Fps is (10 sec: 36045.0, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 1433239552. Throughput: 0: 45148.9. Samples: 1434507400. Policy #0 lag: (min: 0.0, avg: 44.4, max: 98.0) [2024-03-21 07:18:05,522][03784] Avg episode reward: [(0, '0.637')] [2024-03-21 07:18:10,082][04017] Updated weights for policy 0, policy_version 43745 (0.0011) [2024-03-21 07:18:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1433436160. Throughput: 0: 44737.7. Samples: 1434778100. Policy #0 lag: (min: 0.0, avg: 44.4, max: 98.0) [2024-03-21 07:18:10,522][03784] Avg episode reward: [(0, '1.153')] [2024-03-21 07:18:15,521][03784] Fps is (10 sec: 49152.3, 60 sec: 47513.6, 300 sec: 45653.1). Total num frames: 1433731072. Throughput: 0: 44931.2. Samples: 1434912300. Policy #0 lag: (min: 0.0, avg: 44.4, max: 98.0) [2024-03-21 07:18:15,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 07:18:15,984][04017] Updated weights for policy 0, policy_version 43755 (0.0013) [2024-03-21 07:18:20,526][03784] Fps is (10 sec: 36026.5, 60 sec: 43687.0, 300 sec: 45319.0). Total num frames: 1433796608. Throughput: 0: 45468.2. Samples: 1435211700. 
Policy #0 lag: (min: 0.0, avg: 31.8, max: 75.0) [2024-03-21 07:18:20,527][03784] Avg episode reward: [(0, '1.093')] [2024-03-21 07:18:25,521][03784] Fps is (10 sec: 19660.7, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1433927680. Throughput: 0: 46382.4. Samples: 1435502900. Policy #0 lag: (min: 0.0, avg: 31.8, max: 75.0) [2024-03-21 07:18:25,522][03784] Avg episode reward: [(0, '1.585')] [2024-03-21 07:18:27,041][04017] Updated weights for policy 0, policy_version 43765 (0.0018) [2024-03-21 07:18:29,806][03995] Signal inference workers to stop experience collection... (28900 times) [2024-03-21 07:18:29,807][03995] Signal inference workers to resume experience collection... (28900 times) [2024-03-21 07:18:29,885][04017] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-03-21 07:18:29,885][04017] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-03-21 07:18:30,521][03784] Fps is (10 sec: 55733.8, 60 sec: 48605.8, 300 sec: 45653.0). Total num frames: 1434353664. Throughput: 0: 46140.0. Samples: 1435620000. Policy #0 lag: (min: 0.0, avg: 31.8, max: 75.0) [2024-03-21 07:18:30,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 07:18:32,184][04017] Updated weights for policy 0, policy_version 43775 (0.0010) [2024-03-21 07:18:35,521][03784] Fps is (10 sec: 58982.6, 60 sec: 46421.4, 300 sec: 45542.0). Total num frames: 1434517504. Throughput: 0: 46686.8. Samples: 1435894800. Policy #0 lag: (min: 0.0, avg: 31.8, max: 75.0) [2024-03-21 07:18:35,522][03784] Avg episode reward: [(0, '1.031')] [2024-03-21 07:18:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1434714112. Throughput: 0: 46642.2. Samples: 1436172100. 
Policy #0 lag: (min: 0.0, avg: 31.8, max: 75.0) [2024-03-21 07:18:40,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-21 07:18:40,760][04017] Updated weights for policy 0, policy_version 43785 (0.0016) [2024-03-21 07:18:44,867][04017] Updated weights for policy 0, policy_version 43795 (0.0011) [2024-03-21 07:18:45,521][03784] Fps is (10 sec: 55705.0, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1435074560. Throughput: 0: 46237.7. Samples: 1436303900. Policy #0 lag: (min: 0.0, avg: 40.6, max: 75.0) [2024-03-21 07:18:45,522][03784] Avg episode reward: [(0, '0.607')] [2024-03-21 07:18:50,521][03784] Fps is (10 sec: 55706.2, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1435271168. Throughput: 0: 46013.4. Samples: 1436578000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 75.0) [2024-03-21 07:18:50,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 07:18:53,191][04017] Updated weights for policy 0, policy_version 43805 (0.0011) [2024-03-21 07:18:55,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 1435435008. Throughput: 0: 45404.5. Samples: 1436821300. Policy #0 lag: (min: 0.0, avg: 40.6, max: 75.0) [2024-03-21 07:18:55,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 07:19:00,521][03784] Fps is (10 sec: 29491.0, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1435566080. Throughput: 0: 45215.5. Samples: 1436947000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 75.0) [2024-03-21 07:19:00,522][03784] Avg episode reward: [(0, '1.706')] [2024-03-21 07:19:03,268][04017] Updated weights for policy 0, policy_version 43815 (0.0020) [2024-03-21 07:19:05,521][03784] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 1435893760. Throughput: 0: 44093.9. Samples: 1437195700. 
Policy #0 lag: (min: 0.0, avg: 40.6, max: 75.0) [2024-03-21 07:19:05,522][03784] Avg episode reward: [(0, '1.274')] [2024-03-21 07:19:09,860][04017] Updated weights for policy 0, policy_version 43825 (0.0011) [2024-03-21 07:19:10,521][03784] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1436057600. Throughput: 0: 43468.9. Samples: 1437459000. Policy #0 lag: (min: 0.0, avg: 40.6, max: 75.0) [2024-03-21 07:19:10,522][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 07:19:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42052.2, 300 sec: 44986.6). Total num frames: 1436254208. Throughput: 0: 44042.3. Samples: 1437601900. Policy #0 lag: (min: 1.0, avg: 31.9, max: 67.0) [2024-03-21 07:19:15,522][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 07:19:17,348][04017] Updated weights for policy 0, policy_version 43835 (0.0020) [2024-03-21 07:19:20,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45332.9, 300 sec: 44875.5). Total num frames: 1436516352. Throughput: 0: 44115.5. Samples: 1437880000. Policy #0 lag: (min: 1.0, avg: 31.9, max: 67.0) [2024-03-21 07:19:20,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 07:19:24,596][04017] Updated weights for policy 0, policy_version 43845 (0.0011) [2024-03-21 07:19:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1436712960. Throughput: 0: 43855.6. Samples: 1438145600. Policy #0 lag: (min: 1.0, avg: 31.9, max: 67.0) [2024-03-21 07:19:25,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 07:19:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 44653.3). Total num frames: 1436942336. Throughput: 0: 44051.1. Samples: 1438286200. Policy #0 lag: (min: 1.0, avg: 31.9, max: 67.0) [2024-03-21 07:19:30,522][03784] Avg episode reward: [(0, '1.493')] [2024-03-21 07:19:31,045][03995] Signal inference workers to stop experience collection... 
(28950 times) [2024-03-21 07:19:31,118][03995] Signal inference workers to resume experience collection... (28950 times) [2024-03-21 07:19:31,125][04017] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-03-21 07:19:31,181][04017] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-03-21 07:19:31,531][04017] Updated weights for policy 0, policy_version 43855 (0.0016) [2024-03-21 07:19:35,521][03784] Fps is (10 sec: 55705.1, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1437270016. Throughput: 0: 43517.7. Samples: 1438536300. Policy #0 lag: (min: 1.0, avg: 31.9, max: 67.0) [2024-03-21 07:19:35,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 07:19:36,748][04017] Updated weights for policy 0, policy_version 43865 (0.0015) [2024-03-21 07:19:40,521][03784] Fps is (10 sec: 62259.3, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 1437564928. Throughput: 0: 43897.7. Samples: 1438796700. Policy #0 lag: (min: 1.0, avg: 31.9, max: 67.0) [2024-03-21 07:19:40,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 07:19:42,187][04017] Updated weights for policy 0, policy_version 43875 (0.0020) [2024-03-21 07:19:45,521][03784] Fps is (10 sec: 65536.7, 60 sec: 47513.7, 300 sec: 45986.3). Total num frames: 1437925376. Throughput: 0: 43948.9. Samples: 1438924700. Policy #0 lag: (min: 0.0, avg: 41.8, max: 76.0) [2024-03-21 07:19:45,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 07:19:46,991][04017] Updated weights for policy 0, policy_version 43885 (0.0012) [2024-03-21 07:19:50,521][03784] Fps is (10 sec: 52429.5, 60 sec: 46967.5, 300 sec: 46541.7). Total num frames: 1438089216. Throughput: 0: 44429.0. Samples: 1439195000. Policy #0 lag: (min: 0.0, avg: 41.8, max: 76.0) [2024-03-21 07:19:50,522][03784] Avg episode reward: [(0, '0.934')] [2024-03-21 07:19:55,521][03784] Fps is (10 sec: 26214.4, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 1438187520. Throughput: 0: 45320.0. 
Samples: 1439498400. Policy #0 lag: (min: 0.0, avg: 41.8, max: 76.0) [2024-03-21 07:19:55,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 07:20:00,521][03784] Fps is (10 sec: 19660.5, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1438285824. Throughput: 0: 45244.4. Samples: 1439637900. Policy #0 lag: (min: 0.0, avg: 41.8, max: 76.0) [2024-03-21 07:20:00,522][03784] Avg episode reward: [(0, '1.470')] [2024-03-21 07:20:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043893_1438285824.pth... [2024-03-21 07:20:00,652][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043566_1427570688.pth [2024-03-21 07:20:02,032][04017] Updated weights for policy 0, policy_version 43895 (0.0011) [2024-03-21 07:20:05,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42598.5, 300 sec: 44764.4). Total num frames: 1438449664. Throughput: 0: 45746.7. Samples: 1439938600. Policy #0 lag: (min: 0.0, avg: 41.8, max: 76.0) [2024-03-21 07:20:05,522][03784] Avg episode reward: [(0, '1.470')] [2024-03-21 07:20:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 1438646272. Throughput: 0: 45924.3. Samples: 1440212200. Policy #0 lag: (min: 1.0, avg: 27.7, max: 67.0) [2024-03-21 07:20:10,522][03784] Avg episode reward: [(0, '1.436')] [2024-03-21 07:20:10,612][04017] Updated weights for policy 0, policy_version 43905 (0.0015) [2024-03-21 07:20:15,521][03784] Fps is (10 sec: 52428.5, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1438973952. Throughput: 0: 45671.2. Samples: 1440341400. Policy #0 lag: (min: 1.0, avg: 27.7, max: 67.0) [2024-03-21 07:20:15,522][03784] Avg episode reward: [(0, '0.876')] [2024-03-21 07:20:16,429][04017] Updated weights for policy 0, policy_version 43915 (0.0020) [2024-03-21 07:20:20,521][03784] Fps is (10 sec: 52429.3, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1439170560. Throughput: 0: 45931.2. Samples: 1440603200. 
Policy #0 lag: (min: 1.0, avg: 27.7, max: 67.0) [2024-03-21 07:20:20,523][03784] Avg episode reward: [(0, '1.274')] [2024-03-21 07:20:22,821][04017] Updated weights for policy 0, policy_version 43925 (0.0011) [2024-03-21 07:20:23,485][03995] Signal inference workers to stop experience collection... (29000 times) [2024-03-21 07:20:23,559][04017] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-03-21 07:20:23,737][03995] Signal inference workers to resume experience collection... (29000 times) [2024-03-21 07:20:23,738][04017] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-03-21 07:20:25,521][03784] Fps is (10 sec: 55705.9, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1439531008. Throughput: 0: 45660.1. Samples: 1440851400. Policy #0 lag: (min: 1.0, avg: 27.7, max: 67.0) [2024-03-21 07:20:25,522][03784] Avg episode reward: [(0, '1.271')] [2024-03-21 07:20:28,187][04017] Updated weights for policy 0, policy_version 43935 (0.0012) [2024-03-21 07:20:30,521][03784] Fps is (10 sec: 62259.3, 60 sec: 47513.7, 300 sec: 45875.2). Total num frames: 1439793152. Throughput: 0: 46102.2. Samples: 1440999300. Policy #0 lag: (min: 1.0, avg: 27.7, max: 67.0) [2024-03-21 07:20:30,522][03784] Avg episode reward: [(0, '1.042')] [2024-03-21 07:20:35,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 1439956992. Throughput: 0: 46177.7. Samples: 1441273000. Policy #0 lag: (min: 1.0, avg: 27.7, max: 67.0) [2024-03-21 07:20:35,522][03784] Avg episode reward: [(0, '1.393')] [2024-03-21 07:20:35,844][04017] Updated weights for policy 0, policy_version 43945 (0.0020) [2024-03-21 07:20:40,521][03784] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1440219136. Throughput: 0: 45464.4. Samples: 1441544300. 
Policy #0 lag: (min: 1.0, avg: 41.7, max: 76.0) [2024-03-21 07:20:40,522][03784] Avg episode reward: [(0, '1.681')] [2024-03-21 07:20:41,229][04017] Updated weights for policy 0, policy_version 43955 (0.0011) [2024-03-21 07:20:45,521][03784] Fps is (10 sec: 49151.8, 60 sec: 42052.2, 300 sec: 45764.1). Total num frames: 1440448512. Throughput: 0: 45124.5. Samples: 1441668500. Policy #0 lag: (min: 1.0, avg: 41.7, max: 76.0) [2024-03-21 07:20:45,522][03784] Avg episode reward: [(0, '1.028')] [2024-03-21 07:20:49,814][04017] Updated weights for policy 0, policy_version 43965 (0.0017) [2024-03-21 07:20:50,521][03784] Fps is (10 sec: 49151.9, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 1440710656. Throughput: 0: 44551.0. Samples: 1441943400. Policy #0 lag: (min: 1.0, avg: 41.7, max: 76.0) [2024-03-21 07:20:50,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 07:20:54,151][04017] Updated weights for policy 0, policy_version 43975 (0.0011) [2024-03-21 07:20:55,521][03784] Fps is (10 sec: 62259.7, 60 sec: 48059.8, 300 sec: 45986.3). Total num frames: 1441071104. Throughput: 0: 43989.1. Samples: 1442191700. Policy #0 lag: (min: 1.0, avg: 41.7, max: 76.0) [2024-03-21 07:20:55,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 07:21:00,521][03784] Fps is (10 sec: 49152.4, 60 sec: 48606.0, 300 sec: 45319.8). Total num frames: 1441202176. Throughput: 0: 44688.9. Samples: 1442352400. Policy #0 lag: (min: 1.0, avg: 41.7, max: 76.0) [2024-03-21 07:21:00,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 07:21:03,219][04017] Updated weights for policy 0, policy_version 43985 (0.0009) [2024-03-21 07:21:05,521][03784] Fps is (10 sec: 36044.5, 60 sec: 49698.1, 300 sec: 45542.0). Total num frames: 1441431552. Throughput: 0: 45091.1. Samples: 1442632300. 
Policy #0 lag: (min: 1.0, avg: 32.9, max: 66.0) [2024-03-21 07:21:05,522][03784] Avg episode reward: [(0, '0.944')] [2024-03-21 07:21:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48605.9, 300 sec: 45319.8). Total num frames: 1441562624. Throughput: 0: 45786.6. Samples: 1442911800. Policy #0 lag: (min: 1.0, avg: 32.9, max: 66.0) [2024-03-21 07:21:10,522][03784] Avg episode reward: [(0, '0.706')] [2024-03-21 07:21:12,318][04017] Updated weights for policy 0, policy_version 43995 (0.0018) [2024-03-21 07:21:15,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1441726464. Throughput: 0: 45797.7. Samples: 1443060200. Policy #0 lag: (min: 1.0, avg: 32.9, max: 66.0) [2024-03-21 07:21:15,522][03784] Avg episode reward: [(0, '0.641')] [2024-03-21 07:21:17,763][04017] Updated weights for policy 0, policy_version 44005 (0.0012) [2024-03-21 07:21:19,168][03995] Signal inference workers to stop experience collection... (29050 times) [2024-03-21 07:21:19,242][04017] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-03-21 07:21:19,499][03995] Signal inference workers to resume experience collection... (29050 times) [2024-03-21 07:21:19,500][04017] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-03-21 07:21:20,521][03784] Fps is (10 sec: 58982.1, 60 sec: 49698.1, 300 sec: 45764.1). Total num frames: 1442152448. Throughput: 0: 45788.8. Samples: 1443333500. Policy #0 lag: (min: 1.0, avg: 32.9, max: 66.0) [2024-03-21 07:21:20,522][03784] Avg episode reward: [(0, '0.600')] [2024-03-21 07:21:23,035][04017] Updated weights for policy 0, policy_version 44015 (0.0012) [2024-03-21 07:21:25,521][03784] Fps is (10 sec: 55705.5, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1442283520. Throughput: 0: 45748.9. Samples: 1443603000. 
Policy #0 lag: (min: 1.0, avg: 32.9, max: 66.0) [2024-03-21 07:21:25,522][03784] Avg episode reward: [(0, '1.354')] [2024-03-21 07:21:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 1442480128. Throughput: 0: 46308.8. Samples: 1443752400. Policy #0 lag: (min: 1.0, avg: 32.9, max: 66.0) [2024-03-21 07:21:30,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 07:21:32,069][04017] Updated weights for policy 0, policy_version 44025 (0.0011) [2024-03-21 07:21:35,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 1442709504. Throughput: 0: 46504.4. Samples: 1444036100. Policy #0 lag: (min: 0.0, avg: 26.6, max: 58.0) [2024-03-21 07:21:35,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 07:21:40,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1442906112. Throughput: 0: 46833.2. Samples: 1444299200. Policy #0 lag: (min: 0.0, avg: 26.6, max: 58.0) [2024-03-21 07:21:40,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 07:21:41,093][04017] Updated weights for policy 0, policy_version 44035 (0.0017) [2024-03-21 07:21:45,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1443168256. Throughput: 0: 46042.2. Samples: 1444424300. Policy #0 lag: (min: 0.0, avg: 26.6, max: 58.0) [2024-03-21 07:21:45,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 07:21:46,473][04017] Updated weights for policy 0, policy_version 44045 (0.0016) [2024-03-21 07:21:50,521][03784] Fps is (10 sec: 55705.6, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1443463168. Throughput: 0: 46495.5. Samples: 1444724600. 
Policy #0 lag: (min: 0.0, avg: 26.6, max: 58.0) [2024-03-21 07:21:50,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 07:21:53,377][04017] Updated weights for policy 0, policy_version 44055 (0.0011) [2024-03-21 07:21:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 45430.9). Total num frames: 1443692544. Throughput: 0: 46777.8. Samples: 1445016800. Policy #0 lag: (min: 0.0, avg: 26.6, max: 58.0) [2024-03-21 07:21:55,522][03784] Avg episode reward: [(0, '1.174')] [2024-03-21 07:21:59,198][04017] Updated weights for policy 0, policy_version 44065 (0.0011) [2024-03-21 07:22:00,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1443987456. Throughput: 0: 46475.5. Samples: 1445151600. Policy #0 lag: (min: 0.0, avg: 26.6, max: 58.0) [2024-03-21 07:22:00,522][03784] Avg episode reward: [(0, '0.577')] [2024-03-21 07:22:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044067_1443987456.pth... [2024-03-21 07:22:00,666][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043732_1433010176.pth [2024-03-21 07:22:05,521][03784] Fps is (10 sec: 49152.3, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 1444184064. Throughput: 0: 46613.4. Samples: 1445431100. Policy #0 lag: (min: 1.0, avg: 46.2, max: 98.0) [2024-03-21 07:22:05,522][03784] Avg episode reward: [(0, '0.925')] [2024-03-21 07:22:08,818][04017] Updated weights for policy 0, policy_version 44075 (0.0019) [2024-03-21 07:22:10,521][03784] Fps is (10 sec: 29491.4, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1444282368. Throughput: 0: 47086.7. Samples: 1445721900. Policy #0 lag: (min: 1.0, avg: 46.2, max: 98.0) [2024-03-21 07:22:10,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 07:22:14,191][03995] Signal inference workers to stop experience collection... 
(29100 times) [2024-03-21 07:22:14,257][04017] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-03-21 07:22:14,432][03995] Signal inference workers to resume experience collection... (29100 times) [2024-03-21 07:22:14,432][04017] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-03-21 07:22:14,434][04017] Updated weights for policy 0, policy_version 44085 (0.0018) [2024-03-21 07:22:15,521][03784] Fps is (10 sec: 49151.4, 60 sec: 49151.9, 300 sec: 45764.1). Total num frames: 1444675584. Throughput: 0: 46433.3. Samples: 1445841900. Policy #0 lag: (min: 1.0, avg: 46.2, max: 98.0) [2024-03-21 07:22:15,522][03784] Avg episode reward: [(0, '1.487')] [2024-03-21 07:22:19,256][04017] Updated weights for policy 0, policy_version 44095 (0.0015) [2024-03-21 07:22:20,521][03784] Fps is (10 sec: 62259.3, 60 sec: 45875.3, 300 sec: 46208.4). Total num frames: 1444904960. Throughput: 0: 45486.8. Samples: 1446083000. Policy #0 lag: (min: 1.0, avg: 46.2, max: 98.0) [2024-03-21 07:22:20,522][03784] Avg episode reward: [(0, '1.222')] [2024-03-21 07:22:25,521][03784] Fps is (10 sec: 32768.2, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1445003264. Throughput: 0: 45808.9. Samples: 1446360600. Policy #0 lag: (min: 1.0, avg: 46.2, max: 98.0) [2024-03-21 07:22:25,522][03784] Avg episode reward: [(0, '0.998')] [2024-03-21 07:22:30,521][03784] Fps is (10 sec: 26214.2, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 1445167104. Throughput: 0: 46000.0. Samples: 1446494300. Policy #0 lag: (min: 1.0, avg: 46.2, max: 98.0) [2024-03-21 07:22:30,522][03784] Avg episode reward: [(0, '0.564')] [2024-03-21 07:22:32,177][04017] Updated weights for policy 0, policy_version 44105 (0.0010) [2024-03-21 07:22:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 1445396480. Throughput: 0: 45122.3. Samples: 1446755100. 
Policy #0 lag: (min: 0.0, avg: 27.0, max: 58.0) [2024-03-21 07:22:35,522][03784] Avg episode reward: [(0, '1.406')] [2024-03-21 07:22:37,487][04017] Updated weights for policy 0, policy_version 44115 (0.0025) [2024-03-21 07:22:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1445625856. Throughput: 0: 44937.7. Samples: 1447039000. Policy #0 lag: (min: 0.0, avg: 27.0, max: 58.0) [2024-03-21 07:22:40,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 07:22:45,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1445822464. Throughput: 0: 45204.5. Samples: 1447185800. Policy #0 lag: (min: 0.0, avg: 27.0, max: 58.0) [2024-03-21 07:22:45,522][03784] Avg episode reward: [(0, '1.122')] [2024-03-21 07:22:46,724][04017] Updated weights for policy 0, policy_version 44125 (0.0023) [2024-03-21 07:22:50,521][03784] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1446084608. Throughput: 0: 45211.1. Samples: 1447465600. Policy #0 lag: (min: 0.0, avg: 27.0, max: 58.0) [2024-03-21 07:22:50,523][03784] Avg episode reward: [(0, '1.435')] [2024-03-21 07:22:52,041][04017] Updated weights for policy 0, policy_version 44135 (0.0012) [2024-03-21 07:22:55,521][03784] Fps is (10 sec: 58982.2, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1446412288. Throughput: 0: 44591.1. Samples: 1447728500. Policy #0 lag: (min: 0.0, avg: 27.0, max: 58.0) [2024-03-21 07:22:55,522][03784] Avg episode reward: [(0, '1.195')] [2024-03-21 07:22:58,575][04017] Updated weights for policy 0, policy_version 44145 (0.0020) [2024-03-21 07:23:00,521][03784] Fps is (10 sec: 62258.7, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1446707200. Throughput: 0: 44971.1. Samples: 1447865600. 
Policy #0 lag: (min: 2.0, avg: 34.2, max: 70.0) [2024-03-21 07:23:00,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 07:23:05,521][03784] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1446838272. Throughput: 0: 45015.6. Samples: 1448108700. Policy #0 lag: (min: 2.0, avg: 34.2, max: 70.0) [2024-03-21 07:23:05,522][03784] Avg episode reward: [(0, '1.573')] [2024-03-21 07:23:07,088][04017] Updated weights for policy 0, policy_version 44155 (0.0016) [2024-03-21 07:23:10,521][03784] Fps is (10 sec: 42598.6, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 1447133184. Throughput: 0: 44471.1. Samples: 1448361800. Policy #0 lag: (min: 2.0, avg: 34.2, max: 70.0) [2024-03-21 07:23:10,522][03784] Avg episode reward: [(0, '1.508')] [2024-03-21 07:23:11,585][04017] Updated weights for policy 0, policy_version 44165 (0.0011) [2024-03-21 07:23:13,957][03995] Signal inference workers to stop experience collection... (29150 times) [2024-03-21 07:23:13,979][04017] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-03-21 07:23:14,196][03995] Signal inference workers to resume experience collection... (29150 times) [2024-03-21 07:23:14,196][04017] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-03-21 07:23:15,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44236.9, 300 sec: 45876.0). Total num frames: 1447329792. Throughput: 0: 44933.4. Samples: 1448516300. Policy #0 lag: (min: 2.0, avg: 34.2, max: 70.0) [2024-03-21 07:23:15,522][03784] Avg episode reward: [(0, '1.508')] [2024-03-21 07:23:18,886][04017] Updated weights for policy 0, policy_version 44175 (0.0023) [2024-03-21 07:23:20,521][03784] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 46430.6). Total num frames: 1447624704. Throughput: 0: 45171.1. Samples: 1448787800. 
Policy #0 lag: (min: 2.0, avg: 34.2, max: 70.0) [2024-03-21 07:23:20,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 07:23:25,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 1447755776. Throughput: 0: 45469.0. Samples: 1449085100. Policy #0 lag: (min: 2.0, avg: 34.2, max: 70.0) [2024-03-21 07:23:25,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 07:23:28,445][04017] Updated weights for policy 0, policy_version 44185 (0.0016) [2024-03-21 07:23:30,521][03784] Fps is (10 sec: 29491.2, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1447919616. Throughput: 0: 45406.6. Samples: 1449229100. Policy #0 lag: (min: 0.0, avg: 38.3, max: 90.0) [2024-03-21 07:23:30,522][03784] Avg episode reward: [(0, '1.540')] [2024-03-21 07:23:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1448116224. Throughput: 0: 45637.8. Samples: 1449519300. Policy #0 lag: (min: 0.0, avg: 38.3, max: 90.0) [2024-03-21 07:23:35,522][03784] Avg episode reward: [(0, '1.540')] [2024-03-21 07:23:37,660][04017] Updated weights for policy 0, policy_version 44195 (0.0022) [2024-03-21 07:23:40,521][03784] Fps is (10 sec: 29491.0, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 1448214528. Throughput: 0: 45931.1. Samples: 1449795400. Policy #0 lag: (min: 0.0, avg: 38.3, max: 90.0) [2024-03-21 07:23:40,522][03784] Avg episode reward: [(0, '0.746')] [2024-03-21 07:23:43,855][04017] Updated weights for policy 0, policy_version 44205 (0.0018) [2024-03-21 07:23:45,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 45097.6). Total num frames: 1448574976. Throughput: 0: 45502.3. Samples: 1449913200. Policy #0 lag: (min: 0.0, avg: 38.3, max: 90.0) [2024-03-21 07:23:45,522][03784] Avg episode reward: [(0, '1.429')] [2024-03-21 07:23:50,521][03784] Fps is (10 sec: 58982.8, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1448804352. Throughput: 0: 45462.2. Samples: 1450154500. 
Policy #0 lag: (min: 0.0, avg: 38.3, max: 90.0) [2024-03-21 07:23:50,522][03784] Avg episode reward: [(0, '1.411')] [2024-03-21 07:23:51,370][04017] Updated weights for policy 0, policy_version 44215 (0.0011) [2024-03-21 07:23:55,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42052.3, 300 sec: 45319.8). Total num frames: 1448935424. Throughput: 0: 45946.8. Samples: 1450429400. Policy #0 lag: (min: 0.0, avg: 38.3, max: 90.0) [2024-03-21 07:23:55,522][03784] Avg episode reward: [(0, '1.511')] [2024-03-21 07:23:59,402][04017] Updated weights for policy 0, policy_version 44225 (0.0014) [2024-03-21 07:24:00,521][03784] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 1449263104. Throughput: 0: 45448.7. Samples: 1450561500. Policy #0 lag: (min: 0.0, avg: 43.3, max: 108.0) [2024-03-21 07:24:00,522][03784] Avg episode reward: [(0, '1.377')] [2024-03-21 07:24:00,845][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044229_1449295872.pth... [2024-03-21 07:24:00,991][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000043893_1438285824.pth [2024-03-21 07:24:03,088][04017] Updated weights for policy 0, policy_version 44235 (0.0016) [2024-03-21 07:24:03,740][03995] Signal inference workers to stop experience collection... (29200 times) [2024-03-21 07:24:03,788][04017] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-03-21 07:24:04,021][03995] Signal inference workers to resume experience collection... (29200 times) [2024-03-21 07:24:04,022][04017] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-03-21 07:24:05,521][03784] Fps is (10 sec: 62259.0, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 1449558016. Throughput: 0: 44877.8. Samples: 1450807300. 
Policy #0 lag: (min: 0.0, avg: 43.3, max: 108.0) [2024-03-21 07:24:05,522][03784] Avg episode reward: [(0, '1.314')] [2024-03-21 07:24:10,521][03784] Fps is (10 sec: 49152.4, 60 sec: 43690.7, 300 sec: 45764.1). Total num frames: 1449754624. Throughput: 0: 44457.7. Samples: 1451085700. Policy #0 lag: (min: 0.0, avg: 43.3, max: 108.0) [2024-03-21 07:24:10,522][03784] Avg episode reward: [(0, '1.122')] [2024-03-21 07:24:11,057][04017] Updated weights for policy 0, policy_version 44245 (0.0011) [2024-03-21 07:24:15,521][03784] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1450016768. Throughput: 0: 44431.1. Samples: 1451228500. Policy #0 lag: (min: 0.0, avg: 43.3, max: 108.0) [2024-03-21 07:24:15,522][03784] Avg episode reward: [(0, '1.370')] [2024-03-21 07:24:17,376][04017] Updated weights for policy 0, policy_version 44255 (0.0013) [2024-03-21 07:24:20,521][03784] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 45653.0). Total num frames: 1450180608. Throughput: 0: 43868.9. Samples: 1451493400. Policy #0 lag: (min: 0.0, avg: 43.3, max: 108.0) [2024-03-21 07:24:20,522][03784] Avg episode reward: [(0, '1.443')] [2024-03-21 07:24:25,354][04017] Updated weights for policy 0, policy_version 44265 (0.0021) [2024-03-21 07:24:25,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 1450475520. Throughput: 0: 43673.4. Samples: 1451760700. Policy #0 lag: (min: 1.0, avg: 38.4, max: 76.0) [2024-03-21 07:24:25,522][03784] Avg episode reward: [(0, '0.749')] [2024-03-21 07:24:30,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1450672128. Throughput: 0: 44297.8. Samples: 1451906600. 
Policy #0 lag: (min: 1.0, avg: 38.4, max: 76.0) [2024-03-21 07:24:30,522][03784] Avg episode reward: [(0, '1.587')] [2024-03-21 07:24:34,237][04017] Updated weights for policy 0, policy_version 44275 (0.0017) [2024-03-21 07:24:35,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 1450901504. Throughput: 0: 44966.7. Samples: 1452178000. Policy #0 lag: (min: 1.0, avg: 38.4, max: 76.0) [2024-03-21 07:24:35,522][03784] Avg episode reward: [(0, '1.370')] [2024-03-21 07:24:39,379][04017] Updated weights for policy 0, policy_version 44285 (0.0015) [2024-03-21 07:24:40,521][03784] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 44986.6). Total num frames: 1451196416. Throughput: 0: 44975.4. Samples: 1452453300. Policy #0 lag: (min: 1.0, avg: 38.4, max: 76.0) [2024-03-21 07:24:40,522][03784] Avg episode reward: [(0, '0.683')] [2024-03-21 07:24:45,521][03784] Fps is (10 sec: 52428.6, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1451425792. Throughput: 0: 45017.9. Samples: 1452587300. Policy #0 lag: (min: 1.0, avg: 38.4, max: 76.0) [2024-03-21 07:24:45,522][03784] Avg episode reward: [(0, '0.683')] [2024-03-21 07:24:45,772][04017] Updated weights for policy 0, policy_version 44295 (0.0015) [2024-03-21 07:24:50,521][03784] Fps is (10 sec: 49151.7, 60 sec: 48059.6, 300 sec: 45764.1). Total num frames: 1451687936. Throughput: 0: 46133.2. Samples: 1452883300. Policy #0 lag: (min: 1.0, avg: 38.4, max: 76.0) [2024-03-21 07:24:50,522][03784] Avg episode reward: [(0, '0.683')] [2024-03-21 07:24:52,884][04017] Updated weights for policy 0, policy_version 44305 (0.0018) [2024-03-21 07:24:55,521][03784] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 46208.4). Total num frames: 1451917312. Throughput: 0: 46006.7. Samples: 1453156000. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 72.0) [2024-03-21 07:24:55,522][03784] Avg episode reward: [(0, '1.430')] [2024-03-21 07:25:00,521][03784] Fps is (10 sec: 32768.5, 60 sec: 45875.3, 300 sec: 45986.3). Total num frames: 1452015616. Throughput: 0: 46093.4. Samples: 1453302700. Policy #0 lag: (min: 0.0, avg: 36.6, max: 72.0) [2024-03-21 07:25:00,522][03784] Avg episode reward: [(0, '1.418')] [2024-03-21 07:25:03,038][04017] Updated weights for policy 0, policy_version 44315 (0.0014) [2024-03-21 07:25:05,521][03784] Fps is (10 sec: 29491.4, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 1452212224. Throughput: 0: 46088.9. Samples: 1453567400. Policy #0 lag: (min: 0.0, avg: 36.6, max: 72.0) [2024-03-21 07:25:05,522][03784] Avg episode reward: [(0, '1.478')] [2024-03-21 07:25:08,233][03995] Signal inference workers to stop experience collection... (29250 times) [2024-03-21 07:25:08,371][04017] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-03-21 07:25:08,477][03995] Signal inference workers to resume experience collection... (29250 times) [2024-03-21 07:25:08,477][04017] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-03-21 07:25:10,471][04017] Updated weights for policy 0, policy_version 44325 (0.0012) [2024-03-21 07:25:10,521][03784] Fps is (10 sec: 42598.6, 60 sec: 44783.0, 300 sec: 45653.1). Total num frames: 1452441600. Throughput: 0: 46046.7. Samples: 1453832800. Policy #0 lag: (min: 0.0, avg: 36.6, max: 72.0) [2024-03-21 07:25:10,522][03784] Avg episode reward: [(0, '0.839')] [2024-03-21 07:25:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 42598.5, 300 sec: 45430.9). Total num frames: 1452572672. Throughput: 0: 45831.2. Samples: 1453969000. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 72.0) [2024-03-21 07:25:15,522][03784] Avg episode reward: [(0, '1.659')] [2024-03-21 07:25:18,970][04017] Updated weights for policy 0, policy_version 44335 (0.0011) [2024-03-21 07:25:20,521][03784] Fps is (10 sec: 32767.3, 60 sec: 43144.4, 300 sec: 44875.5). Total num frames: 1452769280. Throughput: 0: 45859.8. Samples: 1454241700. Policy #0 lag: (min: 0.0, avg: 36.6, max: 72.0) [2024-03-21 07:25:20,523][03784] Avg episode reward: [(0, '1.188')] [2024-03-21 07:25:24,821][04017] Updated weights for policy 0, policy_version 44345 (0.0020) [2024-03-21 07:25:25,521][03784] Fps is (10 sec: 55705.5, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1453129728. Throughput: 0: 45622.3. Samples: 1454506300. Policy #0 lag: (min: 0.0, avg: 39.9, max: 81.0) [2024-03-21 07:25:25,522][03784] Avg episode reward: [(0, '0.869')] [2024-03-21 07:25:30,521][03784] Fps is (10 sec: 58983.4, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1453359104. Throughput: 0: 45315.5. Samples: 1454626500. Policy #0 lag: (min: 0.0, avg: 39.9, max: 81.0) [2024-03-21 07:25:30,522][03784] Avg episode reward: [(0, '1.627')] [2024-03-21 07:25:31,399][04017] Updated weights for policy 0, policy_version 44355 (0.0021) [2024-03-21 07:25:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1453588480. Throughput: 0: 44684.6. Samples: 1454894100. Policy #0 lag: (min: 0.0, avg: 39.9, max: 81.0) [2024-03-21 07:25:35,522][03784] Avg episode reward: [(0, '0.754')] [2024-03-21 07:25:38,326][04017] Updated weights for policy 0, policy_version 44365 (0.0009) [2024-03-21 07:25:40,521][03784] Fps is (10 sec: 55705.9, 60 sec: 45329.2, 300 sec: 45653.1). Total num frames: 1453916160. Throughput: 0: 44157.8. Samples: 1455143100. 
Policy #0 lag: (min: 0.0, avg: 39.9, max: 81.0) [2024-03-21 07:25:40,522][03784] Avg episode reward: [(0, '0.841')] [2024-03-21 07:25:43,781][04017] Updated weights for policy 0, policy_version 44375 (0.0010) [2024-03-21 07:25:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1454080000. Throughput: 0: 44037.8. Samples: 1455284400. Policy #0 lag: (min: 0.0, avg: 39.9, max: 81.0) [2024-03-21 07:25:45,523][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 07:25:49,243][04017] Updated weights for policy 0, policy_version 44385 (0.0028) [2024-03-21 07:25:50,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46421.5, 300 sec: 45430.9). Total num frames: 1454473216. Throughput: 0: 44268.9. Samples: 1455559500. Policy #0 lag: (min: 2.0, avg: 43.9, max: 77.0) [2024-03-21 07:25:50,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 07:25:54,004][03995] Signal inference workers to stop experience collection... (29300 times) [2024-03-21 07:25:54,079][04017] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-03-21 07:25:54,300][03995] Signal inference workers to resume experience collection... (29300 times) [2024-03-21 07:25:54,300][04017] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-03-21 07:25:55,521][03784] Fps is (10 sec: 62259.2, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 1454702592. Throughput: 0: 44453.3. Samples: 1455833200. Policy #0 lag: (min: 2.0, avg: 43.9, max: 77.0) [2024-03-21 07:25:55,522][03784] Avg episode reward: [(0, '1.594')] [2024-03-21 07:25:55,625][04017] Updated weights for policy 0, policy_version 44395 (0.0010) [2024-03-21 07:26:00,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46967.4, 300 sec: 45430.9). Total num frames: 1454833664. Throughput: 0: 44771.0. Samples: 1455983700. 
Policy #0 lag: (min: 2.0, avg: 43.9, max: 77.0) [2024-03-21 07:26:00,522][03784] Avg episode reward: [(0, '1.594')] [2024-03-21 07:26:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044398_1454833664.pth... [2024-03-21 07:26:00,702][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044067_1443987456.pth [2024-03-21 07:26:05,521][03784] Fps is (10 sec: 22937.6, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1454931968. Throughput: 0: 44909.1. Samples: 1456262600. Policy #0 lag: (min: 2.0, avg: 43.9, max: 77.0) [2024-03-21 07:26:05,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 07:26:10,521][03784] Fps is (10 sec: 19660.7, 60 sec: 43144.4, 300 sec: 45097.6). Total num frames: 1455030272. Throughput: 0: 45524.3. Samples: 1456554900. Policy #0 lag: (min: 2.0, avg: 43.9, max: 77.0) [2024-03-21 07:26:10,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 07:26:10,961][04017] Updated weights for policy 0, policy_version 44405 (0.0011) [2024-03-21 07:26:15,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 1455325184. Throughput: 0: 46013.3. Samples: 1456697100. Policy #0 lag: (min: 2.0, avg: 43.9, max: 77.0) [2024-03-21 07:26:15,522][03784] Avg episode reward: [(0, '0.686')] [2024-03-21 07:26:16,063][04017] Updated weights for policy 0, policy_version 44415 (0.0018) [2024-03-21 07:26:20,521][03784] Fps is (10 sec: 55706.2, 60 sec: 46967.6, 300 sec: 45097.7). Total num frames: 1455587328. Throughput: 0: 45962.2. Samples: 1456962400. Policy #0 lag: (min: 0.0, avg: 34.3, max: 86.0) [2024-03-21 07:26:20,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 07:26:23,056][04017] Updated weights for policy 0, policy_version 44425 (0.0018) [2024-03-21 07:26:25,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1455816704. Throughput: 0: 46455.5. Samples: 1457233600. 
Policy #0 lag: (min: 0.0, avg: 34.3, max: 86.0) [2024-03-21 07:26:25,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 07:26:29,303][04017] Updated weights for policy 0, policy_version 44435 (0.0010) [2024-03-21 07:26:30,521][03784] Fps is (10 sec: 55705.1, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1456144384. Throughput: 0: 46077.7. Samples: 1457357900. Policy #0 lag: (min: 0.0, avg: 34.3, max: 86.0) [2024-03-21 07:26:30,522][03784] Avg episode reward: [(0, '1.012')] [2024-03-21 07:26:32,836][04017] Updated weights for policy 0, policy_version 44445 (0.0012) [2024-03-21 07:26:35,521][03784] Fps is (10 sec: 75366.3, 60 sec: 49698.1, 300 sec: 46319.5). Total num frames: 1456570368. Throughput: 0: 45619.9. Samples: 1457612400. Policy #0 lag: (min: 0.0, avg: 34.3, max: 86.0) [2024-03-21 07:26:35,522][03784] Avg episode reward: [(0, '1.086')] [2024-03-21 07:26:37,700][04017] Updated weights for policy 0, policy_version 44455 (0.0012) [2024-03-21 07:26:40,521][03784] Fps is (10 sec: 62259.5, 60 sec: 47513.5, 300 sec: 46097.4). Total num frames: 1456766976. Throughput: 0: 46055.5. Samples: 1457905700. Policy #0 lag: (min: 0.0, avg: 34.3, max: 86.0) [2024-03-21 07:26:40,522][03784] Avg episode reward: [(0, '1.086')] [2024-03-21 07:26:45,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1456832512. Throughput: 0: 46266.7. Samples: 1458065700. Policy #0 lag: (min: 0.0, avg: 34.3, max: 86.0) [2024-03-21 07:26:45,522][03784] Avg episode reward: [(0, '1.419')] [2024-03-21 07:26:49,314][04017] Updated weights for policy 0, policy_version 44465 (0.0027) [2024-03-21 07:26:50,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1457061888. Throughput: 0: 46411.1. Samples: 1458351100. 
Policy #0 lag: (min: 1.0, avg: 39.4, max: 85.0) [2024-03-21 07:26:50,522][03784] Avg episode reward: [(0, '0.606')] [2024-03-21 07:26:50,899][03995] Signal inference workers to stop experience collection... (29350 times) [2024-03-21 07:26:50,900][03995] Signal inference workers to resume experience collection... (29350 times) [2024-03-21 07:26:50,945][04017] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-03-21 07:26:50,945][04017] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-03-21 07:26:55,521][03784] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1457324032. Throughput: 0: 46004.6. Samples: 1458625100. Policy #0 lag: (min: 1.0, avg: 39.4, max: 85.0) [2024-03-21 07:26:55,522][03784] Avg episode reward: [(0, '1.283')] [2024-03-21 07:26:55,625][04017] Updated weights for policy 0, policy_version 44475 (0.0012) [2024-03-21 07:27:00,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1457586176. Throughput: 0: 46028.9. Samples: 1458768400. Policy #0 lag: (min: 1.0, avg: 39.4, max: 85.0) [2024-03-21 07:27:00,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 07:27:02,030][04017] Updated weights for policy 0, policy_version 44485 (0.0011) [2024-03-21 07:27:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1457750016. Throughput: 0: 46020.0. Samples: 1459033300. Policy #0 lag: (min: 1.0, avg: 39.4, max: 85.0) [2024-03-21 07:27:05,523][03784] Avg episode reward: [(0, '0.867')] [2024-03-21 07:27:10,521][03784] Fps is (10 sec: 26214.5, 60 sec: 46967.6, 300 sec: 44653.4). Total num frames: 1457848320. Throughput: 0: 46802.3. Samples: 1459339700. 
Policy #0 lag: (min: 1.0, avg: 39.4, max: 85.0) [2024-03-21 07:27:10,522][03784] Avg episode reward: [(0, '0.867')] [2024-03-21 07:27:12,834][04017] Updated weights for policy 0, policy_version 44495 (0.0016) [2024-03-21 07:27:15,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1458143232. Throughput: 0: 46951.2. Samples: 1459470700. Policy #0 lag: (min: 0.0, avg: 34.7, max: 86.0) [2024-03-21 07:27:15,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 07:27:18,069][04017] Updated weights for policy 0, policy_version 44505 (0.0012) [2024-03-21 07:27:20,521][03784] Fps is (10 sec: 65535.3, 60 sec: 48605.8, 300 sec: 45764.1). Total num frames: 1458503680. Throughput: 0: 47137.7. Samples: 1459733600. Policy #0 lag: (min: 0.0, avg: 34.7, max: 86.0) [2024-03-21 07:27:20,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 07:27:25,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1458634752. Throughput: 0: 46237.7. Samples: 1459986400. Policy #0 lag: (min: 0.0, avg: 34.7, max: 86.0) [2024-03-21 07:27:25,522][03784] Avg episode reward: [(0, '0.480')] [2024-03-21 07:27:25,689][04017] Updated weights for policy 0, policy_version 44515 (0.0014) [2024-03-21 07:27:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1458765824. Throughput: 0: 45951.1. Samples: 1460133500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 86.0) [2024-03-21 07:27:30,522][03784] Avg episode reward: [(0, '1.182')] [2024-03-21 07:27:35,521][03784] Fps is (10 sec: 32768.5, 60 sec: 39867.8, 300 sec: 45208.8). Total num frames: 1458962432. Throughput: 0: 46284.5. Samples: 1460433900. 
Policy #0 lag: (min: 0.0, avg: 34.7, max: 86.0) [2024-03-21 07:27:35,522][03784] Avg episode reward: [(0, '1.380')] [2024-03-21 07:27:35,556][04017] Updated weights for policy 0, policy_version 44525 (0.0012) [2024-03-21 07:27:40,179][04017] Updated weights for policy 0, policy_version 44535 (0.0020) [2024-03-21 07:27:40,521][03784] Fps is (10 sec: 58982.9, 60 sec: 43144.6, 300 sec: 45875.2). Total num frames: 1459355648. Throughput: 0: 45724.4. Samples: 1460682700. Policy #0 lag: (min: 0.0, avg: 34.7, max: 86.0) [2024-03-21 07:27:40,522][03784] Avg episode reward: [(0, '1.415')] [2024-03-21 07:27:42,010][03995] Signal inference workers to stop experience collection... (29400 times) [2024-03-21 07:27:42,084][04017] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-03-21 07:27:42,287][03995] Signal inference workers to resume experience collection... (29400 times) [2024-03-21 07:27:42,287][04017] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-03-21 07:27:43,565][04017] Updated weights for policy 0, policy_version 44545 (0.0013) [2024-03-21 07:27:45,521][03784] Fps is (10 sec: 75366.4, 60 sec: 48059.8, 300 sec: 46208.4). Total num frames: 1459716096. Throughput: 0: 45211.2. Samples: 1460802900. Policy #0 lag: (min: 7.0, avg: 45.1, max: 85.0) [2024-03-21 07:27:45,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 07:27:50,521][03784] Fps is (10 sec: 55705.4, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 1459912704. Throughput: 0: 45442.2. Samples: 1461078200. Policy #0 lag: (min: 7.0, avg: 45.1, max: 85.0) [2024-03-21 07:27:50,522][03784] Avg episode reward: [(0, '0.961')] [2024-03-21 07:27:51,926][04017] Updated weights for policy 0, policy_version 44555 (0.0015) [2024-03-21 07:27:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.4, 300 sec: 45542.0). Total num frames: 1460142080. Throughput: 0: 44848.8. Samples: 1461357900. 
Policy #0 lag: (min: 7.0, avg: 45.1, max: 85.0) [2024-03-21 07:27:55,522][03784] Avg episode reward: [(0, '1.024')] [2024-03-21 07:28:00,154][04017] Updated weights for policy 0, policy_version 44565 (0.0011) [2024-03-21 07:28:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1460305920. Throughput: 0: 44917.7. Samples: 1461492000. Policy #0 lag: (min: 7.0, avg: 45.1, max: 85.0) [2024-03-21 07:28:00,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 07:28:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044565_1460305920.pth... [2024-03-21 07:28:00,684][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044229_1449295872.pth [2024-03-21 07:28:05,521][03784] Fps is (10 sec: 26214.6, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1460404224. Throughput: 0: 45329.0. Samples: 1461773400. Policy #0 lag: (min: 7.0, avg: 45.1, max: 85.0) [2024-03-21 07:28:05,522][03784] Avg episode reward: [(0, '1.255')] [2024-03-21 07:28:08,797][04017] Updated weights for policy 0, policy_version 44575 (0.0015) [2024-03-21 07:28:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 1460666368. Throughput: 0: 45557.8. Samples: 1462036500. Policy #0 lag: (min: 7.0, avg: 45.1, max: 85.0) [2024-03-21 07:28:10,522][03784] Avg episode reward: [(0, '0.923')] [2024-03-21 07:28:13,957][04017] Updated weights for policy 0, policy_version 44585 (0.0011) [2024-03-21 07:28:15,521][03784] Fps is (10 sec: 55704.8, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 1460961280. Throughput: 0: 45177.7. Samples: 1462166500. Policy #0 lag: (min: 1.0, avg: 56.4, max: 105.0) [2024-03-21 07:28:15,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 07:28:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1461125120. Throughput: 0: 45144.3. Samples: 1462465400. 
Policy #0 lag: (min: 1.0, avg: 56.4, max: 105.0) [2024-03-21 07:28:20,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 07:28:24,183][04017] Updated weights for policy 0, policy_version 44595 (0.0010) [2024-03-21 07:28:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1461387264. Throughput: 0: 45611.0. Samples: 1462735200. Policy #0 lag: (min: 1.0, avg: 56.4, max: 105.0) [2024-03-21 07:28:25,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 07:28:29,595][04017] Updated weights for policy 0, policy_version 44605 (0.0012) [2024-03-21 07:28:30,521][03784] Fps is (10 sec: 55706.4, 60 sec: 48606.0, 300 sec: 45986.3). Total num frames: 1461682176. Throughput: 0: 45608.9. Samples: 1462855300. Policy #0 lag: (min: 1.0, avg: 56.4, max: 105.0) [2024-03-21 07:28:30,521][03784] Avg episode reward: [(0, '1.313')] [2024-03-21 07:28:33,595][04017] Updated weights for policy 0, policy_version 44615 (0.0015) [2024-03-21 07:28:35,521][03784] Fps is (10 sec: 62259.6, 60 sec: 50790.4, 300 sec: 46763.8). Total num frames: 1462009856. Throughput: 0: 45351.1. Samples: 1463119000. Policy #0 lag: (min: 1.0, avg: 56.4, max: 105.0) [2024-03-21 07:28:35,522][03784] Avg episode reward: [(0, '1.313')] [2024-03-21 07:28:40,521][03784] Fps is (10 sec: 42598.0, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 1462108160. Throughput: 0: 45162.2. Samples: 1463390200. Policy #0 lag: (min: 1.0, avg: 56.4, max: 105.0) [2024-03-21 07:28:40,522][03784] Avg episode reward: [(0, '1.098')] [2024-03-21 07:28:44,256][03995] Signal inference workers to stop experience collection... (29450 times) [2024-03-21 07:28:44,332][04017] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-03-21 07:28:44,523][03995] Signal inference workers to resume experience collection... 
(29450 times) [2024-03-21 07:28:44,523][04017] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-03-21 07:28:45,521][03784] Fps is (10 sec: 19660.8, 60 sec: 41506.1, 300 sec: 45430.9). Total num frames: 1462206464. Throughput: 0: 45249.0. Samples: 1463528200. Policy #0 lag: (min: 0.0, avg: 38.2, max: 97.0) [2024-03-21 07:28:45,522][03784] Avg episode reward: [(0, '1.395')] [2024-03-21 07:28:47,189][04017] Updated weights for policy 0, policy_version 44625 (0.0012) [2024-03-21 07:28:50,521][03784] Fps is (10 sec: 22937.4, 60 sec: 40413.8, 300 sec: 45430.9). Total num frames: 1462337536. Throughput: 0: 44328.8. Samples: 1463768200. Policy #0 lag: (min: 0.0, avg: 38.2, max: 97.0) [2024-03-21 07:28:50,522][03784] Avg episode reward: [(0, '1.457')] [2024-03-21 07:28:55,521][03784] Fps is (10 sec: 36045.0, 60 sec: 40413.9, 300 sec: 45097.7). Total num frames: 1462566912. Throughput: 0: 44186.8. Samples: 1464024900. Policy #0 lag: (min: 0.0, avg: 38.2, max: 97.0) [2024-03-21 07:28:55,522][03784] Avg episode reward: [(0, '0.696')] [2024-03-21 07:28:55,584][04017] Updated weights for policy 0, policy_version 44635 (0.0012) [2024-03-21 07:29:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 44653.3). Total num frames: 1462730752. Throughput: 0: 44284.4. Samples: 1464159300. Policy #0 lag: (min: 0.0, avg: 38.2, max: 97.0) [2024-03-21 07:29:00,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 07:29:04,710][04017] Updated weights for policy 0, policy_version 44645 (0.0014) [2024-03-21 07:29:05,521][03784] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 44875.5). Total num frames: 1462992896. Throughput: 0: 43475.6. Samples: 1464421800. 
Policy #0 lag: (min: 0.0, avg: 38.2, max: 97.0) [2024-03-21 07:29:05,522][03784] Avg episode reward: [(0, '1.322')] [2024-03-21 07:29:08,536][04017] Updated weights for policy 0, policy_version 44655 (0.0016) [2024-03-21 07:29:10,521][03784] Fps is (10 sec: 55706.3, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1463287808. Throughput: 0: 43084.5. Samples: 1464674000. Policy #0 lag: (min: 0.0, avg: 38.2, max: 97.0) [2024-03-21 07:29:10,522][03784] Avg episode reward: [(0, '0.961')] [2024-03-21 07:29:15,521][03784] Fps is (10 sec: 55705.0, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1463549952. Throughput: 0: 43419.8. Samples: 1464809200. Policy #0 lag: (min: 0.0, avg: 28.9, max: 61.0) [2024-03-21 07:29:15,522][03784] Avg episode reward: [(0, '0.955')] [2024-03-21 07:29:15,721][04017] Updated weights for policy 0, policy_version 44665 (0.0013) [2024-03-21 07:29:20,521][03784] Fps is (10 sec: 52428.0, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1463812096. Throughput: 0: 43504.3. Samples: 1465076700. Policy #0 lag: (min: 0.0, avg: 28.9, max: 61.0) [2024-03-21 07:29:20,522][03784] Avg episode reward: [(0, '1.569')] [2024-03-21 07:29:21,516][04017] Updated weights for policy 0, policy_version 44675 (0.0012) [2024-03-21 07:29:25,521][03784] Fps is (10 sec: 58983.5, 60 sec: 45875.3, 300 sec: 45653.1). Total num frames: 1464139776. Throughput: 0: 43544.5. Samples: 1465349700. Policy #0 lag: (min: 0.0, avg: 28.9, max: 61.0) [2024-03-21 07:29:25,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 07:29:26,242][04017] Updated weights for policy 0, policy_version 44685 (0.0015) [2024-03-21 07:29:29,579][03995] Signal inference workers to stop experience collection... (29500 times) [2024-03-21 07:29:29,634][04017] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-03-21 07:29:29,818][03995] Signal inference workers to resume experience collection... 
(29500 times) [2024-03-21 07:29:29,818][04017] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-03-21 07:29:30,402][04017] Updated weights for policy 0, policy_version 44695 (0.0020) [2024-03-21 07:29:30,521][03784] Fps is (10 sec: 75367.2, 60 sec: 48059.6, 300 sec: 46319.5). Total num frames: 1464565760. Throughput: 0: 43542.2. Samples: 1465487600. Policy #0 lag: (min: 0.0, avg: 28.9, max: 61.0) [2024-03-21 07:29:30,522][03784] Avg episode reward: [(0, '1.280')] [2024-03-21 07:29:35,521][03784] Fps is (10 sec: 52428.3, 60 sec: 44236.7, 300 sec: 45653.0). Total num frames: 1464664064. Throughput: 0: 44340.0. Samples: 1465763500. Policy #0 lag: (min: 0.0, avg: 28.9, max: 61.0) [2024-03-21 07:29:35,522][03784] Avg episode reward: [(0, '0.469')] [2024-03-21 07:29:40,521][03784] Fps is (10 sec: 16384.1, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 1464729600. Throughput: 0: 44928.9. Samples: 1466046700. Policy #0 lag: (min: 0.0, avg: 28.9, max: 61.0) [2024-03-21 07:29:40,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 07:29:45,521][03784] Fps is (10 sec: 19660.9, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1464860672. Throughput: 0: 45213.4. Samples: 1466193900. Policy #0 lag: (min: 0.0, avg: 33.0, max: 72.0) [2024-03-21 07:29:45,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 07:29:45,567][04017] Updated weights for policy 0, policy_version 44705 (0.0015) [2024-03-21 07:29:50,521][03784] Fps is (10 sec: 45874.2, 60 sec: 47513.5, 300 sec: 44986.6). Total num frames: 1465188352. Throughput: 0: 45259.9. Samples: 1466458500. Policy #0 lag: (min: 0.0, avg: 33.0, max: 72.0) [2024-03-21 07:29:50,522][03784] Avg episode reward: [(0, '0.896')] [2024-03-21 07:29:50,752][04017] Updated weights for policy 0, policy_version 44715 (0.0016) [2024-03-21 07:29:55,521][03784] Fps is (10 sec: 55706.0, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 1465417728. Throughput: 0: 45593.4. Samples: 1466725700. 
Policy #0 lag: (min: 0.0, avg: 33.0, max: 72.0) [2024-03-21 07:29:55,522][03784] Avg episode reward: [(0, '1.588')] [2024-03-21 07:30:00,521][03784] Fps is (10 sec: 29491.4, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1465483264. Throughput: 0: 46015.6. Samples: 1466879900. Policy #0 lag: (min: 0.0, avg: 33.0, max: 72.0) [2024-03-21 07:30:00,522][03784] Avg episode reward: [(0, '0.972')] [2024-03-21 07:30:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044723_1465483264.pth... [2024-03-21 07:30:00,684][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044398_1454833664.pth [2024-03-21 07:30:01,373][04017] Updated weights for policy 0, policy_version 44725 (0.0012) [2024-03-21 07:30:05,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 1465679872. Throughput: 0: 46106.9. Samples: 1467151500. Policy #0 lag: (min: 0.0, avg: 33.0, max: 72.0) [2024-03-21 07:30:05,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 07:30:09,565][04017] Updated weights for policy 0, policy_version 44735 (0.0016) [2024-03-21 07:30:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1465942016. Throughput: 0: 46088.8. Samples: 1467423700. Policy #0 lag: (min: 0.0, avg: 33.0, max: 72.0) [2024-03-21 07:30:10,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 07:30:13,678][04017] Updated weights for policy 0, policy_version 44745 (0.0012) [2024-03-21 07:30:15,521][03784] Fps is (10 sec: 68812.2, 60 sec: 46967.6, 300 sec: 46097.4). Total num frames: 1466368000. Throughput: 0: 45895.6. Samples: 1467552900. Policy #0 lag: (min: 1.0, avg: 35.3, max: 78.0) [2024-03-21 07:30:15,522][03784] Avg episode reward: [(0, '0.987')] [2024-03-21 07:30:18,151][04017] Updated weights for policy 0, policy_version 44755 (0.0020) [2024-03-21 07:30:20,521][03784] Fps is (10 sec: 65536.0, 60 sec: 46421.4, 300 sec: 45653.0). 
Total num frames: 1466597376. Throughput: 0: 45373.3. Samples: 1467805300. Policy #0 lag: (min: 1.0, avg: 35.3, max: 78.0) [2024-03-21 07:30:20,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 07:30:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1466826752. Throughput: 0: 45355.5. Samples: 1468087700. Policy #0 lag: (min: 1.0, avg: 35.3, max: 78.0) [2024-03-21 07:30:25,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 07:30:26,098][04017] Updated weights for policy 0, policy_version 44765 (0.0012) [2024-03-21 07:30:30,382][03995] Signal inference workers to stop experience collection... (29550 times) [2024-03-21 07:30:30,383][03995] Signal inference workers to resume experience collection... (29550 times) [2024-03-21 07:30:30,446][04017] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-03-21 07:30:30,446][04017] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-03-21 07:30:30,521][03784] Fps is (10 sec: 49152.5, 60 sec: 42052.3, 300 sec: 45764.1). Total num frames: 1467088896. Throughput: 0: 45302.3. Samples: 1468232500. Policy #0 lag: (min: 1.0, avg: 35.3, max: 78.0) [2024-03-21 07:30:30,522][03784] Avg episode reward: [(0, '1.121')] [2024-03-21 07:30:31,955][04017] Updated weights for policy 0, policy_version 44775 (0.0010) [2024-03-21 07:30:35,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1467285504. Throughput: 0: 45462.3. Samples: 1468504300. Policy #0 lag: (min: 1.0, avg: 35.3, max: 78.0) [2024-03-21 07:30:35,522][03784] Avg episode reward: [(0, '1.147')] [2024-03-21 07:30:38,125][04017] Updated weights for policy 0, policy_version 44785 (0.0011) [2024-03-21 07:30:40,521][03784] Fps is (10 sec: 55704.9, 60 sec: 48605.8, 300 sec: 45986.3). Total num frames: 1467645952. Throughput: 0: 45411.0. Samples: 1468769200. 
Policy #0 lag: (min: 5.0, avg: 49.7, max: 96.0) [2024-03-21 07:30:40,522][03784] Avg episode reward: [(0, '1.177')] [2024-03-21 07:30:45,521][03784] Fps is (10 sec: 52428.6, 60 sec: 49151.9, 300 sec: 45208.7). Total num frames: 1467809792. Throughput: 0: 45337.8. Samples: 1468920100. Policy #0 lag: (min: 5.0, avg: 49.7, max: 96.0) [2024-03-21 07:30:45,522][03784] Avg episode reward: [(0, '0.843')] [2024-03-21 07:30:45,816][04017] Updated weights for policy 0, policy_version 44795 (0.0017) [2024-03-21 07:30:50,521][03784] Fps is (10 sec: 22937.6, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 1467875328. Throughput: 0: 45764.2. Samples: 1469210900. Policy #0 lag: (min: 5.0, avg: 49.7, max: 96.0) [2024-03-21 07:30:50,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 07:30:55,521][03784] Fps is (10 sec: 32768.3, 60 sec: 45329.0, 300 sec: 45097.7). Total num frames: 1468137472. Throughput: 0: 46144.5. Samples: 1469500200. Policy #0 lag: (min: 5.0, avg: 49.7, max: 96.0) [2024-03-21 07:30:55,522][03784] Avg episode reward: [(0, '1.321')] [2024-03-21 07:30:56,731][04017] Updated weights for policy 0, policy_version 44805 (0.0017) [2024-03-21 07:31:00,521][03784] Fps is (10 sec: 49152.6, 60 sec: 48059.8, 300 sec: 45542.0). Total num frames: 1468366848. Throughput: 0: 46086.7. Samples: 1469626800. Policy #0 lag: (min: 5.0, avg: 49.7, max: 96.0) [2024-03-21 07:31:00,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 07:31:01,512][04017] Updated weights for policy 0, policy_version 44815 (0.0011) [2024-03-21 07:31:05,521][03784] Fps is (10 sec: 49151.9, 60 sec: 49151.9, 300 sec: 46097.4). Total num frames: 1468628992. Throughput: 0: 46562.2. Samples: 1469900600. 
Policy #0 lag: (min: 5.0, avg: 49.7, max: 96.0) [2024-03-21 07:31:05,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 07:31:09,194][04017] Updated weights for policy 0, policy_version 44825 (0.0012) [2024-03-21 07:31:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 1468858368. Throughput: 0: 46722.3. Samples: 1470190200. Policy #0 lag: (min: 0.0, avg: 34.8, max: 88.0) [2024-03-21 07:31:10,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 07:31:15,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 1469120512. Throughput: 0: 46866.6. Samples: 1470341500. Policy #0 lag: (min: 0.0, avg: 34.8, max: 88.0) [2024-03-21 07:31:15,522][03784] Avg episode reward: [(0, '1.299')] [2024-03-21 07:31:17,463][04017] Updated weights for policy 0, policy_version 44835 (0.0015) [2024-03-21 07:31:20,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 1469317120. Throughput: 0: 47286.7. Samples: 1470632200. Policy #0 lag: (min: 0.0, avg: 34.8, max: 88.0) [2024-03-21 07:31:20,522][03784] Avg episode reward: [(0, '0.946')] [2024-03-21 07:31:22,758][04017] Updated weights for policy 0, policy_version 44845 (0.0012) [2024-03-21 07:31:25,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1469579264. Throughput: 0: 47415.7. Samples: 1470902900. Policy #0 lag: (min: 0.0, avg: 34.8, max: 88.0) [2024-03-21 07:31:25,522][03784] Avg episode reward: [(0, '1.079')] [2024-03-21 07:31:28,346][03995] Signal inference workers to stop experience collection... (29600 times) [2024-03-21 07:31:28,415][04017] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-03-21 07:31:28,684][03995] Signal inference workers to resume experience collection... 
(29600 times) [2024-03-21 07:31:28,684][04017] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-03-21 07:31:30,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1469775872. Throughput: 0: 47169.0. Samples: 1471042700. Policy #0 lag: (min: 0.0, avg: 34.8, max: 88.0) [2024-03-21 07:31:30,522][03784] Avg episode reward: [(0, '1.394')] [2024-03-21 07:31:30,961][04017] Updated weights for policy 0, policy_version 44855 (0.0013) [2024-03-21 07:31:35,521][03784] Fps is (10 sec: 45874.7, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1470038016. Throughput: 0: 46482.2. Samples: 1471302600. Policy #0 lag: (min: 0.0, avg: 34.8, max: 88.0) [2024-03-21 07:31:35,522][03784] Avg episode reward: [(0, '0.888')] [2024-03-21 07:31:36,434][04017] Updated weights for policy 0, policy_version 44865 (0.0011) [2024-03-21 07:31:40,521][03784] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 1470300160. Throughput: 0: 46246.6. Samples: 1471581300. Policy #0 lag: (min: 0.0, avg: 46.8, max: 101.0) [2024-03-21 07:31:40,522][03784] Avg episode reward: [(0, '0.700')] [2024-03-21 07:31:43,437][04017] Updated weights for policy 0, policy_version 44875 (0.0018) [2024-03-21 07:31:45,521][03784] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1470562304. Throughput: 0: 46433.2. Samples: 1471716300. Policy #0 lag: (min: 0.0, avg: 46.8, max: 101.0) [2024-03-21 07:31:45,522][03784] Avg episode reward: [(0, '0.841')] [2024-03-21 07:31:50,398][04017] Updated weights for policy 0, policy_version 44885 (0.0024) [2024-03-21 07:31:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 45653.0). Total num frames: 1470791680. Throughput: 0: 46884.4. Samples: 1472010400. 
Policy #0 lag: (min: 0.0, avg: 46.8, max: 101.0) [2024-03-21 07:31:50,522][03784] Avg episode reward: [(0, '1.331')] [2024-03-21 07:31:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48605.8, 300 sec: 45653.0). Total num frames: 1471053824. Throughput: 0: 46437.7. Samples: 1472279900. Policy #0 lag: (min: 0.0, avg: 46.8, max: 101.0) [2024-03-21 07:31:55,522][03784] Avg episode reward: [(0, '0.989')] [2024-03-21 07:31:57,014][04017] Updated weights for policy 0, policy_version 44895 (0.0020) [2024-03-21 07:32:00,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48605.8, 300 sec: 45875.2). Total num frames: 1471283200. Throughput: 0: 46246.7. Samples: 1472422600. Policy #0 lag: (min: 0.0, avg: 46.8, max: 101.0) [2024-03-21 07:32:00,522][03784] Avg episode reward: [(0, '1.090')] [2024-03-21 07:32:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044900_1471283200.pth... [2024-03-21 07:32:00,687][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044565_1460305920.pth [2024-03-21 07:32:02,441][04017] Updated weights for policy 0, policy_version 44905 (0.0010) [2024-03-21 07:32:05,521][03784] Fps is (10 sec: 45875.6, 60 sec: 48059.8, 300 sec: 46319.5). Total num frames: 1471512576. Throughput: 0: 45262.3. Samples: 1472669000. Policy #0 lag: (min: 0.0, avg: 46.8, max: 101.0) [2024-03-21 07:32:05,522][03784] Avg episode reward: [(0, '1.442')] [2024-03-21 07:32:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 1471578112. Throughput: 0: 45848.8. Samples: 1472966100. Policy #0 lag: (min: 0.0, avg: 49.2, max: 103.0) [2024-03-21 07:32:10,522][03784] Avg episode reward: [(0, '1.533')] [2024-03-21 07:32:15,521][03784] Fps is (10 sec: 22937.5, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1471741952. Throughput: 0: 46042.2. Samples: 1473114600. 
Policy #0 lag: (min: 0.0, avg: 49.2, max: 103.0) [2024-03-21 07:32:15,522][03784] Avg episode reward: [(0, '0.702')] [2024-03-21 07:32:15,793][04017] Updated weights for policy 0, policy_version 44915 (0.0015) [2024-03-21 07:32:19,370][03995] Signal inference workers to stop experience collection... (29650 times) [2024-03-21 07:32:19,473][04017] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-03-21 07:32:19,756][03995] Signal inference workers to resume experience collection... (29650 times) [2024-03-21 07:32:19,756][04017] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-03-21 07:32:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1472036864. Throughput: 0: 45844.5. Samples: 1473365600. Policy #0 lag: (min: 0.0, avg: 49.2, max: 103.0) [2024-03-21 07:32:20,522][03784] Avg episode reward: [(0, '0.718')] [2024-03-21 07:32:21,106][04017] Updated weights for policy 0, policy_version 44925 (0.0016) [2024-03-21 07:32:25,521][03784] Fps is (10 sec: 52428.6, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1472266240. Throughput: 0: 44900.0. Samples: 1473601800. Policy #0 lag: (min: 0.0, avg: 49.2, max: 103.0) [2024-03-21 07:32:25,522][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 07:32:29,664][04017] Updated weights for policy 0, policy_version 44935 (0.0012) [2024-03-21 07:32:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 1472495616. Throughput: 0: 44786.7. Samples: 1473731700. Policy #0 lag: (min: 0.0, avg: 49.2, max: 103.0) [2024-03-21 07:32:30,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 07:32:35,521][03784] Fps is (10 sec: 29491.3, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 1472561152. Throughput: 0: 44871.1. Samples: 1474029600. 
Policy #0 lag: (min: 0.0, avg: 49.2, max: 103.0) [2024-03-21 07:32:35,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 07:32:38,641][04017] Updated weights for policy 0, policy_version 44945 (0.0012) [2024-03-21 07:32:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 44542.3). Total num frames: 1472856064. Throughput: 0: 44968.9. Samples: 1474303500. Policy #0 lag: (min: 1.0, avg: 47.9, max: 97.0) [2024-03-21 07:32:40,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 07:32:43,270][04017] Updated weights for policy 0, policy_version 44955 (0.0011) [2024-03-21 07:32:45,521][03784] Fps is (10 sec: 65536.1, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 1473216512. Throughput: 0: 44668.9. Samples: 1474432700. Policy #0 lag: (min: 1.0, avg: 47.9, max: 97.0) [2024-03-21 07:32:45,522][03784] Avg episode reward: [(0, '0.902')] [2024-03-21 07:32:47,451][04017] Updated weights for policy 0, policy_version 44965 (0.0019) [2024-03-21 07:32:50,521][03784] Fps is (10 sec: 68811.7, 60 sec: 45875.1, 300 sec: 45430.9). Total num frames: 1473544192. Throughput: 0: 45379.8. Samples: 1474711100. Policy #0 lag: (min: 1.0, avg: 47.9, max: 97.0) [2024-03-21 07:32:50,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 07:32:55,521][03784] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1473675264. Throughput: 0: 44582.2. Samples: 1474972300. Policy #0 lag: (min: 1.0, avg: 47.9, max: 97.0) [2024-03-21 07:32:55,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 07:32:56,007][04017] Updated weights for policy 0, policy_version 44975 (0.0014) [2024-03-21 07:33:00,521][03784] Fps is (10 sec: 39322.2, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1473937408. Throughput: 0: 44433.3. Samples: 1475114100. 
Policy #0 lag: (min: 1.0, avg: 47.9, max: 97.0) [2024-03-21 07:33:00,522][03784] Avg episode reward: [(0, '1.497')] [2024-03-21 07:33:03,243][04017] Updated weights for policy 0, policy_version 44985 (0.0011) [2024-03-21 07:33:05,521][03784] Fps is (10 sec: 52429.3, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1474199552. Throughput: 0: 44986.7. Samples: 1475390000. Policy #0 lag: (min: 1.0, avg: 47.9, max: 97.0) [2024-03-21 07:33:05,522][03784] Avg episode reward: [(0, '1.754')] [2024-03-21 07:33:09,180][04017] Updated weights for policy 0, policy_version 44995 (0.0012) [2024-03-21 07:33:09,699][03995] Signal inference workers to stop experience collection... (29700 times) [2024-03-21 07:33:09,815][03995] Signal inference workers to resume experience collection... (29700 times) [2024-03-21 07:33:09,863][04017] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-03-21 07:33:09,900][04017] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-03-21 07:33:10,521][03784] Fps is (10 sec: 58982.2, 60 sec: 49152.0, 300 sec: 45986.3). Total num frames: 1474527232. Throughput: 0: 45348.9. Samples: 1475642500. Policy #0 lag: (min: 0.0, avg: 47.4, max: 93.0) [2024-03-21 07:33:10,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 07:33:15,521][03784] Fps is (10 sec: 36044.2, 60 sec: 46967.4, 300 sec: 45542.0). Total num frames: 1474560000. Throughput: 0: 45715.5. Samples: 1475788900. Policy #0 lag: (min: 0.0, avg: 47.4, max: 93.0) [2024-03-21 07:33:15,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 07:33:19,289][04017] Updated weights for policy 0, policy_version 45005 (0.0010) [2024-03-21 07:33:20,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1474789376. Throughput: 0: 45337.8. Samples: 1476069800. 
Policy #0 lag: (min: 0.0, avg: 47.4, max: 93.0) [2024-03-21 07:33:20,522][03784] Avg episode reward: [(0, '1.707')] [2024-03-21 07:33:25,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1475018752. Throughput: 0: 45008.9. Samples: 1476328900. Policy #0 lag: (min: 0.0, avg: 47.4, max: 93.0) [2024-03-21 07:33:25,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 07:33:25,987][04017] Updated weights for policy 0, policy_version 45015 (0.0011) [2024-03-21 07:33:30,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 1475182592. Throughput: 0: 44995.6. Samples: 1476457500. Policy #0 lag: (min: 0.0, avg: 47.4, max: 93.0) [2024-03-21 07:33:30,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 07:33:35,521][03784] Fps is (10 sec: 26214.8, 60 sec: 45329.1, 300 sec: 44653.4). Total num frames: 1475280896. Throughput: 0: 45049.2. Samples: 1476738300. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:33:35,522][03784] Avg episode reward: [(0, '1.581')] [2024-03-21 07:33:36,707][04017] Updated weights for policy 0, policy_version 45025 (0.0017) [2024-03-21 07:33:40,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1475641344. Throughput: 0: 44675.6. Samples: 1476982700. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:33:40,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 07:33:43,398][04017] Updated weights for policy 0, policy_version 45035 (0.0018) [2024-03-21 07:33:45,521][03784] Fps is (10 sec: 55704.9, 60 sec: 43690.6, 300 sec: 45764.1). Total num frames: 1475837952. Throughput: 0: 44786.7. Samples: 1477129500. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:33:45,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 07:33:50,521][03784] Fps is (10 sec: 26214.3, 60 sec: 39321.7, 300 sec: 45208.7). Total num frames: 1475903488. Throughput: 0: 45204.3. Samples: 1477424200. 
Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:33:50,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 07:33:52,411][04017] Updated weights for policy 0, policy_version 45045 (0.0018) [2024-03-21 07:33:55,521][03784] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 1476296704. Throughput: 0: 44900.0. Samples: 1477663000. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:33:55,522][03784] Avg episode reward: [(0, '1.118')] [2024-03-21 07:33:57,853][04017] Updated weights for policy 0, policy_version 45055 (0.0010) [2024-03-21 07:34:00,521][03784] Fps is (10 sec: 58981.7, 60 sec: 42598.3, 300 sec: 45764.1). Total num frames: 1476493312. Throughput: 0: 44902.2. Samples: 1477809500. Policy #0 lag: (min: 0.0, avg: 33.8, max: 75.0) [2024-03-21 07:34:00,523][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 07:34:00,800][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045060_1476526080.pth... [2024-03-21 07:34:00,927][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044723_1465483264.pth [2024-03-21 07:34:01,632][03995] Signal inference workers to stop experience collection... (29750 times) [2024-03-21 07:34:01,760][04017] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-03-21 07:34:01,805][03995] Signal inference workers to resume experience collection... (29750 times) [2024-03-21 07:34:01,829][04017] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-03-21 07:34:02,559][04017] Updated weights for policy 0, policy_version 45065 (0.0014) [2024-03-21 07:34:05,521][03784] Fps is (10 sec: 58982.8, 60 sec: 44782.9, 300 sec: 46097.4). Total num frames: 1476886528. Throughput: 0: 43826.6. Samples: 1478042000. 
Policy #0 lag: (min: 7.0, avg: 63.5, max: 127.0) [2024-03-21 07:34:05,522][03784] Avg episode reward: [(0, '1.073')] [2024-03-21 07:34:08,203][04017] Updated weights for policy 0, policy_version 45075 (0.0020) [2024-03-21 07:34:10,522][03784] Fps is (10 sec: 62255.0, 60 sec: 43143.9, 300 sec: 45986.2). Total num frames: 1477115904. Throughput: 0: 44030.3. Samples: 1478310300. Policy #0 lag: (min: 7.0, avg: 63.5, max: 127.0) [2024-03-21 07:34:10,523][03784] Avg episode reward: [(0, '1.358')] [2024-03-21 07:34:14,264][04017] Updated weights for policy 0, policy_version 45085 (0.0016) [2024-03-21 07:34:15,521][03784] Fps is (10 sec: 52429.4, 60 sec: 47513.8, 300 sec: 46097.4). Total num frames: 1477410816. Throughput: 0: 43984.5. Samples: 1478436800. Policy #0 lag: (min: 7.0, avg: 63.5, max: 127.0) [2024-03-21 07:34:15,522][03784] Avg episode reward: [(0, '0.638')] [2024-03-21 07:34:20,521][03784] Fps is (10 sec: 39324.9, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1477509120. Throughput: 0: 43739.9. Samples: 1478706600. Policy #0 lag: (min: 7.0, avg: 63.5, max: 127.0) [2024-03-21 07:34:20,522][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 07:34:22,568][04017] Updated weights for policy 0, policy_version 45095 (0.0012) [2024-03-21 07:34:25,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46967.6, 300 sec: 44986.6). Total num frames: 1477836800. Throughput: 0: 44406.8. Samples: 1478981000. Policy #0 lag: (min: 7.0, avg: 63.5, max: 127.0) [2024-03-21 07:34:25,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 07:34:30,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1477967872. Throughput: 0: 44573.3. Samples: 1479135300. 
Policy #0 lag: (min: 7.0, avg: 63.5, max: 127.0) [2024-03-21 07:34:30,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 07:34:30,698][04017] Updated weights for policy 0, policy_version 45105 (0.0011) [2024-03-21 07:34:35,521][03784] Fps is (10 sec: 36044.6, 60 sec: 48605.8, 300 sec: 45653.0). Total num frames: 1478197248. Throughput: 0: 44640.1. Samples: 1479433000. Policy #0 lag: (min: 0.0, avg: 34.2, max: 76.0) [2024-03-21 07:34:35,522][03784] Avg episode reward: [(0, '0.772')] [2024-03-21 07:34:38,632][04017] Updated weights for policy 0, policy_version 45115 (0.0022) [2024-03-21 07:34:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1478328320. Throughput: 0: 45673.4. Samples: 1479718300. Policy #0 lag: (min: 0.0, avg: 34.2, max: 76.0) [2024-03-21 07:34:40,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 07:34:45,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 1478492160. Throughput: 0: 45538.0. Samples: 1479858700. Policy #0 lag: (min: 0.0, avg: 34.2, max: 76.0) [2024-03-21 07:34:45,522][03784] Avg episode reward: [(0, '1.104')] [2024-03-21 07:34:48,784][04017] Updated weights for policy 0, policy_version 45125 (0.0012) [2024-03-21 07:34:50,521][03784] Fps is (10 sec: 42598.6, 60 sec: 47513.7, 300 sec: 45208.7). Total num frames: 1478754304. Throughput: 0: 46213.4. Samples: 1480121600. Policy #0 lag: (min: 0.0, avg: 34.2, max: 76.0) [2024-03-21 07:34:50,522][03784] Avg episode reward: [(0, '1.665')] [2024-03-21 07:34:53,454][04017] Updated weights for policy 0, policy_version 45135 (0.0018) [2024-03-21 07:34:55,521][03784] Fps is (10 sec: 55705.3, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 1479049216. Throughput: 0: 45889.7. Samples: 1480375300. 
Policy #0 lag: (min: 0.0, avg: 34.2, max: 76.0) [2024-03-21 07:34:55,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 07:35:00,521][03784] Fps is (10 sec: 39321.2, 60 sec: 44236.9, 300 sec: 45653.0). Total num frames: 1479147520. Throughput: 0: 46186.5. Samples: 1480515200. Policy #0 lag: (min: 0.0, avg: 34.2, max: 76.0) [2024-03-21 07:35:00,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 07:35:02,622][04017] Updated weights for policy 0, policy_version 45145 (0.0009) [2024-03-21 07:35:04,803][03995] Signal inference workers to stop experience collection... (29800 times) [2024-03-21 07:35:04,804][03995] Signal inference workers to resume experience collection... (29800 times) [2024-03-21 07:35:04,897][04017] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-03-21 07:35:04,897][04017] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-03-21 07:35:05,521][03784] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 45764.1). Total num frames: 1479442432. Throughput: 0: 46275.4. Samples: 1480789000. Policy #0 lag: (min: 2.0, avg: 32.1, max: 92.0) [2024-03-21 07:35:05,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 07:35:09,307][04017] Updated weights for policy 0, policy_version 45155 (0.0027) [2024-03-21 07:35:10,521][03784] Fps is (10 sec: 58983.1, 60 sec: 43691.3, 300 sec: 45319.8). Total num frames: 1479737344. Throughput: 0: 46102.2. Samples: 1481055600. Policy #0 lag: (min: 2.0, avg: 32.1, max: 92.0) [2024-03-21 07:35:10,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 07:35:15,521][03784] Fps is (10 sec: 49153.0, 60 sec: 42052.2, 300 sec: 45208.7). Total num frames: 1479933952. Throughput: 0: 45555.6. Samples: 1481185300. 
Policy #0 lag: (min: 2.0, avg: 32.1, max: 92.0) [2024-03-21 07:35:15,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 07:35:16,126][04017] Updated weights for policy 0, policy_version 45165 (0.0019) [2024-03-21 07:35:20,521][03784] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1480163328. Throughput: 0: 44786.7. Samples: 1481448400. Policy #0 lag: (min: 2.0, avg: 32.1, max: 92.0) [2024-03-21 07:35:20,522][03784] Avg episode reward: [(0, '1.031')] [2024-03-21 07:35:22,206][04017] Updated weights for policy 0, policy_version 45175 (0.0010) [2024-03-21 07:35:25,521][03784] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1480458240. Throughput: 0: 44764.5. Samples: 1481732700. Policy #0 lag: (min: 2.0, avg: 32.1, max: 92.0) [2024-03-21 07:35:25,522][03784] Avg episode reward: [(0, '1.031')] [2024-03-21 07:35:28,633][04017] Updated weights for policy 0, policy_version 45185 (0.0013) [2024-03-21 07:35:30,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1480753152. Throughput: 0: 44695.5. Samples: 1481870000. Policy #0 lag: (min: 3.0, avg: 52.8, max: 100.0) [2024-03-21 07:35:30,522][03784] Avg episode reward: [(0, '1.256')] [2024-03-21 07:35:35,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1480851456. Throughput: 0: 44828.9. Samples: 1482138900. Policy #0 lag: (min: 3.0, avg: 52.8, max: 100.0) [2024-03-21 07:35:35,522][03784] Avg episode reward: [(0, '0.994')] [2024-03-21 07:35:37,064][04017] Updated weights for policy 0, policy_version 45195 (0.0016) [2024-03-21 07:35:40,521][03784] Fps is (10 sec: 49151.7, 60 sec: 48605.8, 300 sec: 45542.0). Total num frames: 1481244672. Throughput: 0: 44908.8. Samples: 1482396200. 
Policy #0 lag: (min: 3.0, avg: 52.8, max: 100.0) [2024-03-21 07:35:40,522][03784] Avg episode reward: [(0, '1.544')] [2024-03-21 07:35:40,665][04017] Updated weights for policy 0, policy_version 45205 (0.0022) [2024-03-21 07:35:45,521][03784] Fps is (10 sec: 58982.1, 60 sec: 49151.9, 300 sec: 45986.3). Total num frames: 1481441280. Throughput: 0: 44595.6. Samples: 1482522000. Policy #0 lag: (min: 3.0, avg: 52.8, max: 100.0) [2024-03-21 07:35:45,522][03784] Avg episode reward: [(0, '0.888')] [2024-03-21 07:35:49,806][04017] Updated weights for policy 0, policy_version 45215 (0.0013) [2024-03-21 07:35:50,521][03784] Fps is (10 sec: 39321.9, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 1481637888. Throughput: 0: 44957.9. Samples: 1482812100. Policy #0 lag: (min: 3.0, avg: 52.8, max: 100.0) [2024-03-21 07:35:50,522][03784] Avg episode reward: [(0, '1.402')] [2024-03-21 07:35:55,521][03784] Fps is (10 sec: 32768.2, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1481768960. Throughput: 0: 45384.4. Samples: 1483097900. Policy #0 lag: (min: 3.0, avg: 52.8, max: 100.0) [2024-03-21 07:35:55,522][03784] Avg episode reward: [(0, '1.223')] [2024-03-21 07:35:56,533][03995] Signal inference workers to stop experience collection... (29850 times) [2024-03-21 07:35:56,533][03995] Signal inference workers to resume experience collection... (29850 times) [2024-03-21 07:35:56,624][04017] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-03-21 07:35:56,624][04017] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-03-21 07:35:59,098][04017] Updated weights for policy 0, policy_version 45225 (0.0010) [2024-03-21 07:36:00,521][03784] Fps is (10 sec: 39321.8, 60 sec: 48059.8, 300 sec: 45430.9). Total num frames: 1482031104. Throughput: 0: 45802.2. Samples: 1483246400. 
Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 07:36:00,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 07:36:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045228_1482031104.pth... [2024-03-21 07:36:00,657][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000044900_1471283200.pth [2024-03-21 07:36:05,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 1482162176. Throughput: 0: 46626.7. Samples: 1483546600. Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 07:36:05,522][03784] Avg episode reward: [(0, '1.363')] [2024-03-21 07:36:10,400][04017] Updated weights for policy 0, policy_version 45235 (0.0016) [2024-03-21 07:36:10,521][03784] Fps is (10 sec: 22937.5, 60 sec: 42052.2, 300 sec: 44542.3). Total num frames: 1482260480. Throughput: 0: 46560.0. Samples: 1483827900. Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 07:36:10,522][03784] Avg episode reward: [(0, '1.059')] [2024-03-21 07:36:15,521][03784] Fps is (10 sec: 36043.9, 60 sec: 43144.4, 300 sec: 44764.4). Total num frames: 1482522624. Throughput: 0: 46022.1. Samples: 1483941000. Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 07:36:15,522][03784] Avg episode reward: [(0, '1.387')] [2024-03-21 07:36:16,097][04017] Updated weights for policy 0, policy_version 45245 (0.0016) [2024-03-21 07:36:20,521][03784] Fps is (10 sec: 62259.3, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1482883072. Throughput: 0: 45602.3. Samples: 1484191000. Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 07:36:20,521][03784] Avg episode reward: [(0, '1.318')] [2024-03-21 07:36:20,596][04017] Updated weights for policy 0, policy_version 45255 (0.0013) [2024-03-21 07:36:25,521][03784] Fps is (10 sec: 68813.7, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1483210752. Throughput: 0: 45815.6. Samples: 1484457900. 
Policy #0 lag: (min: 0.0, avg: 38.6, max: 86.0) [2024-03-21 07:36:25,522][03784] Avg episode reward: [(0, '1.236')] [2024-03-21 07:36:25,715][04017] Updated weights for policy 0, policy_version 45265 (0.0014) [2024-03-21 07:36:30,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 1483472896. Throughput: 0: 46237.8. Samples: 1484602700. Policy #0 lag: (min: 2.0, avg: 48.8, max: 100.0) [2024-03-21 07:36:30,522][03784] Avg episode reward: [(0, '1.043')] [2024-03-21 07:36:31,507][04017] Updated weights for policy 0, policy_version 45275 (0.0031) [2024-03-21 07:36:35,521][03784] Fps is (10 sec: 58983.2, 60 sec: 49152.1, 300 sec: 45764.1). Total num frames: 1483800576. Throughput: 0: 46177.9. Samples: 1484890100. Policy #0 lag: (min: 2.0, avg: 48.8, max: 100.0) [2024-03-21 07:36:35,522][03784] Avg episode reward: [(0, '1.632')] [2024-03-21 07:36:37,690][04017] Updated weights for policy 0, policy_version 45285 (0.0016) [2024-03-21 07:36:40,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1483931648. Throughput: 0: 46148.8. Samples: 1485174600. Policy #0 lag: (min: 2.0, avg: 48.8, max: 100.0) [2024-03-21 07:36:40,522][03784] Avg episode reward: [(0, '0.830')] [2024-03-21 07:36:45,363][04017] Updated weights for policy 0, policy_version 45295 (0.0015) [2024-03-21 07:36:45,521][03784] Fps is (10 sec: 42597.6, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1484226560. Throughput: 0: 45771.0. Samples: 1485306100. Policy #0 lag: (min: 2.0, avg: 48.8, max: 100.0) [2024-03-21 07:36:45,522][03784] Avg episode reward: [(0, '1.201')] [2024-03-21 07:36:49,843][03995] Signal inference workers to stop experience collection... (29900 times) [2024-03-21 07:36:49,844][03995] Signal inference workers to resume experience collection... 
(29900 times) [2024-03-21 07:36:49,916][04017] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-03-21 07:36:49,916][04017] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-03-21 07:36:50,521][03784] Fps is (10 sec: 55705.8, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1484488704. Throughput: 0: 45422.1. Samples: 1485590600. Policy #0 lag: (min: 2.0, avg: 48.8, max: 100.0) [2024-03-21 07:36:50,522][03784] Avg episode reward: [(0, '0.641')] [2024-03-21 07:36:53,319][04017] Updated weights for policy 0, policy_version 45305 (0.0014) [2024-03-21 07:36:55,521][03784] Fps is (10 sec: 39322.0, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1484619776. Throughput: 0: 45775.5. Samples: 1485887800. Policy #0 lag: (min: 2.0, avg: 48.8, max: 100.0) [2024-03-21 07:36:55,522][03784] Avg episode reward: [(0, '1.272')] [2024-03-21 07:37:00,521][03784] Fps is (10 sec: 22937.5, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1484718080. Throughput: 0: 46253.5. Samples: 1486022400. Policy #0 lag: (min: 0.0, avg: 27.9, max: 75.0) [2024-03-21 07:37:00,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 07:37:04,553][04017] Updated weights for policy 0, policy_version 45315 (0.0011) [2024-03-21 07:37:05,521][03784] Fps is (10 sec: 32767.8, 60 sec: 46421.2, 300 sec: 45319.8). Total num frames: 1484947456. Throughput: 0: 46819.9. Samples: 1486297900. Policy #0 lag: (min: 0.0, avg: 27.9, max: 75.0) [2024-03-21 07:37:05,522][03784] Avg episode reward: [(0, '1.203')] [2024-03-21 07:37:09,322][04017] Updated weights for policy 0, policy_version 45325 (0.0012) [2024-03-21 07:37:10,521][03784] Fps is (10 sec: 49151.8, 60 sec: 49151.9, 300 sec: 45653.0). Total num frames: 1485209600. Throughput: 0: 46106.6. Samples: 1486532700. 
Policy #0 lag: (min: 0.0, avg: 27.9, max: 75.0) [2024-03-21 07:37:10,522][03784] Avg episode reward: [(0, '0.586')] [2024-03-21 07:37:15,521][03784] Fps is (10 sec: 39322.3, 60 sec: 46967.7, 300 sec: 45097.7). Total num frames: 1485340672. Throughput: 0: 45720.2. Samples: 1486660100. Policy #0 lag: (min: 0.0, avg: 27.9, max: 75.0) [2024-03-21 07:37:15,521][03784] Avg episode reward: [(0, '0.747')] [2024-03-21 07:37:18,692][04017] Updated weights for policy 0, policy_version 45335 (0.0017) [2024-03-21 07:37:20,521][03784] Fps is (10 sec: 39322.0, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1485602816. Throughput: 0: 45322.2. Samples: 1486929600. Policy #0 lag: (min: 0.0, avg: 27.9, max: 75.0) [2024-03-21 07:37:20,522][03784] Avg episode reward: [(0, '0.734')] [2024-03-21 07:37:25,521][03784] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1485766656. Throughput: 0: 45408.9. Samples: 1487218000. Policy #0 lag: (min: 0.0, avg: 27.9, max: 75.0) [2024-03-21 07:37:25,522][03784] Avg episode reward: [(0, '1.165')] [2024-03-21 07:37:27,079][04017] Updated weights for policy 0, policy_version 45345 (0.0010) [2024-03-21 07:37:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 45653.0). Total num frames: 1486028800. Throughput: 0: 45775.6. Samples: 1487366000. Policy #0 lag: (min: 0.0, avg: 33.8, max: 72.0) [2024-03-21 07:37:30,522][03784] Avg episode reward: [(0, '0.669')] [2024-03-21 07:37:34,798][04017] Updated weights for policy 0, policy_version 45355 (0.0009) [2024-03-21 07:37:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 40413.8, 300 sec: 45319.8). Total num frames: 1486225408. Throughput: 0: 45482.1. Samples: 1487637300. 
Policy #0 lag: (min: 0.0, avg: 33.8, max: 72.0) [2024-03-21 07:37:35,522][03784] Avg episode reward: [(0, '1.114')] [2024-03-21 07:37:39,617][04017] Updated weights for policy 0, policy_version 45365 (0.0017) [2024-03-21 07:37:40,521][03784] Fps is (10 sec: 55706.2, 60 sec: 44236.9, 300 sec: 45319.8). Total num frames: 1486585856. Throughput: 0: 44535.6. Samples: 1487891900. Policy #0 lag: (min: 0.0, avg: 33.8, max: 72.0) [2024-03-21 07:37:40,521][03784] Avg episode reward: [(0, '1.527')] [2024-03-21 07:37:44,558][04017] Updated weights for policy 0, policy_version 45375 (0.0011) [2024-03-21 07:37:45,521][03784] Fps is (10 sec: 68814.0, 60 sec: 44783.1, 300 sec: 45319.9). Total num frames: 1486913536. Throughput: 0: 44631.2. Samples: 1488030800. Policy #0 lag: (min: 0.0, avg: 33.8, max: 72.0) [2024-03-21 07:37:45,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 07:37:46,482][03995] Signal inference workers to stop experience collection... (29950 times) [2024-03-21 07:37:46,487][03995] Signal inference workers to resume experience collection... (29950 times) [2024-03-21 07:37:46,564][04017] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-03-21 07:37:46,564][04017] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-03-21 07:37:50,521][03784] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 45542.0). Total num frames: 1487110144. Throughput: 0: 44962.3. Samples: 1488321200. Policy #0 lag: (min: 0.0, avg: 33.8, max: 72.0) [2024-03-21 07:37:50,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 07:37:51,066][04017] Updated weights for policy 0, policy_version 45385 (0.0011) [2024-03-21 07:37:55,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 1487405056. Throughput: 0: 45809.0. Samples: 1488594100. 
Policy #0 lag: (min: 0.0, avg: 33.8, max: 74.0) [2024-03-21 07:37:55,522][03784] Avg episode reward: [(0, '1.143')] [2024-03-21 07:37:57,156][04017] Updated weights for policy 0, policy_version 45395 (0.0012) [2024-03-21 07:38:00,521][03784] Fps is (10 sec: 65536.4, 60 sec: 50790.5, 300 sec: 45986.3). Total num frames: 1487765504. Throughput: 0: 46086.6. Samples: 1488734000. Policy #0 lag: (min: 0.0, avg: 33.8, max: 74.0) [2024-03-21 07:38:00,522][03784] Avg episode reward: [(0, '1.143')] [2024-03-21 07:38:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045403_1487765504.pth... [2024-03-21 07:38:00,624][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045060_1476526080.pth [2024-03-21 07:38:05,333][04017] Updated weights for policy 0, policy_version 45405 (0.0019) [2024-03-21 07:38:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 48059.8, 300 sec: 45097.7). Total num frames: 1487831040. Throughput: 0: 46451.1. Samples: 1489019900. Policy #0 lag: (min: 0.0, avg: 33.8, max: 74.0) [2024-03-21 07:38:05,522][03784] Avg episode reward: [(0, '1.449')] [2024-03-21 07:38:10,521][03784] Fps is (10 sec: 22937.2, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1487994880. Throughput: 0: 46064.3. Samples: 1489290900. Policy #0 lag: (min: 0.0, avg: 33.8, max: 74.0) [2024-03-21 07:38:10,522][03784] Avg episode reward: [(0, '0.693')] [2024-03-21 07:38:12,995][04017] Updated weights for policy 0, policy_version 45415 (0.0016) [2024-03-21 07:38:15,521][03784] Fps is (10 sec: 39321.6, 60 sec: 48059.7, 300 sec: 45542.0). Total num frames: 1488224256. Throughput: 0: 45864.5. Samples: 1489429900. Policy #0 lag: (min: 0.0, avg: 33.8, max: 74.0) [2024-03-21 07:38:15,522][03784] Avg episode reward: [(0, '0.761')] [2024-03-21 07:38:19,998][04017] Updated weights for policy 0, policy_version 45425 (0.0014) [2024-03-21 07:38:20,521][03784] Fps is (10 sec: 52429.2, 60 sec: 48605.8, 300 sec: 45764.1). 
Total num frames: 1488519168. Throughput: 0: 46055.6. Samples: 1489709800. Policy #0 lag: (min: 0.0, avg: 33.8, max: 74.0) [2024-03-21 07:38:20,522][03784] Avg episode reward: [(0, '0.740')] [2024-03-21 07:38:25,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 1488683008. Throughput: 0: 45646.6. Samples: 1489946000. Policy #0 lag: (min: 1.0, avg: 40.1, max: 87.0) [2024-03-21 07:38:25,522][03784] Avg episode reward: [(0, '0.863')] [2024-03-21 07:38:30,521][03784] Fps is (10 sec: 16384.0, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1488683008. Throughput: 0: 45755.4. Samples: 1490089800. Policy #0 lag: (min: 1.0, avg: 40.1, max: 87.0) [2024-03-21 07:38:30,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 07:38:32,565][04017] Updated weights for policy 0, policy_version 45435 (0.0016) [2024-03-21 07:38:35,521][03784] Fps is (10 sec: 26214.2, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1488945152. Throughput: 0: 45571.1. Samples: 1490371900. Policy #0 lag: (min: 1.0, avg: 40.1, max: 87.0) [2024-03-21 07:38:35,522][03784] Avg episode reward: [(0, '1.863')] [2024-03-21 07:38:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 40959.9, 300 sec: 44764.4). Total num frames: 1489043456. Throughput: 0: 45006.5. Samples: 1490619400. Policy #0 lag: (min: 1.0, avg: 40.1, max: 87.0) [2024-03-21 07:38:40,522][03784] Avg episode reward: [(0, '1.143')] [2024-03-21 07:38:41,567][04017] Updated weights for policy 0, policy_version 45445 (0.0016) [2024-03-21 07:38:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 41506.0, 300 sec: 45764.1). Total num frames: 1489403904. Throughput: 0: 44642.1. Samples: 1490742900. Policy #0 lag: (min: 1.0, avg: 40.1, max: 87.0) [2024-03-21 07:38:45,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 07:38:47,301][03995] Signal inference workers to stop experience collection... 
(30000 times) [2024-03-21 07:38:47,360][04017] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-03-21 07:38:47,363][03995] Signal inference workers to resume experience collection... (30000 times) [2024-03-21 07:38:47,412][04017] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-03-21 07:38:47,420][04017] Updated weights for policy 0, policy_version 45455 (0.0014) [2024-03-21 07:38:50,521][03784] Fps is (10 sec: 58982.9, 60 sec: 42052.3, 300 sec: 45208.7). Total num frames: 1489633280. Throughput: 0: 43895.6. Samples: 1490995200. Policy #0 lag: (min: 1.0, avg: 40.1, max: 87.0) [2024-03-21 07:38:50,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 07:38:53,280][04017] Updated weights for policy 0, policy_version 45465 (0.0011) [2024-03-21 07:38:55,521][03784] Fps is (10 sec: 45875.5, 60 sec: 40960.0, 300 sec: 45319.8). Total num frames: 1489862656. Throughput: 0: 44040.1. Samples: 1491272700. Policy #0 lag: (min: 0.0, avg: 43.1, max: 112.0) [2024-03-21 07:38:55,522][03784] Avg episode reward: [(0, '0.895')] [2024-03-21 07:38:58,064][04017] Updated weights for policy 0, policy_version 45475 (0.0020) [2024-03-21 07:39:00,521][03784] Fps is (10 sec: 65535.4, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 1490288640. Throughput: 0: 43355.5. Samples: 1491380900. Policy #0 lag: (min: 0.0, avg: 43.1, max: 112.0) [2024-03-21 07:39:00,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 07:39:02,230][04017] Updated weights for policy 0, policy_version 45485 (0.0012) [2024-03-21 07:39:05,521][03784] Fps is (10 sec: 75366.3, 60 sec: 46421.3, 300 sec: 45764.3). Total num frames: 1490616320. Throughput: 0: 42982.3. Samples: 1491644000. Policy #0 lag: (min: 0.0, avg: 43.1, max: 112.0) [2024-03-21 07:39:05,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 07:39:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45097.6). Total num frames: 1490714624. Throughput: 0: 44402.1. 
Samples: 1491944100. Policy #0 lag: (min: 0.0, avg: 43.1, max: 112.0) [2024-03-21 07:39:10,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 07:39:11,397][04017] Updated weights for policy 0, policy_version 45495 (0.0019) [2024-03-21 07:39:15,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.7, 300 sec: 45319.8). Total num frames: 1490878464. Throughput: 0: 44346.7. Samples: 1492085400. Policy #0 lag: (min: 0.0, avg: 43.1, max: 112.0) [2024-03-21 07:39:15,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 07:39:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 44875.5). Total num frames: 1491075072. Throughput: 0: 43859.9. Samples: 1492345600. Policy #0 lag: (min: 0.0, avg: 43.1, max: 112.0) [2024-03-21 07:39:20,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 07:39:20,643][04017] Updated weights for policy 0, policy_version 45505 (0.0013) [2024-03-21 07:39:25,521][03784] Fps is (10 sec: 49152.7, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 1491369984. Throughput: 0: 44131.2. Samples: 1492605300. Policy #0 lag: (min: 0.0, avg: 45.0, max: 114.0) [2024-03-21 07:39:25,522][03784] Avg episode reward: [(0, '0.871')] [2024-03-21 07:39:25,956][04017] Updated weights for policy 0, policy_version 45515 (0.0011) [2024-03-21 07:39:30,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48059.8, 300 sec: 45319.8). Total num frames: 1491566592. Throughput: 0: 44609.0. Samples: 1492750300. Policy #0 lag: (min: 0.0, avg: 45.0, max: 114.0) [2024-03-21 07:39:30,522][03784] Avg episode reward: [(0, '1.696')] [2024-03-21 07:39:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.4, 300 sec: 45430.9). Total num frames: 1491730432. Throughput: 0: 45180.0. Samples: 1493028300. 
Policy #0 lag: (min: 0.0, avg: 45.0, max: 114.0) [2024-03-21 07:39:35,522][03784] Avg episode reward: [(0, '0.646')] [2024-03-21 07:39:35,551][04017] Updated weights for policy 0, policy_version 45525 (0.0022) [2024-03-21 07:39:40,521][03784] Fps is (10 sec: 32767.9, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 1491894272. Throughput: 0: 44866.7. Samples: 1493291700. Policy #0 lag: (min: 0.0, avg: 45.0, max: 114.0) [2024-03-21 07:39:40,522][03784] Avg episode reward: [(0, '1.623')] [2024-03-21 07:39:42,319][03995] Signal inference workers to stop experience collection... (30050 times) [2024-03-21 07:39:42,319][03995] Signal inference workers to resume experience collection... (30050 times) [2024-03-21 07:39:42,389][04017] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-03-21 07:39:42,390][04017] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-03-21 07:39:43,347][04017] Updated weights for policy 0, policy_version 45535 (0.0011) [2024-03-21 07:39:45,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1492156416. Throughput: 0: 45295.6. Samples: 1493419200. Policy #0 lag: (min: 0.0, avg: 45.0, max: 114.0) [2024-03-21 07:39:45,522][03784] Avg episode reward: [(0, '1.595')] [2024-03-21 07:39:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 45097.6). Total num frames: 1492353024. Throughput: 0: 45288.8. Samples: 1493682000. Policy #0 lag: (min: 0.0, avg: 45.0, max: 114.0) [2024-03-21 07:39:50,522][03784] Avg episode reward: [(0, '1.593')] [2024-03-21 07:39:51,347][04017] Updated weights for policy 0, policy_version 45545 (0.0019) [2024-03-21 07:39:55,521][03784] Fps is (10 sec: 45875.7, 60 sec: 45875.3, 300 sec: 45653.1). Total num frames: 1492615168. Throughput: 0: 44880.1. Samples: 1493963700. 
Policy #0 lag: (min: 0.0, avg: 37.3, max: 81.0) [2024-03-21 07:39:55,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 07:40:00,426][04017] Updated weights for policy 0, policy_version 45555 (0.0011) [2024-03-21 07:40:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 40959.9, 300 sec: 45097.7). Total num frames: 1492746240. Throughput: 0: 44873.3. Samples: 1494104700. Policy #0 lag: (min: 0.0, avg: 37.3, max: 81.0) [2024-03-21 07:40:00,522][03784] Avg episode reward: [(0, '1.161')] [2024-03-21 07:40:00,767][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045556_1492779008.pth... [2024-03-21 07:40:00,889][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045228_1482031104.pth [2024-03-21 07:40:05,521][03784] Fps is (10 sec: 42598.4, 60 sec: 40413.9, 300 sec: 45097.7). Total num frames: 1493041152. Throughput: 0: 45275.8. Samples: 1494383000. Policy #0 lag: (min: 0.0, avg: 37.3, max: 81.0) [2024-03-21 07:40:05,522][03784] Avg episode reward: [(0, '0.781')] [2024-03-21 07:40:06,363][04017] Updated weights for policy 0, policy_version 45565 (0.0013) [2024-03-21 07:40:10,521][03784] Fps is (10 sec: 55706.4, 60 sec: 43144.6, 300 sec: 45319.8). Total num frames: 1493303296. Throughput: 0: 45686.6. Samples: 1494661200. Policy #0 lag: (min: 0.0, avg: 37.3, max: 81.0) [2024-03-21 07:40:10,522][03784] Avg episode reward: [(0, '1.331')] [2024-03-21 07:40:11,714][04017] Updated weights for policy 0, policy_version 45575 (0.0011) [2024-03-21 07:40:15,521][03784] Fps is (10 sec: 65535.1, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 1493696512. Throughput: 0: 44946.6. Samples: 1494772900. Policy #0 lag: (min: 0.0, avg: 37.3, max: 81.0) [2024-03-21 07:40:15,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 07:40:15,762][04017] Updated weights for policy 0, policy_version 45585 (0.0014) [2024-03-21 07:40:20,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46967.5, 300 sec: 45542.0). 
Total num frames: 1493893120. Throughput: 0: 45131.0. Samples: 1495059200. Policy #0 lag: (min: 0.0, avg: 43.8, max: 80.0) [2024-03-21 07:40:20,522][03784] Avg episode reward: [(0, '0.634')] [2024-03-21 07:40:25,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1493991424. Throughput: 0: 45979.9. Samples: 1495360800. Policy #0 lag: (min: 0.0, avg: 43.8, max: 80.0) [2024-03-21 07:40:25,522][03784] Avg episode reward: [(0, '1.262')] [2024-03-21 07:40:26,401][04017] Updated weights for policy 0, policy_version 45595 (0.0021) [2024-03-21 07:40:30,480][04017] Updated weights for policy 0, policy_version 45605 (0.0011) [2024-03-21 07:40:30,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 1494384640. Throughput: 0: 45922.2. Samples: 1495485700. Policy #0 lag: (min: 0.0, avg: 43.8, max: 80.0) [2024-03-21 07:40:30,522][03784] Avg episode reward: [(0, '1.205')] [2024-03-21 07:40:35,521][03784] Fps is (10 sec: 55706.3, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1494548480. Throughput: 0: 46357.9. Samples: 1495768100. Policy #0 lag: (min: 0.0, avg: 43.8, max: 80.0) [2024-03-21 07:40:35,522][03784] Avg episode reward: [(0, '1.205')] [2024-03-21 07:40:38,365][03995] Signal inference workers to stop experience collection... (30100 times) [2024-03-21 07:40:38,366][03995] Signal inference workers to resume experience collection... (30100 times) [2024-03-21 07:40:38,487][04017] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-03-21 07:40:38,488][04017] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-03-21 07:40:39,438][04017] Updated weights for policy 0, policy_version 45615 (0.0012) [2024-03-21 07:40:40,521][03784] Fps is (10 sec: 42598.8, 60 sec: 48605.9, 300 sec: 45319.8). Total num frames: 1494810624. Throughput: 0: 46300.0. Samples: 1496047200. 
Policy #0 lag: (min: 0.0, avg: 43.8, max: 80.0) [2024-03-21 07:40:40,522][03784] Avg episode reward: [(0, '1.693')] [2024-03-21 07:40:45,079][04017] Updated weights for policy 0, policy_version 45625 (0.0016) [2024-03-21 07:40:45,521][03784] Fps is (10 sec: 49151.7, 60 sec: 48059.8, 300 sec: 45430.9). Total num frames: 1495040000. Throughput: 0: 46153.5. Samples: 1496181600. Policy #0 lag: (min: 0.0, avg: 43.8, max: 80.0) [2024-03-21 07:40:45,522][03784] Avg episode reward: [(0, '1.685')] [2024-03-21 07:40:50,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46967.6, 300 sec: 45430.9). Total num frames: 1495171072. Throughput: 0: 46517.8. Samples: 1496476300. Policy #0 lag: (min: 0.0, avg: 27.2, max: 75.0) [2024-03-21 07:40:50,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 07:40:55,133][04017] Updated weights for policy 0, policy_version 45635 (0.0011) [2024-03-21 07:40:55,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1495367680. Throughput: 0: 47062.3. Samples: 1496779000. Policy #0 lag: (min: 0.0, avg: 27.2, max: 75.0) [2024-03-21 07:40:55,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 07:41:00,521][03784] Fps is (10 sec: 49151.0, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 1495662592. Throughput: 0: 47571.0. Samples: 1496913600. Policy #0 lag: (min: 0.0, avg: 27.2, max: 75.0) [2024-03-21 07:41:00,522][03784] Avg episode reward: [(0, '0.577')] [2024-03-21 07:41:00,846][04017] Updated weights for policy 0, policy_version 45645 (0.0027) [2024-03-21 07:41:05,521][03784] Fps is (10 sec: 49151.5, 60 sec: 46967.4, 300 sec: 46097.3). Total num frames: 1495859200. Throughput: 0: 47755.5. Samples: 1497208200. 
Policy #0 lag: (min: 0.0, avg: 27.2, max: 75.0) [2024-03-21 07:41:05,522][03784] Avg episode reward: [(0, '1.375')] [2024-03-21 07:41:09,634][04017] Updated weights for policy 0, policy_version 45655 (0.0011) [2024-03-21 07:41:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.2, 300 sec: 45986.3). Total num frames: 1496088576. Throughput: 0: 47713.3. Samples: 1497507900. Policy #0 lag: (min: 0.0, avg: 27.2, max: 75.0) [2024-03-21 07:41:10,522][03784] Avg episode reward: [(0, '0.953')] [2024-03-21 07:41:14,665][04017] Updated weights for policy 0, policy_version 45665 (0.0016) [2024-03-21 07:41:15,526][03784] Fps is (10 sec: 55681.3, 60 sec: 45325.8, 300 sec: 45874.5). Total num frames: 1496416256. Throughput: 0: 47919.8. Samples: 1497642300. Policy #0 lag: (min: 0.0, avg: 27.2, max: 75.0) [2024-03-21 07:41:15,526][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 07:41:19,659][04017] Updated weights for policy 0, policy_version 45675 (0.0017) [2024-03-21 07:41:20,521][03784] Fps is (10 sec: 58983.1, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 1496678400. Throughput: 0: 47535.5. Samples: 1497907200. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 07:41:20,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 07:41:25,521][03784] Fps is (10 sec: 39339.1, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1496809472. Throughput: 0: 48006.7. Samples: 1498207500. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 07:41:25,522][03784] Avg episode reward: [(0, '1.482')] [2024-03-21 07:41:28,993][04017] Updated weights for policy 0, policy_version 45685 (0.0013) [2024-03-21 07:41:30,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1497137152. Throughput: 0: 48375.5. Samples: 1498358500. 
Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 07:41:30,522][03784] Avg episode reward: [(0, '0.744')] [2024-03-21 07:41:30,699][03995] Signal inference workers to stop experience collection... (30150 times) [2024-03-21 07:41:30,700][03995] Signal inference workers to resume experience collection... (30150 times) [2024-03-21 07:41:30,797][04017] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-03-21 07:41:30,797][04017] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-03-21 07:41:32,376][04017] Updated weights for policy 0, policy_version 45695 (0.0017) [2024-03-21 07:41:35,521][03784] Fps is (10 sec: 68813.3, 60 sec: 49152.0, 300 sec: 45986.3). Total num frames: 1497497600. Throughput: 0: 47373.4. Samples: 1498608100. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 07:41:35,522][03784] Avg episode reward: [(0, '1.271')] [2024-03-21 07:41:37,612][04017] Updated weights for policy 0, policy_version 45705 (0.0017) [2024-03-21 07:41:40,521][03784] Fps is (10 sec: 68813.5, 60 sec: 50244.2, 300 sec: 46097.4). Total num frames: 1497825280. Throughput: 0: 46708.9. Samples: 1498880900. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 07:41:40,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 07:41:44,823][04017] Updated weights for policy 0, policy_version 45715 (0.0017) [2024-03-21 07:41:45,521][03784] Fps is (10 sec: 52428.0, 60 sec: 49698.1, 300 sec: 45875.2). Total num frames: 1498021888. Throughput: 0: 46604.5. Samples: 1499010800. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 07:41:45,522][03784] Avg episode reward: [(0, '1.518')] [2024-03-21 07:41:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 50244.2, 300 sec: 45986.3). Total num frames: 1498185728. Throughput: 0: 46082.3. Samples: 1499281900. 
Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 07:41:50,522][03784] Avg episode reward: [(0, '0.823')] [2024-03-21 07:41:52,574][04017] Updated weights for policy 0, policy_version 45725 (0.0020) [2024-03-21 07:41:55,521][03784] Fps is (10 sec: 32768.1, 60 sec: 49698.1, 300 sec: 46208.4). Total num frames: 1498349568. Throughput: 0: 45286.8. Samples: 1499545800. Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 07:41:55,522][03784] Avg episode reward: [(0, '1.541')] [2024-03-21 07:42:00,521][03784] Fps is (10 sec: 22937.4, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1498415104. Throughput: 0: 45393.2. Samples: 1499684800. Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 07:42:00,522][03784] Avg episode reward: [(0, '1.636')] [2024-03-21 07:42:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045728_1498415104.pth... [2024-03-21 07:42:00,650][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045403_1487765504.pth [2024-03-21 07:42:05,521][03784] Fps is (10 sec: 16384.0, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 1498513408. Throughput: 0: 45917.8. Samples: 1499973500. Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 07:42:05,522][03784] Avg episode reward: [(0, '1.073')] [2024-03-21 07:42:09,755][04017] Updated weights for policy 0, policy_version 45735 (0.0012) [2024-03-21 07:42:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1498710016. Throughput: 0: 44977.6. Samples: 1500231500. Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 07:42:10,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 07:42:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 42055.3, 300 sec: 45208.7). Total num frames: 1498939392. Throughput: 0: 44413.4. Samples: 1500357100. 
Policy #0 lag: (min: 0.0, avg: 42.2, max: 85.0) [2024-03-21 07:42:15,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 07:42:15,548][04017] Updated weights for policy 0, policy_version 45745 (0.0012) [2024-03-21 07:42:20,270][04017] Updated weights for policy 0, policy_version 45755 (0.0022) [2024-03-21 07:42:20,521][03784] Fps is (10 sec: 58983.3, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 1499299840. Throughput: 0: 44713.3. Samples: 1500620200. Policy #0 lag: (min: 0.0, avg: 23.3, max: 62.0) [2024-03-21 07:42:20,522][03784] Avg episode reward: [(0, '1.558')] [2024-03-21 07:42:25,521][03784] Fps is (10 sec: 55705.7, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1499496448. Throughput: 0: 45117.8. Samples: 1500911200. Policy #0 lag: (min: 0.0, avg: 23.3, max: 62.0) [2024-03-21 07:42:25,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 07:42:25,835][03995] Signal inference workers to stop experience collection... (30200 times) [2024-03-21 07:42:25,866][04017] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-03-21 07:42:25,911][03995] Signal inference workers to resume experience collection... (30200 times) [2024-03-21 07:42:25,913][04017] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-03-21 07:42:26,614][04017] Updated weights for policy 0, policy_version 45765 (0.0014) [2024-03-21 07:42:30,521][03784] Fps is (10 sec: 58981.5, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1499889664. Throughput: 0: 44799.9. Samples: 1501026800. Policy #0 lag: (min: 0.0, avg: 23.3, max: 62.0) [2024-03-21 07:42:30,522][03784] Avg episode reward: [(0, '0.909')] [2024-03-21 07:42:30,919][04017] Updated weights for policy 0, policy_version 45775 (0.0037) [2024-03-21 07:42:35,521][03784] Fps is (10 sec: 72090.4, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 1500217344. Throughput: 0: 44886.8. Samples: 1501301800. 
Policy #0 lag: (min: 0.0, avg: 23.3, max: 62.0) [2024-03-21 07:42:35,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 07:42:36,232][04017] Updated weights for policy 0, policy_version 45785 (0.0012) [2024-03-21 07:42:40,521][03784] Fps is (10 sec: 55706.2, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 1500446720. Throughput: 0: 45046.7. Samples: 1501572900. Policy #0 lag: (min: 0.0, avg: 23.3, max: 62.0) [2024-03-21 07:42:40,522][03784] Avg episode reward: [(0, '1.417')] [2024-03-21 07:42:43,298][04017] Updated weights for policy 0, policy_version 45795 (0.0015) [2024-03-21 07:42:45,521][03784] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 45764.1). Total num frames: 1500610560. Throughput: 0: 44800.1. Samples: 1501700800. Policy #0 lag: (min: 0.0, avg: 23.3, max: 62.0) [2024-03-21 07:42:45,522][03784] Avg episode reward: [(0, '0.953')] [2024-03-21 07:42:50,521][03784] Fps is (10 sec: 32767.9, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1500774400. Throughput: 0: 44911.1. Samples: 1501994500. Policy #0 lag: (min: 0.0, avg: 34.5, max: 75.0) [2024-03-21 07:42:50,522][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 07:42:55,457][04017] Updated weights for policy 0, policy_version 45805 (0.0011) [2024-03-21 07:42:55,521][03784] Fps is (10 sec: 32767.3, 60 sec: 43144.4, 300 sec: 44653.3). Total num frames: 1500938240. Throughput: 0: 45340.0. Samples: 1502271800. Policy #0 lag: (min: 0.0, avg: 34.5, max: 75.0) [2024-03-21 07:42:55,523][03784] Avg episode reward: [(0, '1.008')] [2024-03-21 07:43:00,521][03784] Fps is (10 sec: 26214.7, 60 sec: 43690.8, 300 sec: 44764.4). Total num frames: 1501036544. Throughput: 0: 45653.4. Samples: 1502411500. 
Policy #0 lag: (min: 0.0, avg: 34.5, max: 75.0) [2024-03-21 07:43:00,521][03784] Avg episode reward: [(0, '1.106')] [2024-03-21 07:43:04,505][04017] Updated weights for policy 0, policy_version 45815 (0.0009) [2024-03-21 07:43:05,521][03784] Fps is (10 sec: 39322.3, 60 sec: 46967.5, 300 sec: 45208.8). Total num frames: 1501331456. Throughput: 0: 45560.0. Samples: 1502670400. Policy #0 lag: (min: 0.0, avg: 34.5, max: 75.0) [2024-03-21 07:43:05,522][03784] Avg episode reward: [(0, '1.819')] [2024-03-21 07:43:08,996][04017] Updated weights for policy 0, policy_version 45825 (0.0025) [2024-03-21 07:43:10,521][03784] Fps is (10 sec: 58981.8, 60 sec: 48605.9, 300 sec: 45430.9). Total num frames: 1501626368. Throughput: 0: 44328.9. Samples: 1502906000. Policy #0 lag: (min: 0.0, avg: 34.5, max: 75.0) [2024-03-21 07:43:10,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 07:43:15,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 45097.7). Total num frames: 1501822976. Throughput: 0: 44964.6. Samples: 1503050200. Policy #0 lag: (min: 0.0, avg: 34.5, max: 75.0) [2024-03-21 07:43:15,522][03784] Avg episode reward: [(0, '0.868')] [2024-03-21 07:43:17,057][04017] Updated weights for policy 0, policy_version 45835 (0.0016) [2024-03-21 07:43:19,415][03995] Signal inference workers to stop experience collection... (30250 times) [2024-03-21 07:43:19,472][04017] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-03-21 07:43:19,727][03995] Signal inference workers to resume experience collection... (30250 times) [2024-03-21 07:43:19,727][04017] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-03-21 07:43:20,521][03784] Fps is (10 sec: 52429.1, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 1502150656. Throughput: 0: 44759.9. Samples: 1503316000. 
Policy #0 lag: (min: 0.0, avg: 34.0, max: 83.0) [2024-03-21 07:43:20,522][03784] Avg episode reward: [(0, '1.424')] [2024-03-21 07:43:21,298][04017] Updated weights for policy 0, policy_version 45845 (0.0021) [2024-03-21 07:43:25,521][03784] Fps is (10 sec: 52428.3, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 1502347264. Throughput: 0: 44419.9. Samples: 1503571800. Policy #0 lag: (min: 0.0, avg: 34.0, max: 83.0) [2024-03-21 07:43:25,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 07:43:30,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.8, 300 sec: 45986.3). Total num frames: 1502511104. Throughput: 0: 44351.1. Samples: 1503696600. Policy #0 lag: (min: 0.0, avg: 34.0, max: 83.0) [2024-03-21 07:43:30,522][03784] Avg episode reward: [(0, '0.933')] [2024-03-21 07:43:33,128][04017] Updated weights for policy 0, policy_version 45855 (0.0024) [2024-03-21 07:43:35,521][03784] Fps is (10 sec: 29491.5, 60 sec: 40413.8, 300 sec: 46097.4). Total num frames: 1502642176. Throughput: 0: 44077.9. Samples: 1503978000. Policy #0 lag: (min: 0.0, avg: 34.0, max: 83.0) [2024-03-21 07:43:35,522][03784] Avg episode reward: [(0, '0.933')] [2024-03-21 07:43:40,521][03784] Fps is (10 sec: 26214.1, 60 sec: 38775.4, 300 sec: 45319.8). Total num frames: 1502773248. Throughput: 0: 43497.8. Samples: 1504229200. Policy #0 lag: (min: 0.0, avg: 34.0, max: 83.0) [2024-03-21 07:43:40,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 07:43:41,824][04017] Updated weights for policy 0, policy_version 45865 (0.0021) [2024-03-21 07:43:45,521][03784] Fps is (10 sec: 36044.4, 60 sec: 39867.7, 300 sec: 45319.8). Total num frames: 1503002624. Throughput: 0: 42928.8. Samples: 1504343300. 
Policy #0 lag: (min: 0.0, avg: 34.0, max: 83.0) [2024-03-21 07:43:45,522][03784] Avg episode reward: [(0, '0.761')] [2024-03-21 07:43:49,643][04017] Updated weights for policy 0, policy_version 45875 (0.0009) [2024-03-21 07:43:50,521][03784] Fps is (10 sec: 49152.2, 60 sec: 41506.1, 300 sec: 45430.9). Total num frames: 1503264768. Throughput: 0: 43631.0. Samples: 1504633800. Policy #0 lag: (min: 0.0, avg: 35.0, max: 78.0) [2024-03-21 07:43:50,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 07:43:54,819][04017] Updated weights for policy 0, policy_version 45885 (0.0012) [2024-03-21 07:43:55,521][03784] Fps is (10 sec: 55706.3, 60 sec: 43690.8, 300 sec: 44986.6). Total num frames: 1503559680. Throughput: 0: 44513.4. Samples: 1504909100. Policy #0 lag: (min: 0.0, avg: 35.0, max: 78.0) [2024-03-21 07:43:55,522][03784] Avg episode reward: [(0, '1.665')] [2024-03-21 07:44:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1503657984. Throughput: 0: 44762.1. Samples: 1505064500. Policy #0 lag: (min: 0.0, avg: 35.0, max: 78.0) [2024-03-21 07:44:00,522][03784] Avg episode reward: [(0, '1.665')] [2024-03-21 07:44:00,847][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045889_1503690752.pth... [2024-03-21 07:44:00,983][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045556_1492779008.pth [2024-03-21 07:44:02,889][04017] Updated weights for policy 0, policy_version 45895 (0.0018) [2024-03-21 07:44:05,521][03784] Fps is (10 sec: 55705.9, 60 sec: 46421.4, 300 sec: 45430.9). Total num frames: 1504116736. Throughput: 0: 44809.0. Samples: 1505332400. Policy #0 lag: (min: 0.0, avg: 35.0, max: 78.0) [2024-03-21 07:44:05,521][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 07:44:06,154][04017] Updated weights for policy 0, policy_version 45905 (0.0013) [2024-03-21 07:44:10,521][03784] Fps is (10 sec: 72090.7, 60 sec: 45875.3, 300 sec: 45764.1). 
Total num frames: 1504378880. Throughput: 0: 44920.1. Samples: 1505593200. Policy #0 lag: (min: 0.0, avg: 35.0, max: 78.0) [2024-03-21 07:44:10,522][03784] Avg episode reward: [(0, '1.187')] [2024-03-21 07:44:11,139][03995] Signal inference workers to stop experience collection... (30300 times) [2024-03-21 07:44:11,139][03995] Signal inference workers to resume experience collection... (30300 times) [2024-03-21 07:44:11,213][04017] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-03-21 07:44:11,214][04017] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-03-21 07:44:13,740][04017] Updated weights for policy 0, policy_version 45915 (0.0015) [2024-03-21 07:44:15,521][03784] Fps is (10 sec: 49151.1, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 1504608256. Throughput: 0: 45215.5. Samples: 1505731300. Policy #0 lag: (min: 0.0, avg: 35.1, max: 74.0) [2024-03-21 07:44:15,522][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 07:44:18,141][04017] Updated weights for policy 0, policy_version 45925 (0.0022) [2024-03-21 07:44:20,521][03784] Fps is (10 sec: 55705.1, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1504935936. Throughput: 0: 44544.4. Samples: 1505982500. Policy #0 lag: (min: 0.0, avg: 35.1, max: 74.0) [2024-03-21 07:44:20,522][03784] Avg episode reward: [(0, '1.484')] [2024-03-21 07:44:25,521][03784] Fps is (10 sec: 55706.3, 60 sec: 46967.6, 300 sec: 46097.4). Total num frames: 1505165312. Throughput: 0: 45040.1. Samples: 1506256000. Policy #0 lag: (min: 0.0, avg: 35.1, max: 74.0) [2024-03-21 07:44:25,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 07:44:28,460][04017] Updated weights for policy 0, policy_version 45935 (0.0015) [2024-03-21 07:44:30,521][03784] Fps is (10 sec: 26214.5, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1505198080. Throughput: 0: 45937.8. Samples: 1506410500. 
Policy #0 lag: (min: 0.0, avg: 35.1, max: 74.0) [2024-03-21 07:44:30,522][03784] Avg episode reward: [(0, '0.917')] [2024-03-21 07:44:35,521][03784] Fps is (10 sec: 13107.2, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1505296384. Throughput: 0: 46215.6. Samples: 1506713500. Policy #0 lag: (min: 0.0, avg: 35.1, max: 74.0) [2024-03-21 07:44:35,522][03784] Avg episode reward: [(0, '1.753')] [2024-03-21 07:44:38,976][04017] Updated weights for policy 0, policy_version 45945 (0.0020) [2024-03-21 07:44:40,521][03784] Fps is (10 sec: 36044.0, 60 sec: 46421.2, 300 sec: 45430.9). Total num frames: 1505558528. Throughput: 0: 46217.5. Samples: 1506988900. Policy #0 lag: (min: 0.0, avg: 35.1, max: 74.0) [2024-03-21 07:44:40,523][03784] Avg episode reward: [(0, '0.614')] [2024-03-21 07:44:45,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1505722368. Throughput: 0: 46084.5. Samples: 1507138300. Policy #0 lag: (min: 0.0, avg: 35.1, max: 74.0) [2024-03-21 07:44:45,522][03784] Avg episode reward: [(0, '0.614')] [2024-03-21 07:44:47,172][04017] Updated weights for policy 0, policy_version 45955 (0.0020) [2024-03-21 07:44:50,521][03784] Fps is (10 sec: 49152.9, 60 sec: 46421.3, 300 sec: 45541.9). Total num frames: 1506050048. Throughput: 0: 45897.6. Samples: 1507397800. Policy #0 lag: (min: 0.0, avg: 27.4, max: 72.0) [2024-03-21 07:44:50,522][03784] Avg episode reward: [(0, '1.103')] [2024-03-21 07:44:54,110][04017] Updated weights for policy 0, policy_version 45965 (0.0016) [2024-03-21 07:44:55,521][03784] Fps is (10 sec: 52428.8, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1506246656. Throughput: 0: 46226.6. Samples: 1507673400. 
Policy #0 lag: (min: 0.0, avg: 27.4, max: 72.0) [2024-03-21 07:44:55,522][03784] Avg episode reward: [(0, '0.763')] [2024-03-21 07:44:59,153][04017] Updated weights for policy 0, policy_version 45975 (0.0012) [2024-03-21 07:45:00,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 1506574336. Throughput: 0: 46124.4. Samples: 1507806900. Policy #0 lag: (min: 0.0, avg: 27.4, max: 72.0) [2024-03-21 07:45:00,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 07:45:05,017][04017] Updated weights for policy 0, policy_version 45985 (0.0011) [2024-03-21 07:45:05,521][03784] Fps is (10 sec: 62260.3, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 1506869248. Throughput: 0: 46609.1. Samples: 1508079900. Policy #0 lag: (min: 0.0, avg: 27.4, max: 72.0) [2024-03-21 07:45:05,522][03784] Avg episode reward: [(0, '1.139')] [2024-03-21 07:45:09,459][03995] Signal inference workers to stop experience collection... (30350 times) [2024-03-21 07:45:09,574][04017] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-03-21 07:45:09,673][03995] Signal inference workers to resume experience collection... (30350 times) [2024-03-21 07:45:09,673][04017] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-03-21 07:45:10,521][03784] Fps is (10 sec: 52429.4, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1507098624. Throughput: 0: 46460.0. Samples: 1508346700. Policy #0 lag: (min: 0.0, avg: 27.4, max: 72.0) [2024-03-21 07:45:10,522][03784] Avg episode reward: [(0, '0.969')] [2024-03-21 07:45:11,017][04017] Updated weights for policy 0, policy_version 45995 (0.0017) [2024-03-21 07:45:15,521][03784] Fps is (10 sec: 52428.2, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 1507393536. Throughput: 0: 46000.0. Samples: 1508480500. 
Policy #0 lag: (min: 0.0, avg: 42.3, max: 76.0) [2024-03-21 07:45:15,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 07:45:17,544][04017] Updated weights for policy 0, policy_version 46005 (0.0013) [2024-03-21 07:45:20,521][03784] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 1507557376. Throughput: 0: 45657.8. Samples: 1508768100. Policy #0 lag: (min: 0.0, avg: 42.3, max: 76.0) [2024-03-21 07:45:20,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 07:45:25,265][04017] Updated weights for policy 0, policy_version 46015 (0.0015) [2024-03-21 07:45:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1507819520. Throughput: 0: 45613.6. Samples: 1509041500. Policy #0 lag: (min: 0.0, avg: 42.3, max: 76.0) [2024-03-21 07:45:25,522][03784] Avg episode reward: [(0, '0.493')] [2024-03-21 07:45:30,521][03784] Fps is (10 sec: 36045.3, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 1507917824. Throughput: 0: 45380.2. Samples: 1509180400. Policy #0 lag: (min: 0.0, avg: 42.3, max: 76.0) [2024-03-21 07:45:30,521][03784] Avg episode reward: [(0, '1.029')] [2024-03-21 07:45:35,043][04017] Updated weights for policy 0, policy_version 46025 (0.0010) [2024-03-21 07:45:35,521][03784] Fps is (10 sec: 32768.1, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1508147200. Throughput: 0: 45922.3. Samples: 1509464300. Policy #0 lag: (min: 0.0, avg: 42.3, max: 76.0) [2024-03-21 07:45:35,522][03784] Avg episode reward: [(0, '1.738')] [2024-03-21 07:45:40,295][04017] Updated weights for policy 0, policy_version 46035 (0.0012) [2024-03-21 07:45:40,521][03784] Fps is (10 sec: 55704.0, 60 sec: 48606.0, 300 sec: 45542.0). Total num frames: 1508474880. Throughput: 0: 45504.4. Samples: 1509721100. 
Policy #0 lag: (min: 0.0, avg: 42.3, max: 76.0) [2024-03-21 07:45:40,522][03784] Avg episode reward: [(0, '1.329')] [2024-03-21 07:45:45,521][03784] Fps is (10 sec: 58982.4, 60 sec: 50244.3, 300 sec: 45986.3). Total num frames: 1508737024. Throughput: 0: 45571.2. Samples: 1509857600. Policy #0 lag: (min: 1.0, avg: 28.5, max: 77.0) [2024-03-21 07:45:45,522][03784] Avg episode reward: [(0, '0.783')] [2024-03-21 07:45:47,495][04017] Updated weights for policy 0, policy_version 46045 (0.0019) [2024-03-21 07:45:50,521][03784] Fps is (10 sec: 36045.2, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 1508835328. Throughput: 0: 45575.4. Samples: 1510130800. Policy #0 lag: (min: 1.0, avg: 28.5, max: 77.0) [2024-03-21 07:45:50,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 07:45:55,521][03784] Fps is (10 sec: 36044.4, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1509097472. Throughput: 0: 46248.8. Samples: 1510427900. Policy #0 lag: (min: 1.0, avg: 28.5, max: 77.0) [2024-03-21 07:45:55,522][03784] Avg episode reward: [(0, '1.579')] [2024-03-21 07:45:55,894][04017] Updated weights for policy 0, policy_version 46055 (0.0011) [2024-03-21 07:46:00,521][03784] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 1509294080. Throughput: 0: 46646.5. Samples: 1510579600. Policy #0 lag: (min: 1.0, avg: 28.5, max: 77.0) [2024-03-21 07:46:00,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 07:46:00,756][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046061_1509326848.pth... [2024-03-21 07:46:00,891][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045728_1498415104.pth [2024-03-21 07:46:05,057][04017] Updated weights for policy 0, policy_version 46065 (0.0012) [2024-03-21 07:46:05,521][03784] Fps is (10 sec: 39321.7, 60 sec: 43690.5, 300 sec: 45430.9). Total num frames: 1509490688. Throughput: 0: 46464.3. Samples: 1510859000. 
Policy #0 lag: (min: 1.0, avg: 28.5, max: 77.0) [2024-03-21 07:46:05,522][03784] Avg episode reward: [(0, '1.141')] [2024-03-21 07:46:05,712][03995] Signal inference workers to stop experience collection... (30400 times) [2024-03-21 07:46:05,793][04017] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-03-21 07:46:05,996][03995] Signal inference workers to resume experience collection... (30400 times) [2024-03-21 07:46:05,996][04017] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-03-21 07:46:09,017][04017] Updated weights for policy 0, policy_version 46075 (0.0022) [2024-03-21 07:46:10,521][03784] Fps is (10 sec: 52429.6, 60 sec: 45329.1, 300 sec: 45431.6). Total num frames: 1509818368. Throughput: 0: 45880.0. Samples: 1511106100. Policy #0 lag: (min: 1.0, avg: 28.5, max: 77.0) [2024-03-21 07:46:10,522][03784] Avg episode reward: [(0, '0.922')] [2024-03-21 07:46:15,521][03784] Fps is (10 sec: 49151.7, 60 sec: 43144.4, 300 sec: 45097.6). Total num frames: 1509982208. Throughput: 0: 46250.8. Samples: 1511261700. Policy #0 lag: (min: 0.0, avg: 42.5, max: 89.0) [2024-03-21 07:46:15,523][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 07:46:17,104][04017] Updated weights for policy 0, policy_version 46085 (0.0020) [2024-03-21 07:46:20,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1510277120. Throughput: 0: 46388.8. Samples: 1511551800. Policy #0 lag: (min: 0.0, avg: 42.5, max: 89.0) [2024-03-21 07:46:20,522][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 07:46:23,710][04017] Updated weights for policy 0, policy_version 46095 (0.0023) [2024-03-21 07:46:25,521][03784] Fps is (10 sec: 55706.2, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1510539264. Throughput: 0: 46562.3. Samples: 1511816400. 
Policy #0 lag: (min: 0.0, avg: 42.5, max: 89.0) [2024-03-21 07:46:25,522][03784] Avg episode reward: [(0, '1.648')] [2024-03-21 07:46:29,381][04017] Updated weights for policy 0, policy_version 46105 (0.0016) [2024-03-21 07:46:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 48605.7, 300 sec: 45208.7). Total num frames: 1510834176. Throughput: 0: 46477.7. Samples: 1511949100. Policy #0 lag: (min: 0.0, avg: 42.5, max: 89.0) [2024-03-21 07:46:30,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 07:46:34,882][04017] Updated weights for policy 0, policy_version 46115 (0.0021) [2024-03-21 07:46:35,521][03784] Fps is (10 sec: 55706.0, 60 sec: 49152.0, 300 sec: 44986.6). Total num frames: 1511096320. Throughput: 0: 46273.4. Samples: 1512213100. Policy #0 lag: (min: 0.0, avg: 42.5, max: 89.0) [2024-03-21 07:46:35,522][03784] Avg episode reward: [(0, '0.631')] [2024-03-21 07:46:40,523][03784] Fps is (10 sec: 52417.1, 60 sec: 48058.0, 300 sec: 45208.4). Total num frames: 1511358464. Throughput: 0: 45693.3. Samples: 1512484200. Policy #0 lag: (min: 0.0, avg: 42.5, max: 89.0) [2024-03-21 07:46:40,524][03784] Avg episode reward: [(0, '0.540')] [2024-03-21 07:46:41,004][04017] Updated weights for policy 0, policy_version 46125 (0.0015) [2024-03-21 07:46:45,521][03784] Fps is (10 sec: 52428.2, 60 sec: 48059.6, 300 sec: 45542.0). Total num frames: 1511620608. Throughput: 0: 45262.3. Samples: 1512616400. Policy #0 lag: (min: 1.0, avg: 50.8, max: 99.0) [2024-03-21 07:46:45,522][03784] Avg episode reward: [(0, '0.540')] [2024-03-21 07:46:50,521][03784] Fps is (10 sec: 29497.7, 60 sec: 46967.4, 300 sec: 45097.6). Total num frames: 1511653376. Throughput: 0: 45440.0. Samples: 1512903800. 
Policy #0 lag: (min: 1.0, avg: 50.8, max: 99.0) [2024-03-21 07:46:50,522][03784] Avg episode reward: [(0, '0.802')] [2024-03-21 07:46:52,531][04017] Updated weights for policy 0, policy_version 46135 (0.0017) [2024-03-21 07:46:55,521][03784] Fps is (10 sec: 19660.8, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1511817216. Throughput: 0: 45864.3. Samples: 1513170000. Policy #0 lag: (min: 1.0, avg: 50.8, max: 99.0) [2024-03-21 07:46:55,522][03784] Avg episode reward: [(0, '1.216')] [2024-03-21 07:47:00,521][03784] Fps is (10 sec: 26214.4, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1511915520. Throughput: 0: 45448.9. Samples: 1513306900. Policy #0 lag: (min: 1.0, avg: 50.8, max: 99.0) [2024-03-21 07:47:00,522][03784] Avg episode reward: [(0, '0.938')] [2024-03-21 07:47:02,522][04017] Updated weights for policy 0, policy_version 46145 (0.0012) [2024-03-21 07:47:02,830][03995] Signal inference workers to stop experience collection... (30450 times) [2024-03-21 07:47:02,898][03995] Signal inference workers to resume experience collection... (30450 times) [2024-03-21 07:47:02,918][04017] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-03-21 07:47:03,000][04017] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-03-21 07:47:05,521][03784] Fps is (10 sec: 45875.4, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1512275968. Throughput: 0: 44977.8. Samples: 1513575800. Policy #0 lag: (min: 1.0, avg: 50.8, max: 99.0) [2024-03-21 07:47:05,522][03784] Avg episode reward: [(0, '1.026')] [2024-03-21 07:47:10,521][03784] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 45542.0). Total num frames: 1512374272. Throughput: 0: 45555.6. Samples: 1513866400. 
Policy #0 lag: (min: 1.0, avg: 50.8, max: 99.0) [2024-03-21 07:47:10,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 07:47:10,554][04017] Updated weights for policy 0, policy_version 46155 (0.0020) [2024-03-21 07:47:14,791][04017] Updated weights for policy 0, policy_version 46165 (0.0017) [2024-03-21 07:47:15,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.6, 300 sec: 45764.1). Total num frames: 1512800256. Throughput: 0: 45637.8. Samples: 1514002800. Policy #0 lag: (min: 1.0, avg: 47.5, max: 110.0) [2024-03-21 07:47:15,522][03784] Avg episode reward: [(0, '1.210')] [2024-03-21 07:47:20,521][03784] Fps is (10 sec: 55705.0, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1512931328. Throughput: 0: 46188.8. Samples: 1514291600. Policy #0 lag: (min: 1.0, avg: 47.5, max: 110.0) [2024-03-21 07:47:20,522][03784] Avg episode reward: [(0, '1.753')] [2024-03-21 07:47:22,252][04017] Updated weights for policy 0, policy_version 46175 (0.0019) [2024-03-21 07:47:25,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1513259008. Throughput: 0: 45953.4. Samples: 1514552000. Policy #0 lag: (min: 1.0, avg: 47.5, max: 110.0) [2024-03-21 07:47:25,522][03784] Avg episode reward: [(0, '1.339')] [2024-03-21 07:47:28,352][04017] Updated weights for policy 0, policy_version 46185 (0.0016) [2024-03-21 07:47:30,521][03784] Fps is (10 sec: 58982.4, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 1513521152. Throughput: 0: 46320.0. Samples: 1514700800. Policy #0 lag: (min: 1.0, avg: 47.5, max: 110.0) [2024-03-21 07:47:30,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 07:47:35,521][03784] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 44986.6). Total num frames: 1513717760. Throughput: 0: 46464.5. Samples: 1514994700. 
Policy #0 lag: (min: 1.0, avg: 47.5, max: 110.0) [2024-03-21 07:47:35,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 07:47:35,539][04017] Updated weights for policy 0, policy_version 46195 (0.0012) [2024-03-21 07:47:40,521][03784] Fps is (10 sec: 49152.3, 60 sec: 44238.5, 300 sec: 45430.9). Total num frames: 1514012672. Throughput: 0: 46340.1. Samples: 1515255300. Policy #0 lag: (min: 1.0, avg: 47.5, max: 110.0) [2024-03-21 07:47:40,522][03784] Avg episode reward: [(0, '0.637')] [2024-03-21 07:47:40,991][04017] Updated weights for policy 0, policy_version 46205 (0.0016) [2024-03-21 07:47:45,521][03784] Fps is (10 sec: 58981.7, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1514307584. Throughput: 0: 46317.7. Samples: 1515391200. Policy #0 lag: (min: 1.0, avg: 41.1, max: 76.0) [2024-03-21 07:47:45,522][03784] Avg episode reward: [(0, '1.005')] [2024-03-21 07:47:46,600][04017] Updated weights for policy 0, policy_version 46215 (0.0012) [2024-03-21 07:47:50,436][03995] Signal inference workers to stop experience collection... (30500 times) [2024-03-21 07:47:50,436][03995] Signal inference workers to resume experience collection... (30500 times) [2024-03-21 07:47:50,512][04017] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-03-21 07:47:50,512][04017] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-03-21 07:47:50,521][03784] Fps is (10 sec: 52428.6, 60 sec: 48059.8, 300 sec: 46097.4). Total num frames: 1514536960. Throughput: 0: 46093.4. Samples: 1515650000. Policy #0 lag: (min: 1.0, avg: 41.1, max: 76.0) [2024-03-21 07:47:50,522][03784] Avg episode reward: [(0, '1.530')] [2024-03-21 07:47:52,933][04017] Updated weights for policy 0, policy_version 46225 (0.0014) [2024-03-21 07:47:55,521][03784] Fps is (10 sec: 42598.7, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 1514733568. Throughput: 0: 45335.5. Samples: 1515906500. 
Policy #0 lag: (min: 1.0, avg: 41.1, max: 76.0) [2024-03-21 07:47:55,522][03784] Avg episode reward: [(0, '1.170')] [2024-03-21 07:48:00,521][03784] Fps is (10 sec: 36045.0, 60 sec: 49698.2, 300 sec: 45986.3). Total num frames: 1514897408. Throughput: 0: 45442.3. Samples: 1516047700. Policy #0 lag: (min: 1.0, avg: 41.1, max: 76.0) [2024-03-21 07:48:00,522][03784] Avg episode reward: [(0, '1.286')] [2024-03-21 07:48:00,532][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046231_1514897408.pth... [2024-03-21 07:48:00,675][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000045889_1503690752.pth [2024-03-21 07:48:04,755][04017] Updated weights for policy 0, policy_version 46235 (0.0030) [2024-03-21 07:48:05,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1515094016. Throughput: 0: 45288.9. Samples: 1516329600. Policy #0 lag: (min: 1.0, avg: 41.1, max: 76.0) [2024-03-21 07:48:05,522][03784] Avg episode reward: [(0, '1.461')] [2024-03-21 07:48:09,564][04017] Updated weights for policy 0, policy_version 46245 (0.0011) [2024-03-21 07:48:10,521][03784] Fps is (10 sec: 45874.8, 60 sec: 49698.1, 300 sec: 45875.2). Total num frames: 1515356160. Throughput: 0: 44675.6. Samples: 1516562400. Policy #0 lag: (min: 1.0, avg: 41.1, max: 76.0) [2024-03-21 07:48:10,522][03784] Avg episode reward: [(0, '1.221')] [2024-03-21 07:48:15,521][03784] Fps is (10 sec: 42598.9, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1515520000. Throughput: 0: 44486.8. Samples: 1516702700. Policy #0 lag: (min: 0.0, avg: 35.9, max: 77.0) [2024-03-21 07:48:15,522][03784] Avg episode reward: [(0, '1.463')] [2024-03-21 07:48:20,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1515651072. Throughput: 0: 44111.0. Samples: 1516979700. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 77.0) [2024-03-21 07:48:20,522][03784] Avg episode reward: [(0, '1.463')] [2024-03-21 07:48:22,430][04017] Updated weights for policy 0, policy_version 46255 (0.0025) [2024-03-21 07:48:25,521][03784] Fps is (10 sec: 29490.6, 60 sec: 42598.3, 300 sec: 45097.6). Total num frames: 1515814912. Throughput: 0: 44435.4. Samples: 1517254900. Policy #0 lag: (min: 0.0, avg: 35.9, max: 77.0) [2024-03-21 07:48:25,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 07:48:27,897][04017] Updated weights for policy 0, policy_version 46265 (0.0016) [2024-03-21 07:48:30,521][03784] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 45542.0). Total num frames: 1516077056. Throughput: 0: 44177.8. Samples: 1517379200. Policy #0 lag: (min: 0.0, avg: 35.9, max: 77.0) [2024-03-21 07:48:30,522][03784] Avg episode reward: [(0, '1.206')] [2024-03-21 07:48:35,521][03784] Fps is (10 sec: 45876.8, 60 sec: 42598.5, 300 sec: 45764.2). Total num frames: 1516273664. Throughput: 0: 44264.7. Samples: 1517641900. Policy #0 lag: (min: 0.0, avg: 35.9, max: 77.0) [2024-03-21 07:48:35,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 07:48:37,073][04017] Updated weights for policy 0, policy_version 46275 (0.0015) [2024-03-21 07:48:40,521][03784] Fps is (10 sec: 49151.9, 60 sec: 42598.3, 300 sec: 45986.3). Total num frames: 1516568576. Throughput: 0: 44653.3. Samples: 1517915900. Policy #0 lag: (min: 2.0, avg: 39.7, max: 120.0) [2024-03-21 07:48:40,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 07:48:42,973][04017] Updated weights for policy 0, policy_version 46285 (0.0012) [2024-03-21 07:48:45,521][03784] Fps is (10 sec: 58981.3, 60 sec: 42598.5, 300 sec: 46097.4). Total num frames: 1516863488. Throughput: 0: 44797.8. Samples: 1518063600. 
Policy #0 lag: (min: 2.0, avg: 39.7, max: 120.0) [2024-03-21 07:48:45,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 07:48:47,007][04017] Updated weights for policy 0, policy_version 46295 (0.0012) [2024-03-21 07:48:49,453][03995] Signal inference workers to stop experience collection... (30550 times) [2024-03-21 07:48:49,526][03995] Signal inference workers to resume experience collection... (30550 times) [2024-03-21 07:48:49,567][04017] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-03-21 07:48:49,604][04017] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-03-21 07:48:50,521][03784] Fps is (10 sec: 62259.8, 60 sec: 44236.8, 300 sec: 46208.4). Total num frames: 1517191168. Throughput: 0: 44346.8. Samples: 1518325200. Policy #0 lag: (min: 2.0, avg: 39.7, max: 120.0) [2024-03-21 07:48:50,522][03784] Avg episode reward: [(0, '0.728')] [2024-03-21 07:48:52,376][04017] Updated weights for policy 0, policy_version 46305 (0.0019) [2024-03-21 07:48:55,521][03784] Fps is (10 sec: 52428.6, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 1517387776. Throughput: 0: 45353.4. Samples: 1518603300. Policy #0 lag: (min: 2.0, avg: 39.7, max: 120.0) [2024-03-21 07:48:55,522][03784] Avg episode reward: [(0, '0.890')] [2024-03-21 07:49:00,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1517551616. Throughput: 0: 45488.8. Samples: 1518749700. Policy #0 lag: (min: 2.0, avg: 39.7, max: 120.0) [2024-03-21 07:49:00,522][03784] Avg episode reward: [(0, '0.890')] [2024-03-21 07:49:01,553][04017] Updated weights for policy 0, policy_version 46315 (0.0011) [2024-03-21 07:49:05,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.9, 300 sec: 45319.8). Total num frames: 1517748224. Throughput: 0: 45295.6. Samples: 1519018000. 
Policy #0 lag: (min: 2.0, avg: 39.7, max: 120.0) [2024-03-21 07:49:05,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 07:49:09,652][04017] Updated weights for policy 0, policy_version 46325 (0.0020) [2024-03-21 07:49:10,521][03784] Fps is (10 sec: 49151.7, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1518043136. Throughput: 0: 45053.4. Samples: 1519282300. Policy #0 lag: (min: 0.0, avg: 45.8, max: 116.0) [2024-03-21 07:49:10,522][03784] Avg episode reward: [(0, '0.914')] [2024-03-21 07:49:15,334][04017] Updated weights for policy 0, policy_version 46335 (0.0014) [2024-03-21 07:49:15,521][03784] Fps is (10 sec: 55705.5, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1518305280. Throughput: 0: 45111.2. Samples: 1519409200. Policy #0 lag: (min: 0.0, avg: 45.8, max: 116.0) [2024-03-21 07:49:15,522][03784] Avg episode reward: [(0, '1.655')] [2024-03-21 07:49:20,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1518534656. Throughput: 0: 44813.0. Samples: 1519658500. Policy #0 lag: (min: 0.0, avg: 45.8, max: 116.0) [2024-03-21 07:49:20,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 07:49:25,521][03784] Fps is (10 sec: 22937.7, 60 sec: 45329.2, 300 sec: 45208.7). Total num frames: 1518534656. Throughput: 0: 45609.0. Samples: 1519968300. Policy #0 lag: (min: 0.0, avg: 45.8, max: 116.0) [2024-03-21 07:49:25,522][03784] Avg episode reward: [(0, '1.000')] [2024-03-21 07:49:27,581][04017] Updated weights for policy 0, policy_version 46345 (0.0011) [2024-03-21 07:49:30,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46967.5, 300 sec: 46097.3). Total num frames: 1518895104. Throughput: 0: 45351.0. Samples: 1520104400. 
Policy #0 lag: (min: 0.0, avg: 45.8, max: 116.0) [2024-03-21 07:49:30,522][03784] Avg episode reward: [(0, '1.189')] [2024-03-21 07:49:31,109][04017] Updated weights for policy 0, policy_version 46355 (0.0013) [2024-03-21 07:49:35,521][03784] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 45653.1). Total num frames: 1519026176. Throughput: 0: 45724.4. Samples: 1520382800. Policy #0 lag: (min: 0.0, avg: 45.8, max: 116.0) [2024-03-21 07:49:35,522][03784] Avg episode reward: [(0, '0.862')] [2024-03-21 07:49:40,457][04017] Updated weights for policy 0, policy_version 46365 (0.0015) [2024-03-21 07:49:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1519288320. Throughput: 0: 45817.8. Samples: 1520665100. Policy #0 lag: (min: 1.0, avg: 31.9, max: 72.0) [2024-03-21 07:49:40,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-21 07:49:45,521][03784] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 1519484928. Throughput: 0: 45560.0. Samples: 1520799900. Policy #0 lag: (min: 1.0, avg: 31.9, max: 72.0) [2024-03-21 07:49:45,522][03784] Avg episode reward: [(0, '1.450')] [2024-03-21 07:49:48,103][04017] Updated weights for policy 0, policy_version 46375 (0.0010) [2024-03-21 07:49:50,209][03995] Signal inference workers to stop experience collection... (30600 times) [2024-03-21 07:49:50,271][04017] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-03-21 07:49:50,509][03995] Signal inference workers to resume experience collection... (30600 times) [2024-03-21 07:49:50,509][04017] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-03-21 07:49:50,521][03784] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 45542.0). Total num frames: 1519681536. Throughput: 0: 45795.5. Samples: 1521078800. 
Policy #0 lag: (min: 1.0, avg: 31.9, max: 72.0) [2024-03-21 07:49:50,522][03784] Avg episode reward: [(0, '0.919')] [2024-03-21 07:49:55,100][04017] Updated weights for policy 0, policy_version 46385 (0.0014) [2024-03-21 07:49:55,521][03784] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 1519943680. Throughput: 0: 46120.1. Samples: 1521357700. Policy #0 lag: (min: 1.0, avg: 31.9, max: 72.0) [2024-03-21 07:49:55,522][03784] Avg episode reward: [(0, '1.626')] [2024-03-21 07:49:59,202][04017] Updated weights for policy 0, policy_version 46395 (0.0022) [2024-03-21 07:50:00,521][03784] Fps is (10 sec: 65535.9, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1520336896. Throughput: 0: 46175.5. Samples: 1521487100. Policy #0 lag: (min: 1.0, avg: 31.9, max: 72.0) [2024-03-21 07:50:00,522][03784] Avg episode reward: [(0, '1.547')] [2024-03-21 07:50:00,560][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046398_1520369664.pth... [2024-03-21 07:50:00,679][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046061_1509326848.pth [2024-03-21 07:50:05,143][04017] Updated weights for policy 0, policy_version 46405 (0.0016) [2024-03-21 07:50:05,521][03784] Fps is (10 sec: 68813.6, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 1520631808. Throughput: 0: 46858.0. Samples: 1521767100. Policy #0 lag: (min: 1.0, avg: 31.9, max: 72.0) [2024-03-21 07:50:05,522][03784] Avg episode reward: [(0, '1.477')] [2024-03-21 07:50:10,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 1520893952. Throughput: 0: 45928.8. Samples: 1522035100. Policy #0 lag: (min: 0.0, avg: 34.7, max: 95.0) [2024-03-21 07:50:10,522][03784] Avg episode reward: [(0, '1.491')] [2024-03-21 07:50:13,745][04017] Updated weights for policy 0, policy_version 46415 (0.0009) [2024-03-21 07:50:15,521][03784] Fps is (10 sec: 45874.6, 60 sec: 46421.3, 300 sec: 45875.2). 
Total num frames: 1521090560. Throughput: 0: 46402.3. Samples: 1522192500. Policy #0 lag: (min: 0.0, avg: 34.7, max: 95.0) [2024-03-21 07:50:15,522][03784] Avg episode reward: [(0, '0.726')] [2024-03-21 07:50:20,109][04017] Updated weights for policy 0, policy_version 46425 (0.0013) [2024-03-21 07:50:20,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 1521254400. Throughput: 0: 46328.9. Samples: 1522467600. Policy #0 lag: (min: 0.0, avg: 34.7, max: 95.0) [2024-03-21 07:50:20,522][03784] Avg episode reward: [(0, '1.667')] [2024-03-21 07:50:25,521][03784] Fps is (10 sec: 32767.6, 60 sec: 48059.6, 300 sec: 45764.1). Total num frames: 1521418240. Throughput: 0: 46026.6. Samples: 1522736300. Policy #0 lag: (min: 0.0, avg: 34.7, max: 95.0) [2024-03-21 07:50:25,522][03784] Avg episode reward: [(0, '1.003')] [2024-03-21 07:50:28,147][04017] Updated weights for policy 0, policy_version 46435 (0.0016) [2024-03-21 07:50:30,521][03784] Fps is (10 sec: 49152.1, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 1521745920. Throughput: 0: 46053.3. Samples: 1522872300. Policy #0 lag: (min: 0.0, avg: 34.7, max: 95.0) [2024-03-21 07:50:30,522][03784] Avg episode reward: [(0, '1.545')] [2024-03-21 07:50:34,765][04017] Updated weights for policy 0, policy_version 46445 (0.0012) [2024-03-21 07:50:35,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48059.8, 300 sec: 45542.0). Total num frames: 1521909760. Throughput: 0: 46002.3. Samples: 1523148900. Policy #0 lag: (min: 0.0, avg: 34.7, max: 95.0) [2024-03-21 07:50:35,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 07:50:38,587][03995] Signal inference workers to stop experience collection... (30650 times) [2024-03-21 07:50:38,587][03995] Signal inference workers to resume experience collection... 
(30650 times) [2024-03-21 07:50:38,651][04017] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-03-21 07:50:38,652][04017] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-03-21 07:50:40,521][03784] Fps is (10 sec: 42598.2, 60 sec: 48059.7, 300 sec: 45542.0). Total num frames: 1522171904. Throughput: 0: 46140.0. Samples: 1523434000. Policy #0 lag: (min: 1.0, avg: 39.6, max: 81.0) [2024-03-21 07:50:40,522][03784] Avg episode reward: [(0, '1.130')] [2024-03-21 07:50:45,286][04017] Updated weights for policy 0, policy_version 46455 (0.0010) [2024-03-21 07:50:45,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45875.1, 300 sec: 45430.9). Total num frames: 1522237440. Throughput: 0: 46644.4. Samples: 1523586100. Policy #0 lag: (min: 1.0, avg: 39.6, max: 81.0) [2024-03-21 07:50:45,522][03784] Avg episode reward: [(0, '1.311')] [2024-03-21 07:50:50,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1522434048. Throughput: 0: 46968.7. Samples: 1523880700. Policy #0 lag: (min: 1.0, avg: 39.6, max: 81.0) [2024-03-21 07:50:50,522][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 07:50:52,731][04017] Updated weights for policy 0, policy_version 46465 (0.0011) [2024-03-21 07:50:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1522663424. Throughput: 0: 47208.8. Samples: 1524159500. Policy #0 lag: (min: 1.0, avg: 39.6, max: 81.0) [2024-03-21 07:50:55,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 07:51:00,412][04017] Updated weights for policy 0, policy_version 46475 (0.0011) [2024-03-21 07:51:00,521][03784] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 1522892800. Throughput: 0: 46733.3. Samples: 1524295500. 
Policy #0 lag: (min: 1.0, avg: 39.6, max: 81.0) [2024-03-21 07:51:00,522][03784] Avg episode reward: [(0, '0.910')] [2024-03-21 07:51:03,922][04017] Updated weights for policy 0, policy_version 46485 (0.0011) [2024-03-21 07:51:05,521][03784] Fps is (10 sec: 68812.8, 60 sec: 45328.9, 300 sec: 45875.2). Total num frames: 1523351552. Throughput: 0: 46806.5. Samples: 1524573900. Policy #0 lag: (min: 1.0, avg: 39.6, max: 81.0) [2024-03-21 07:51:05,522][03784] Avg episode reward: [(0, '0.910')] [2024-03-21 07:51:07,934][04017] Updated weights for policy 0, policy_version 46495 (0.0011) [2024-03-21 07:51:10,521][03784] Fps is (10 sec: 72089.8, 60 sec: 45329.1, 300 sec: 46208.5). Total num frames: 1523613696. Throughput: 0: 47155.6. Samples: 1524858300. Policy #0 lag: (min: 0.0, avg: 39.8, max: 77.0) [2024-03-21 07:51:10,522][03784] Avg episode reward: [(0, '0.910')] [2024-03-21 07:51:15,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 1523810304. Throughput: 0: 46922.2. Samples: 1524983800. Policy #0 lag: (min: 0.0, avg: 39.8, max: 77.0) [2024-03-21 07:51:15,522][03784] Avg episode reward: [(0, '0.808')] [2024-03-21 07:51:17,423][04017] Updated weights for policy 0, policy_version 46505 (0.0017) [2024-03-21 07:51:20,521][03784] Fps is (10 sec: 49151.3, 60 sec: 47513.5, 300 sec: 45986.3). Total num frames: 1524105216. Throughput: 0: 46690.9. Samples: 1525250000. Policy #0 lag: (min: 0.0, avg: 39.8, max: 77.0) [2024-03-21 07:51:20,522][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 07:51:21,623][04017] Updated weights for policy 0, policy_version 46515 (0.0011) [2024-03-21 07:51:25,521][03784] Fps is (10 sec: 52428.6, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 1524334592. Throughput: 0: 46168.9. Samples: 1525511600. 
Policy #0 lag: (min: 0.0, avg: 39.8, max: 77.0) [2024-03-21 07:51:25,522][03784] Avg episode reward: [(0, '1.289')] [2024-03-21 07:51:26,186][03995] Signal inference workers to stop experience collection... (30700 times) [2024-03-21 07:51:26,253][03995] Signal inference workers to resume experience collection... (30700 times) [2024-03-21 07:51:26,261][04017] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-03-21 07:51:26,435][04017] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-03-21 07:51:28,315][04017] Updated weights for policy 0, policy_version 46525 (0.0017) [2024-03-21 07:51:30,521][03784] Fps is (10 sec: 45876.0, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1524563968. Throughput: 0: 45651.2. Samples: 1525640400. Policy #0 lag: (min: 0.0, avg: 39.8, max: 77.0) [2024-03-21 07:51:30,522][03784] Avg episode reward: [(0, '1.242')] [2024-03-21 07:51:35,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46421.3, 300 sec: 45209.1). Total num frames: 1524695040. Throughput: 0: 45393.4. Samples: 1525923400. Policy #0 lag: (min: 0.0, avg: 39.8, max: 77.0) [2024-03-21 07:51:35,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 07:51:39,335][04017] Updated weights for policy 0, policy_version 46535 (0.0012) [2024-03-21 07:51:40,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1524924416. Throughput: 0: 45160.1. Samples: 1526191700. Policy #0 lag: (min: 0.0, avg: 33.7, max: 96.0) [2024-03-21 07:51:40,522][03784] Avg episode reward: [(0, '1.306')] [2024-03-21 07:51:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 45875.2). Total num frames: 1525186560. Throughput: 0: 45286.7. Samples: 1526333400. 
Policy #0 lag: (min: 0.0, avg: 33.7, max: 96.0) [2024-03-21 07:51:45,522][03784] Avg episode reward: [(0, '0.888')] [2024-03-21 07:51:45,525][04017] Updated weights for policy 0, policy_version 46545 (0.0011) [2024-03-21 07:51:50,521][03784] Fps is (10 sec: 29490.4, 60 sec: 46421.2, 300 sec: 45430.9). Total num frames: 1525219328. Throughput: 0: 45675.4. Samples: 1526629300. Policy #0 lag: (min: 0.0, avg: 33.7, max: 96.0) [2024-03-21 07:51:50,522][03784] Avg episode reward: [(0, '1.536')] [2024-03-21 07:51:55,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46967.6, 300 sec: 45986.3). Total num frames: 1525481472. Throughput: 0: 45348.9. Samples: 1526899000. Policy #0 lag: (min: 0.0, avg: 33.7, max: 96.0) [2024-03-21 07:51:55,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 07:51:56,867][04017] Updated weights for policy 0, policy_version 46555 (0.0016) [2024-03-21 07:52:00,521][03784] Fps is (10 sec: 55706.7, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 1525776384. Throughput: 0: 45713.3. Samples: 1527040900. Policy #0 lag: (min: 0.0, avg: 33.7, max: 96.0) [2024-03-21 07:52:00,522][03784] Avg episode reward: [(0, '0.800')] [2024-03-21 07:52:00,558][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046564_1525809152.pth... [2024-03-21 07:52:00,694][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046231_1514897408.pth [2024-03-21 07:52:01,119][04017] Updated weights for policy 0, policy_version 46565 (0.0016) [2024-03-21 07:52:05,521][03784] Fps is (10 sec: 62259.0, 60 sec: 45875.3, 300 sec: 46541.7). Total num frames: 1526104064. Throughput: 0: 45593.5. Samples: 1527301700. Policy #0 lag: (min: 0.0, avg: 33.7, max: 96.0) [2024-03-21 07:52:05,522][03784] Avg episode reward: [(0, '0.594')] [2024-03-21 07:52:06,692][04017] Updated weights for policy 0, policy_version 46575 (0.0011) [2024-03-21 07:52:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 45430.9). 
Total num frames: 1526202368. Throughput: 0: 46480.0. Samples: 1527603200. Policy #0 lag: (min: 1.0, avg: 44.8, max: 90.0) [2024-03-21 07:52:10,522][03784] Avg episode reward: [(0, '1.642')] [2024-03-21 07:52:14,692][04017] Updated weights for policy 0, policy_version 46585 (0.0017) [2024-03-21 07:52:15,521][03784] Fps is (10 sec: 45874.7, 60 sec: 45875.1, 300 sec: 46208.4). Total num frames: 1526562816. Throughput: 0: 46282.1. Samples: 1527723100. Policy #0 lag: (min: 1.0, avg: 44.8, max: 90.0) [2024-03-21 07:52:15,522][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 07:52:20,521][03784] Fps is (10 sec: 55705.8, 60 sec: 44236.9, 300 sec: 45764.1). Total num frames: 1526759424. Throughput: 0: 45848.9. Samples: 1527986600. Policy #0 lag: (min: 1.0, avg: 44.8, max: 90.0) [2024-03-21 07:52:20,522][03784] Avg episode reward: [(0, '1.036')] [2024-03-21 07:52:23,369][04017] Updated weights for policy 0, policy_version 46595 (0.0011) [2024-03-21 07:52:25,521][03784] Fps is (10 sec: 29491.7, 60 sec: 42052.3, 300 sec: 45208.7). Total num frames: 1526857728. Throughput: 0: 46288.9. Samples: 1528274700. Policy #0 lag: (min: 1.0, avg: 44.8, max: 90.0) [2024-03-21 07:52:25,522][03784] Avg episode reward: [(0, '1.542')] [2024-03-21 07:52:26,880][03995] Signal inference workers to stop experience collection... (30750 times) [2024-03-21 07:52:26,947][04017] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-03-21 07:52:27,148][03995] Signal inference workers to resume experience collection... (30750 times) [2024-03-21 07:52:27,148][04017] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-03-21 07:52:28,264][04017] Updated weights for policy 0, policy_version 46605 (0.0021) [2024-03-21 07:52:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 45653.0). Total num frames: 1527185408. Throughput: 0: 45426.6. Samples: 1528377600. 
Policy #0 lag: (min: 1.0, avg: 44.8, max: 90.0) [2024-03-21 07:52:30,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 07:52:35,249][04017] Updated weights for policy 0, policy_version 46615 (0.0014) [2024-03-21 07:52:35,521][03784] Fps is (10 sec: 62258.7, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1527480320. Throughput: 0: 44909.1. Samples: 1528650200. Policy #0 lag: (min: 1.0, avg: 44.8, max: 90.0) [2024-03-21 07:52:35,522][03784] Avg episode reward: [(0, '1.084')] [2024-03-21 07:52:40,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1527578624. Throughput: 0: 45620.0. Samples: 1528951900. Policy #0 lag: (min: 0.0, avg: 40.3, max: 80.0) [2024-03-21 07:52:40,522][03784] Avg episode reward: [(0, '1.513')] [2024-03-21 07:52:44,772][04017] Updated weights for policy 0, policy_version 46625 (0.0020) [2024-03-21 07:52:45,521][03784] Fps is (10 sec: 39320.8, 60 sec: 44782.7, 300 sec: 45208.7). Total num frames: 1527873536. Throughput: 0: 45470.9. Samples: 1529087100. Policy #0 lag: (min: 0.0, avg: 40.3, max: 80.0) [2024-03-21 07:52:45,523][03784] Avg episode reward: [(0, '1.513')] [2024-03-21 07:52:48,220][04017] Updated weights for policy 0, policy_version 46635 (0.0025) [2024-03-21 07:52:50,521][03784] Fps is (10 sec: 58982.5, 60 sec: 49152.2, 300 sec: 45542.0). Total num frames: 1528168448. Throughput: 0: 45537.8. Samples: 1529350900. Policy #0 lag: (min: 0.0, avg: 40.3, max: 80.0) [2024-03-21 07:52:50,522][03784] Avg episode reward: [(0, '1.010')] [2024-03-21 07:52:55,521][03784] Fps is (10 sec: 49153.3, 60 sec: 48059.7, 300 sec: 45653.0). Total num frames: 1528365056. Throughput: 0: 44517.8. Samples: 1529606500. 
Policy #0 lag: (min: 0.0, avg: 40.3, max: 80.0) [2024-03-21 07:52:55,522][03784] Avg episode reward: [(0, '0.513')] [2024-03-21 07:52:58,621][04017] Updated weights for policy 0, policy_version 46645 (0.0017) [2024-03-21 07:53:00,521][03784] Fps is (10 sec: 39322.1, 60 sec: 46421.5, 300 sec: 45653.1). Total num frames: 1528561664. Throughput: 0: 44862.5. Samples: 1529741900. Policy #0 lag: (min: 0.0, avg: 40.3, max: 80.0) [2024-03-21 07:53:00,521][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 07:53:04,374][04017] Updated weights for policy 0, policy_version 46655 (0.0016) [2024-03-21 07:53:05,521][03784] Fps is (10 sec: 45875.0, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1528823808. Throughput: 0: 45042.2. Samples: 1530013500. Policy #0 lag: (min: 0.0, avg: 40.3, max: 80.0) [2024-03-21 07:53:05,522][03784] Avg episode reward: [(0, '1.144')] [2024-03-21 07:53:10,521][03784] Fps is (10 sec: 39320.7, 60 sec: 45875.2, 300 sec: 45541.9). Total num frames: 1528954880. Throughput: 0: 44191.0. Samples: 1530263300. Policy #0 lag: (min: 0.0, avg: 30.3, max: 64.0) [2024-03-21 07:53:10,522][03784] Avg episode reward: [(0, '0.844')] [2024-03-21 07:53:14,085][04017] Updated weights for policy 0, policy_version 46665 (0.0014) [2024-03-21 07:53:15,521][03784] Fps is (10 sec: 29491.6, 60 sec: 42598.5, 300 sec: 45653.1). Total num frames: 1529118720. Throughput: 0: 44771.2. Samples: 1530392300. Policy #0 lag: (min: 0.0, avg: 30.3, max: 64.0) [2024-03-21 07:53:15,521][03784] Avg episode reward: [(0, '1.296')] [2024-03-21 07:53:19,747][04017] Updated weights for policy 0, policy_version 46675 (0.0011) [2024-03-21 07:53:20,521][03784] Fps is (10 sec: 52429.5, 60 sec: 45329.1, 300 sec: 46319.5). Total num frames: 1529479168. Throughput: 0: 44622.3. Samples: 1530658200. 
Policy #0 lag: (min: 0.0, avg: 30.3, max: 64.0) [2024-03-21 07:53:20,522][03784] Avg episode reward: [(0, '1.334')] [2024-03-21 07:53:22,037][03995] Signal inference workers to stop experience collection... (30800 times) [2024-03-21 07:53:22,103][03995] Signal inference workers to resume experience collection... (30800 times) [2024-03-21 07:53:22,109][04017] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-03-21 07:53:22,161][04017] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-03-21 07:53:25,521][03784] Fps is (10 sec: 55704.4, 60 sec: 46967.3, 300 sec: 46097.3). Total num frames: 1529675776. Throughput: 0: 44015.4. Samples: 1530932600. Policy #0 lag: (min: 0.0, avg: 30.3, max: 64.0) [2024-03-21 07:53:25,522][03784] Avg episode reward: [(0, '0.941')] [2024-03-21 07:53:30,457][04017] Updated weights for policy 0, policy_version 46685 (0.0011) [2024-03-21 07:53:30,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43144.6, 300 sec: 45764.1). Total num frames: 1529774080. Throughput: 0: 44295.8. Samples: 1531080400. Policy #0 lag: (min: 0.0, avg: 30.3, max: 64.0) [2024-03-21 07:53:30,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 07:53:35,521][03784] Fps is (10 sec: 36045.3, 60 sec: 42598.4, 300 sec: 45653.1). Total num frames: 1530036224. Throughput: 0: 45028.9. Samples: 1531377200. Policy #0 lag: (min: 0.0, avg: 30.3, max: 64.0) [2024-03-21 07:53:35,522][03784] Avg episode reward: [(0, '1.217')] [2024-03-21 07:53:36,142][04017] Updated weights for policy 0, policy_version 46695 (0.0011) [2024-03-21 07:53:40,521][03784] Fps is (10 sec: 58981.4, 60 sec: 46421.2, 300 sec: 45764.1). Total num frames: 1530363904. Throughput: 0: 44839.8. Samples: 1531624300. 
Policy #0 lag: (min: 3.0, avg: 34.8, max: 76.0) [2024-03-21 07:53:40,522][03784] Avg episode reward: [(0, '1.184')] [2024-03-21 07:53:41,645][04017] Updated weights for policy 0, policy_version 46705 (0.0015) [2024-03-21 07:53:45,521][03784] Fps is (10 sec: 55705.7, 60 sec: 45329.3, 300 sec: 45430.9). Total num frames: 1530593280. Throughput: 0: 44757.7. Samples: 1531756000. Policy #0 lag: (min: 3.0, avg: 34.8, max: 76.0) [2024-03-21 07:53:45,522][03784] Avg episode reward: [(0, '1.342')] [2024-03-21 07:53:49,632][04017] Updated weights for policy 0, policy_version 46715 (0.0017) [2024-03-21 07:53:50,521][03784] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 45430.9). Total num frames: 1530789888. Throughput: 0: 44895.5. Samples: 1532033800. Policy #0 lag: (min: 3.0, avg: 34.8, max: 76.0) [2024-03-21 07:53:50,522][03784] Avg episode reward: [(0, '1.184')] [2024-03-21 07:53:54,937][04017] Updated weights for policy 0, policy_version 46725 (0.0019) [2024-03-21 07:53:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 1531084800. Throughput: 0: 45097.9. Samples: 1532292700. Policy #0 lag: (min: 3.0, avg: 34.8, max: 76.0) [2024-03-21 07:53:55,522][03784] Avg episode reward: [(0, '1.442')] [2024-03-21 07:54:00,521][03784] Fps is (10 sec: 49152.1, 60 sec: 45328.9, 300 sec: 45875.2). Total num frames: 1531281408. Throughput: 0: 45402.1. Samples: 1532435400. Policy #0 lag: (min: 3.0, avg: 34.8, max: 76.0) [2024-03-21 07:54:00,522][03784] Avg episode reward: [(0, '0.817')] [2024-03-21 07:54:00,536][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046731_1531281408.pth... [2024-03-21 07:54:00,657][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046398_1520369664.pth [2024-03-21 07:54:03,188][04017] Updated weights for policy 0, policy_version 46735 (0.0012) [2024-03-21 07:54:05,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 45764.1). 
Total num frames: 1531543552. Throughput: 0: 44833.3. Samples: 1532675700. Policy #0 lag: (min: 3.0, avg: 34.8, max: 76.0) [2024-03-21 07:54:05,522][03784] Avg episode reward: [(0, '1.366')] [2024-03-21 07:54:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 1531641856. Throughput: 0: 45511.2. Samples: 1532980600. Policy #0 lag: (min: 0.0, avg: 42.9, max: 109.0) [2024-03-21 07:54:10,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 07:54:11,812][04017] Updated weights for policy 0, policy_version 46745 (0.0017) [2024-03-21 07:54:15,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46967.4, 300 sec: 45430.9). Total num frames: 1531936768. Throughput: 0: 45457.7. Samples: 1533126000. Policy #0 lag: (min: 0.0, avg: 42.9, max: 109.0) [2024-03-21 07:54:15,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 07:54:19,160][04017] Updated weights for policy 0, policy_version 46755 (0.0013) [2024-03-21 07:54:20,521][03784] Fps is (10 sec: 49151.7, 60 sec: 44236.7, 300 sec: 46097.3). Total num frames: 1532133376. Throughput: 0: 45086.6. Samples: 1533406100. Policy #0 lag: (min: 0.0, avg: 42.9, max: 109.0) [2024-03-21 07:54:20,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 07:54:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 45653.1). Total num frames: 1532362752. Throughput: 0: 46026.8. Samples: 1533695500. Policy #0 lag: (min: 0.0, avg: 42.9, max: 109.0) [2024-03-21 07:54:25,522][03784] Avg episode reward: [(0, '0.687')] [2024-03-21 07:54:27,211][04017] Updated weights for policy 0, policy_version 46765 (0.0019) [2024-03-21 07:54:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 1532592128. Throughput: 0: 46177.6. Samples: 1533834000. Policy #0 lag: (min: 0.0, avg: 42.9, max: 109.0) [2024-03-21 07:54:30,522][03784] Avg episode reward: [(0, '1.539')] [2024-03-21 07:54:31,231][03995] Signal inference workers to stop experience collection... 
(30850 times) [2024-03-21 07:54:31,288][04017] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-03-21 07:54:31,529][03995] Signal inference workers to resume experience collection... (30850 times) [2024-03-21 07:54:31,529][04017] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-03-21 07:54:33,355][04017] Updated weights for policy 0, policy_version 46775 (0.0017) [2024-03-21 07:54:35,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1532755968. Throughput: 0: 45451.2. Samples: 1534079100. Policy #0 lag: (min: 0.0, avg: 42.9, max: 109.0) [2024-03-21 07:54:35,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 07:54:38,880][04017] Updated weights for policy 0, policy_version 46785 (0.0011) [2024-03-21 07:54:40,521][03784] Fps is (10 sec: 52429.4, 60 sec: 45875.3, 300 sec: 46208.4). Total num frames: 1533116416. Throughput: 0: 45764.4. Samples: 1534352100. Policy #0 lag: (min: 2.0, avg: 33.7, max: 66.0) [2024-03-21 07:54:40,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 07:54:44,189][04017] Updated weights for policy 0, policy_version 46795 (0.0014) [2024-03-21 07:54:45,521][03784] Fps is (10 sec: 62259.5, 60 sec: 46421.3, 300 sec: 46430.6). Total num frames: 1533378560. Throughput: 0: 45526.7. Samples: 1534484100. Policy #0 lag: (min: 2.0, avg: 33.7, max: 66.0) [2024-03-21 07:54:45,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 07:54:50,521][03784] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 1533476864. Throughput: 0: 47237.8. Samples: 1534801400. Policy #0 lag: (min: 2.0, avg: 33.7, max: 66.0) [2024-03-21 07:54:50,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 07:54:55,521][03784] Fps is (10 sec: 26214.3, 60 sec: 42598.4, 300 sec: 45097.7). Total num frames: 1533640704. Throughput: 0: 46886.7. Samples: 1535090500. 
Policy #0 lag: (min: 2.0, avg: 33.7, max: 66.0) [2024-03-21 07:54:55,522][03784] Avg episode reward: [(0, '1.358')] [2024-03-21 07:54:55,980][04017] Updated weights for policy 0, policy_version 46805 (0.0017) [2024-03-21 07:55:00,527][03784] Fps is (10 sec: 49124.3, 60 sec: 44778.8, 300 sec: 45207.9). Total num frames: 1533968384. Throughput: 0: 46423.1. Samples: 1535215300. Policy #0 lag: (min: 2.0, avg: 33.7, max: 66.0) [2024-03-21 07:55:00,527][03784] Avg episode reward: [(0, '0.872')] [2024-03-21 07:55:00,990][04017] Updated weights for policy 0, policy_version 46815 (0.0016) [2024-03-21 07:55:05,521][03784] Fps is (10 sec: 58982.7, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 1534230528. Throughput: 0: 46442.3. Samples: 1535496000. Policy #0 lag: (min: 2.0, avg: 33.7, max: 66.0) [2024-03-21 07:55:05,522][03784] Avg episode reward: [(0, '1.669')] [2024-03-21 07:55:07,333][04017] Updated weights for policy 0, policy_version 46825 (0.0014) [2024-03-21 07:55:10,521][03784] Fps is (10 sec: 52457.4, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 1534492672. Throughput: 0: 45977.6. Samples: 1535764500. Policy #0 lag: (min: 0.0, avg: 36.8, max: 83.0) [2024-03-21 07:55:10,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 07:55:13,710][04017] Updated weights for policy 0, policy_version 46835 (0.0013) [2024-03-21 07:55:15,521][03784] Fps is (10 sec: 52428.3, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1534754816. Throughput: 0: 45726.7. Samples: 1535891700. Policy #0 lag: (min: 0.0, avg: 36.8, max: 83.0) [2024-03-21 07:55:15,522][03784] Avg episode reward: [(0, '0.768')] [2024-03-21 07:55:20,521][03784] Fps is (10 sec: 42599.1, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 1534918656. Throughput: 0: 46924.5. Samples: 1536190700. 
Policy #0 lag: (min: 0.0, avg: 36.8, max: 83.0) [2024-03-21 07:55:20,522][03784] Avg episode reward: [(0, '0.873')] [2024-03-21 07:55:22,174][04017] Updated weights for policy 0, policy_version 46845 (0.0012) [2024-03-21 07:55:24,465][03995] Signal inference workers to stop experience collection... (30900 times) [2024-03-21 07:55:24,467][03995] Signal inference workers to resume experience collection... (30900 times) [2024-03-21 07:55:24,543][04017] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-03-21 07:55:24,543][04017] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-03-21 07:55:25,521][03784] Fps is (10 sec: 45875.7, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 1535213568. Throughput: 0: 47117.8. Samples: 1536472400. Policy #0 lag: (min: 0.0, avg: 36.8, max: 83.0) [2024-03-21 07:55:25,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 07:55:26,614][04017] Updated weights for policy 0, policy_version 46855 (0.0016) [2024-03-21 07:55:30,521][03784] Fps is (10 sec: 65535.8, 60 sec: 49698.2, 300 sec: 46319.5). Total num frames: 1535574016. Throughput: 0: 47102.2. Samples: 1536603700. Policy #0 lag: (min: 0.0, avg: 36.8, max: 83.0) [2024-03-21 07:55:30,522][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 07:55:35,521][03784] Fps is (10 sec: 42597.9, 60 sec: 48059.7, 300 sec: 45653.0). Total num frames: 1535639552. Throughput: 0: 47131.0. Samples: 1536922300. Policy #0 lag: (min: 0.0, avg: 36.8, max: 83.0) [2024-03-21 07:55:35,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 07:55:35,711][04017] Updated weights for policy 0, policy_version 46865 (0.0023) [2024-03-21 07:55:40,521][03784] Fps is (10 sec: 16384.0, 60 sec: 43690.6, 300 sec: 45764.1). Total num frames: 1535737856. Throughput: 0: 47126.6. Samples: 1537211200. 
Policy #0 lag: (min: 0.0, avg: 39.8, max: 89.0) [2024-03-21 07:55:40,522][03784] Avg episode reward: [(0, '1.611')] [2024-03-21 07:55:44,485][04017] Updated weights for policy 0, policy_version 46875 (0.0019) [2024-03-21 07:55:45,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 46097.4). Total num frames: 1536032768. Throughput: 0: 47392.6. Samples: 1537347700. Policy #0 lag: (min: 0.0, avg: 39.8, max: 89.0) [2024-03-21 07:55:45,522][03784] Avg episode reward: [(0, '1.480')] [2024-03-21 07:55:49,331][04017] Updated weights for policy 0, policy_version 46885 (0.0011) [2024-03-21 07:55:50,521][03784] Fps is (10 sec: 65535.8, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 1536393216. Throughput: 0: 47173.2. Samples: 1537618800. Policy #0 lag: (min: 0.0, avg: 39.8, max: 89.0) [2024-03-21 07:55:50,522][03784] Avg episode reward: [(0, '0.494')] [2024-03-21 07:55:54,456][04017] Updated weights for policy 0, policy_version 46895 (0.0012) [2024-03-21 07:55:55,521][03784] Fps is (10 sec: 62259.7, 60 sec: 50244.3, 300 sec: 46652.8). Total num frames: 1536655360. Throughput: 0: 47435.8. Samples: 1537899100. Policy #0 lag: (min: 0.0, avg: 39.8, max: 89.0) [2024-03-21 07:55:55,522][03784] Avg episode reward: [(0, '0.633')] [2024-03-21 07:56:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 48064.2, 300 sec: 45764.1). Total num frames: 1536851968. Throughput: 0: 47942.2. Samples: 1538049100. Policy #0 lag: (min: 0.0, avg: 39.8, max: 89.0) [2024-03-21 07:56:00,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 07:56:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046901_1536851968.pth... [2024-03-21 07:56:00,691][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046564_1525809152.pth [2024-03-21 07:56:03,853][04017] Updated weights for policy 0, policy_version 46905 (0.0011) [2024-03-21 07:56:05,521][03784] Fps is (10 sec: 42597.2, 60 sec: 47513.4, 300 sec: 45653.0). 
Total num frames: 1537081344. Throughput: 0: 47575.3. Samples: 1538331600. Policy #0 lag: (min: 0.0, avg: 39.8, max: 89.0) [2024-03-21 07:56:05,523][03784] Avg episode reward: [(0, '1.325')] [2024-03-21 07:56:08,465][04017] Updated weights for policy 0, policy_version 46915 (0.0017) [2024-03-21 07:56:10,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48059.8, 300 sec: 45986.3). Total num frames: 1537376256. Throughput: 0: 47457.6. Samples: 1538608000. Policy #0 lag: (min: 2.0, avg: 40.7, max: 77.0) [2024-03-21 07:56:10,522][03784] Avg episode reward: [(0, '0.471')] [2024-03-21 07:56:12,143][03995] Signal inference workers to stop experience collection... (30950 times) [2024-03-21 07:56:12,151][03995] Signal inference workers to resume experience collection... (30950 times) [2024-03-21 07:56:12,240][04017] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-03-21 07:56:12,240][04017] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-03-21 07:56:15,521][03784] Fps is (10 sec: 42599.3, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 1537507328. Throughput: 0: 47926.7. Samples: 1538760400. Policy #0 lag: (min: 2.0, avg: 40.7, max: 77.0) [2024-03-21 07:56:15,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 07:56:16,843][04017] Updated weights for policy 0, policy_version 46925 (0.0017) [2024-03-21 07:56:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1537769472. Throughput: 0: 46995.6. Samples: 1539037100. Policy #0 lag: (min: 2.0, avg: 40.7, max: 77.0) [2024-03-21 07:56:20,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 07:56:24,138][04017] Updated weights for policy 0, policy_version 46935 (0.0012) [2024-03-21 07:56:25,521][03784] Fps is (10 sec: 52428.4, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1538031616. Throughput: 0: 46331.1. Samples: 1539296100. 
Policy #0 lag: (min: 2.0, avg: 40.7, max: 77.0) [2024-03-21 07:56:25,522][03784] Avg episode reward: [(0, '0.811')] [2024-03-21 07:56:30,521][03784] Fps is (10 sec: 26214.3, 60 sec: 40960.0, 300 sec: 45208.7). Total num frames: 1538031616. Throughput: 0: 46315.5. Samples: 1539431900. Policy #0 lag: (min: 2.0, avg: 40.7, max: 77.0) [2024-03-21 07:56:30,522][03784] Avg episode reward: [(0, '1.534')] [2024-03-21 07:56:34,211][04017] Updated weights for policy 0, policy_version 46945 (0.0017) [2024-03-21 07:56:35,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1538392064. Throughput: 0: 45448.9. Samples: 1539664000. Policy #0 lag: (min: 2.0, avg: 40.7, max: 77.0) [2024-03-21 07:56:35,522][03784] Avg episode reward: [(0, '0.919')] [2024-03-21 07:56:38,613][04017] Updated weights for policy 0, policy_version 46955 (0.0016) [2024-03-21 07:56:40,521][03784] Fps is (10 sec: 72090.2, 60 sec: 50244.3, 300 sec: 45986.3). Total num frames: 1538752512. Throughput: 0: 44775.5. Samples: 1539914000. Policy #0 lag: (min: 1.0, avg: 33.9, max: 67.0) [2024-03-21 07:56:40,522][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 07:56:44,460][04017] Updated weights for policy 0, policy_version 46965 (0.0012) [2024-03-21 07:56:45,521][03784] Fps is (10 sec: 55705.9, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 1538949120. Throughput: 0: 44582.3. Samples: 1540055300. Policy #0 lag: (min: 1.0, avg: 33.9, max: 67.0) [2024-03-21 07:56:45,522][03784] Avg episode reward: [(0, '1.226')] [2024-03-21 07:56:50,521][03784] Fps is (10 sec: 32767.9, 60 sec: 44783.0, 300 sec: 46097.3). Total num frames: 1539080192. Throughput: 0: 44982.4. Samples: 1540355800. 
Policy #0 lag: (min: 1.0, avg: 33.9, max: 67.0) [2024-03-21 07:56:50,522][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 07:56:52,533][04017] Updated weights for policy 0, policy_version 46975 (0.0010) [2024-03-21 07:56:55,521][03784] Fps is (10 sec: 39321.1, 60 sec: 44782.8, 300 sec: 45986.3). Total num frames: 1539342336. Throughput: 0: 44886.6. Samples: 1540627900. Policy #0 lag: (min: 1.0, avg: 33.9, max: 67.0) [2024-03-21 07:56:55,522][03784] Avg episode reward: [(0, '1.339')] [2024-03-21 07:56:58,538][04017] Updated weights for policy 0, policy_version 46985 (0.0018) [2024-03-21 07:57:00,521][03784] Fps is (10 sec: 62259.8, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 1539702784. Throughput: 0: 44257.8. Samples: 1540752000. Policy #0 lag: (min: 1.0, avg: 33.9, max: 67.0) [2024-03-21 07:57:00,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 07:57:05,521][03784] Fps is (10 sec: 42599.1, 60 sec: 44783.1, 300 sec: 45986.3). Total num frames: 1539768320. Throughput: 0: 44237.9. Samples: 1541027800. Policy #0 lag: (min: 1.0, avg: 33.9, max: 67.0) [2024-03-21 07:57:05,522][03784] Avg episode reward: [(0, '1.479')] [2024-03-21 07:57:09,703][03995] Signal inference workers to stop experience collection... (31000 times) [2024-03-21 07:57:09,778][03995] Signal inference workers to resume experience collection... (31000 times) [2024-03-21 07:57:09,788][04017] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-03-21 07:57:09,835][04017] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-03-21 07:57:10,146][04017] Updated weights for policy 0, policy_version 46995 (0.0012) [2024-03-21 07:57:10,521][03784] Fps is (10 sec: 22937.2, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 1539932160. Throughput: 0: 44713.3. Samples: 1541308200. 
Policy #0 lag: (min: 0.0, avg: 31.6, max: 65.0) [2024-03-21 07:57:10,522][03784] Avg episode reward: [(0, '1.336')] [2024-03-21 07:57:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 43144.5, 300 sec: 45208.7). Total num frames: 1540096000. Throughput: 0: 44726.7. Samples: 1541444600. Policy #0 lag: (min: 0.0, avg: 31.6, max: 65.0) [2024-03-21 07:57:15,522][03784] Avg episode reward: [(0, '0.735')] [2024-03-21 07:57:18,470][04017] Updated weights for policy 0, policy_version 47005 (0.0015) [2024-03-21 07:57:20,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42052.3, 300 sec: 45542.0). Total num frames: 1540292608. Throughput: 0: 45488.9. Samples: 1541711000. Policy #0 lag: (min: 0.0, avg: 31.6, max: 65.0) [2024-03-21 07:57:20,522][03784] Avg episode reward: [(0, '1.400')] [2024-03-21 07:57:25,187][04017] Updated weights for policy 0, policy_version 47015 (0.0021) [2024-03-21 07:57:25,521][03784] Fps is (10 sec: 52428.6, 60 sec: 43144.5, 300 sec: 45542.0). Total num frames: 1540620288. Throughput: 0: 45884.4. Samples: 1541978800. Policy #0 lag: (min: 0.0, avg: 31.6, max: 65.0) [2024-03-21 07:57:25,522][03784] Avg episode reward: [(0, '1.180')] [2024-03-21 07:57:28,842][04017] Updated weights for policy 0, policy_version 47025 (0.0021) [2024-03-21 07:57:30,521][03784] Fps is (10 sec: 65536.5, 60 sec: 48606.0, 300 sec: 45653.1). Total num frames: 1540947968. Throughput: 0: 45155.6. Samples: 1542087300. Policy #0 lag: (min: 0.0, avg: 31.6, max: 65.0) [2024-03-21 07:57:30,522][03784] Avg episode reward: [(0, '1.547')] [2024-03-21 07:57:35,521][03784] Fps is (10 sec: 55705.7, 60 sec: 46421.3, 300 sec: 46097.3). Total num frames: 1541177344. Throughput: 0: 44382.2. Samples: 1542353000. 
Policy #0 lag: (min: 0.0, avg: 31.6, max: 65.0) [2024-03-21 07:57:35,522][03784] Avg episode reward: [(0, '1.565')] [2024-03-21 07:57:36,816][04017] Updated weights for policy 0, policy_version 47035 (0.0011) [2024-03-21 07:57:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1541406720. Throughput: 0: 44451.2. Samples: 1542628200. Policy #0 lag: (min: 1.0, avg: 29.5, max: 58.0) [2024-03-21 07:57:40,522][03784] Avg episode reward: [(0, '1.493')] [2024-03-21 07:57:42,961][04017] Updated weights for policy 0, policy_version 47045 (0.0023) [2024-03-21 07:57:45,521][03784] Fps is (10 sec: 52429.3, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 1541701632. Throughput: 0: 44495.5. Samples: 1542754300. Policy #0 lag: (min: 1.0, avg: 29.5, max: 58.0) [2024-03-21 07:57:45,522][03784] Avg episode reward: [(0, '1.145')] [2024-03-21 07:57:47,452][04017] Updated weights for policy 0, policy_version 47055 (0.0023) [2024-03-21 07:57:50,521][03784] Fps is (10 sec: 52428.5, 60 sec: 47513.5, 300 sec: 45986.3). Total num frames: 1541931008. Throughput: 0: 44031.0. Samples: 1543009200. Policy #0 lag: (min: 1.0, avg: 29.5, max: 58.0) [2024-03-21 07:57:50,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 07:57:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 1542127616. Throughput: 0: 42851.2. Samples: 1543236500. Policy #0 lag: (min: 1.0, avg: 29.5, max: 58.0) [2024-03-21 07:57:55,522][03784] Avg episode reward: [(0, '1.288')] [2024-03-21 07:57:59,874][04017] Updated weights for policy 0, policy_version 47065 (0.0014) [2024-03-21 07:57:59,922][03995] Signal inference workers to stop experience collection... (31050 times) [2024-03-21 07:57:59,922][03995] Signal inference workers to resume experience collection... 
(31050 times) [2024-03-21 07:57:59,993][04017] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-03-21 07:57:59,993][04017] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-03-21 07:58:00,521][03784] Fps is (10 sec: 32768.2, 60 sec: 42598.3, 300 sec: 45542.0). Total num frames: 1542258688. Throughput: 0: 42755.5. Samples: 1543368600. Policy #0 lag: (min: 1.0, avg: 29.5, max: 58.0) [2024-03-21 07:58:00,522][03784] Avg episode reward: [(0, '1.729')] [2024-03-21 07:58:00,584][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047067_1542291456.pth... [2024-03-21 07:58:00,719][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046731_1531281408.pth [2024-03-21 07:58:05,521][03784] Fps is (10 sec: 29491.4, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 1542422528. Throughput: 0: 43153.5. Samples: 1543652900. Policy #0 lag: (min: 1.0, avg: 29.5, max: 58.0) [2024-03-21 07:58:05,522][03784] Avg episode reward: [(0, '1.097')] [2024-03-21 07:58:06,547][04017] Updated weights for policy 0, policy_version 47075 (0.0017) [2024-03-21 07:58:10,521][03784] Fps is (10 sec: 32768.0, 60 sec: 44236.9, 300 sec: 45653.0). Total num frames: 1542586368. Throughput: 0: 43722.3. Samples: 1543946300. Policy #0 lag: (min: 0.0, avg: 48.0, max: 111.0) [2024-03-21 07:58:10,522][03784] Avg episode reward: [(0, '1.534')] [2024-03-21 07:58:15,521][03784] Fps is (10 sec: 32767.5, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 1542750208. Throughput: 0: 44584.3. Samples: 1544093600. Policy #0 lag: (min: 0.0, avg: 48.0, max: 111.0) [2024-03-21 07:58:15,522][03784] Avg episode reward: [(0, '0.669')] [2024-03-21 07:58:17,511][04017] Updated weights for policy 0, policy_version 47085 (0.0015) [2024-03-21 07:58:20,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1543012352. Throughput: 0: 44577.7. Samples: 1544359000. 
Policy #0 lag: (min: 0.0, avg: 48.0, max: 111.0) [2024-03-21 07:58:20,522][03784] Avg episode reward: [(0, '0.854')] [2024-03-21 07:58:24,416][04017] Updated weights for policy 0, policy_version 47095 (0.0018) [2024-03-21 07:58:25,521][03784] Fps is (10 sec: 52429.0, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1543274496. Throughput: 0: 44471.1. Samples: 1544629400. Policy #0 lag: (min: 0.0, avg: 48.0, max: 111.0) [2024-03-21 07:58:25,522][03784] Avg episode reward: [(0, '0.949')] [2024-03-21 07:58:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 42052.2, 300 sec: 45541.9). Total num frames: 1543471104. Throughput: 0: 44782.1. Samples: 1544769500. Policy #0 lag: (min: 0.0, avg: 48.0, max: 111.0) [2024-03-21 07:58:30,522][03784] Avg episode reward: [(0, '1.108')] [2024-03-21 07:58:31,033][04017] Updated weights for policy 0, policy_version 47105 (0.0012) [2024-03-21 07:58:35,524][03784] Fps is (10 sec: 45863.8, 60 sec: 42596.6, 300 sec: 45319.4). Total num frames: 1543733248. Throughput: 0: 45199.8. Samples: 1545043300. Policy #0 lag: (min: 0.0, avg: 48.0, max: 111.0) [2024-03-21 07:58:35,525][03784] Avg episode reward: [(0, '1.066')] [2024-03-21 07:58:37,173][04017] Updated weights for policy 0, policy_version 47115 (0.0017) [2024-03-21 07:58:40,521][03784] Fps is (10 sec: 55706.0, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 1544028160. Throughput: 0: 46182.2. Samples: 1545314700. Policy #0 lag: (min: 2.0, avg: 32.2, max: 61.0) [2024-03-21 07:58:40,522][03784] Avg episode reward: [(0, '1.369')] [2024-03-21 07:58:43,273][04017] Updated weights for policy 0, policy_version 47125 (0.0017) [2024-03-21 07:58:45,521][03784] Fps is (10 sec: 49164.7, 60 sec: 42052.3, 300 sec: 45542.0). Total num frames: 1544224768. Throughput: 0: 46273.4. Samples: 1545450900. 
Policy #0 lag: (min: 2.0, avg: 32.2, max: 61.0) [2024-03-21 07:58:45,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 07:58:49,676][04017] Updated weights for policy 0, policy_version 47135 (0.0017) [2024-03-21 07:58:50,521][03784] Fps is (10 sec: 55705.5, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1544585216. Throughput: 0: 45866.6. Samples: 1545716900. Policy #0 lag: (min: 2.0, avg: 32.2, max: 61.0) [2024-03-21 07:58:50,522][03784] Avg episode reward: [(0, '0.640')] [2024-03-21 07:58:51,238][03995] Signal inference workers to stop experience collection... (31100 times) [2024-03-21 07:58:51,239][03995] Signal inference workers to resume experience collection... (31100 times) [2024-03-21 07:58:51,308][04017] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-03-21 07:58:51,309][04017] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-03-21 07:58:54,208][04017] Updated weights for policy 0, policy_version 47145 (0.0017) [2024-03-21 07:58:55,521][03784] Fps is (10 sec: 72089.7, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 1544945664. Throughput: 0: 45213.4. Samples: 1545980900. Policy #0 lag: (min: 2.0, avg: 32.2, max: 61.0) [2024-03-21 07:58:55,522][03784] Avg episode reward: [(0, '0.833')] [2024-03-21 07:59:00,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1545011200. Throughput: 0: 45246.8. Samples: 1546129700. Policy #0 lag: (min: 2.0, avg: 32.2, max: 61.0) [2024-03-21 07:59:00,522][03784] Avg episode reward: [(0, '0.653')] [2024-03-21 07:59:04,602][04017] Updated weights for policy 0, policy_version 47155 (0.0017) [2024-03-21 07:59:05,521][03784] Fps is (10 sec: 26214.3, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1545207808. Throughput: 0: 45993.4. Samples: 1546428700. 
Policy #0 lag: (min: 2.0, avg: 32.2, max: 61.0) [2024-03-21 07:59:05,522][03784] Avg episode reward: [(0, '1.438')] [2024-03-21 07:59:10,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1545404416. Throughput: 0: 46620.0. Samples: 1546727300. Policy #0 lag: (min: 2.0, avg: 46.1, max: 107.0) [2024-03-21 07:59:10,522][03784] Avg episode reward: [(0, '0.693')] [2024-03-21 07:59:11,617][04017] Updated weights for policy 0, policy_version 47165 (0.0011) [2024-03-21 07:59:15,521][03784] Fps is (10 sec: 39321.3, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 1545601024. Throughput: 0: 46431.1. Samples: 1546858900. Policy #0 lag: (min: 2.0, avg: 46.1, max: 107.0) [2024-03-21 07:59:15,522][03784] Avg episode reward: [(0, '0.693')] [2024-03-21 07:59:18,425][04017] Updated weights for policy 0, policy_version 47175 (0.0019) [2024-03-21 07:59:20,521][03784] Fps is (10 sec: 58982.9, 60 sec: 49698.2, 300 sec: 46208.4). Total num frames: 1545994240. Throughput: 0: 46376.0. Samples: 1547130100. Policy #0 lag: (min: 2.0, avg: 46.1, max: 107.0) [2024-03-21 07:59:20,522][03784] Avg episode reward: [(0, '0.973')] [2024-03-21 07:59:25,521][03784] Fps is (10 sec: 45875.7, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 1546059776. Throughput: 0: 46997.8. Samples: 1547429600. Policy #0 lag: (min: 2.0, avg: 46.1, max: 107.0) [2024-03-21 07:59:25,522][03784] Avg episode reward: [(0, '1.380')] [2024-03-21 07:59:26,469][04017] Updated weights for policy 0, policy_version 47185 (0.0012) [2024-03-21 07:59:30,521][03784] Fps is (10 sec: 26214.3, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 1546256384. Throughput: 0: 47088.8. Samples: 1547569900. 
Policy #0 lag: (min: 2.0, avg: 46.1, max: 107.0) [2024-03-21 07:59:30,522][03784] Avg episode reward: [(0, '0.993')] [2024-03-21 07:59:32,940][04017] Updated weights for policy 0, policy_version 47195 (0.0021) [2024-03-21 07:59:35,521][03784] Fps is (10 sec: 45874.9, 60 sec: 46423.3, 300 sec: 45430.9). Total num frames: 1546518528. Throughput: 0: 47528.9. Samples: 1547855700. Policy #0 lag: (min: 2.0, avg: 46.1, max: 107.0) [2024-03-21 07:59:35,522][03784] Avg episode reward: [(0, '0.993')] [2024-03-21 07:59:40,139][04017] Updated weights for policy 0, policy_version 47205 (0.0014) [2024-03-21 07:59:40,521][03784] Fps is (10 sec: 58982.5, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1546846208. Throughput: 0: 47842.2. Samples: 1548133800. Policy #0 lag: (min: 3.0, avg: 46.8, max: 88.0) [2024-03-21 07:59:40,522][03784] Avg episode reward: [(0, '1.255')] [2024-03-21 07:59:41,677][03995] Signal inference workers to stop experience collection... (31150 times) [2024-03-21 07:59:41,678][03995] Signal inference workers to resume experience collection... (31150 times) [2024-03-21 07:59:41,866][04017] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-03-21 07:59:41,870][04017] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-03-21 07:59:45,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.2, 300 sec: 45875.2). Total num frames: 1547010048. Throughput: 0: 47571.0. Samples: 1548270400. Policy #0 lag: (min: 3.0, avg: 46.8, max: 88.0) [2024-03-21 07:59:45,522][03784] Avg episode reward: [(0, '1.213')] [2024-03-21 07:59:49,758][04017] Updated weights for policy 0, policy_version 47215 (0.0010) [2024-03-21 07:59:50,521][03784] Fps is (10 sec: 32767.9, 60 sec: 43144.6, 300 sec: 45875.2). Total num frames: 1547173888. Throughput: 0: 47457.8. Samples: 1548564300. 
Policy #0 lag: (min: 3.0, avg: 46.8, max: 88.0) [2024-03-21 07:59:50,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 07:59:55,010][04017] Updated weights for policy 0, policy_version 47225 (0.0014) [2024-03-21 07:59:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 42598.3, 300 sec: 45876.1). Total num frames: 1547501568. Throughput: 0: 46451.0. Samples: 1548817600. Policy #0 lag: (min: 3.0, avg: 46.8, max: 88.0) [2024-03-21 07:59:55,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 07:59:58,774][04017] Updated weights for policy 0, policy_version 47235 (0.0023) [2024-03-21 08:00:00,521][03784] Fps is (10 sec: 68812.1, 60 sec: 47513.5, 300 sec: 46208.4). Total num frames: 1547862016. Throughput: 0: 46417.8. Samples: 1548947700. Policy #0 lag: (min: 3.0, avg: 46.8, max: 88.0) [2024-03-21 08:00:00,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 08:00:00,656][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047238_1547894784.pth... [2024-03-21 08:00:00,795][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000046901_1536851968.pth [2024-03-21 08:00:05,521][03784] Fps is (10 sec: 52429.4, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 1548025856. Throughput: 0: 47075.5. Samples: 1549248500. Policy #0 lag: (min: 3.0, avg: 46.8, max: 88.0) [2024-03-21 08:00:05,522][03784] Avg episode reward: [(0, '1.055')] [2024-03-21 08:00:06,228][04017] Updated weights for policy 0, policy_version 47245 (0.0020) [2024-03-21 08:00:10,521][03784] Fps is (10 sec: 52429.4, 60 sec: 49698.2, 300 sec: 46208.4). Total num frames: 1548386304. Throughput: 0: 45913.3. Samples: 1549495700. Policy #0 lag: (min: 0.0, avg: 50.3, max: 130.0) [2024-03-21 08:00:10,522][03784] Avg episode reward: [(0, '1.099')] [2024-03-21 08:00:14,350][04017] Updated weights for policy 0, policy_version 47255 (0.0019) [2024-03-21 08:00:15,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48606.0, 300 sec: 46097.4). 
Total num frames: 1548517376. Throughput: 0: 46331.1. Samples: 1549654800. Policy #0 lag: (min: 0.0, avg: 50.3, max: 130.0) [2024-03-21 08:00:15,522][03784] Avg episode reward: [(0, '1.097')] [2024-03-21 08:00:20,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 1548746752. Throughput: 0: 46411.1. Samples: 1549944200. Policy #0 lag: (min: 0.0, avg: 50.3, max: 130.0) [2024-03-21 08:00:20,523][03784] Avg episode reward: [(0, '0.648')] [2024-03-21 08:00:22,191][04017] Updated weights for policy 0, policy_version 47265 (0.0011) [2024-03-21 08:00:25,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 1548845056. Throughput: 0: 46308.9. Samples: 1550217700. Policy #0 lag: (min: 0.0, avg: 50.3, max: 130.0) [2024-03-21 08:00:25,522][03784] Avg episode reward: [(0, '0.827')] [2024-03-21 08:00:30,521][03784] Fps is (10 sec: 32768.2, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 1549074432. Throughput: 0: 46304.6. Samples: 1550354100. Policy #0 lag: (min: 0.0, avg: 50.3, max: 130.0) [2024-03-21 08:00:30,522][03784] Avg episode reward: [(0, '1.804')] [2024-03-21 08:00:30,645][04017] Updated weights for policy 0, policy_version 47275 (0.0011) [2024-03-21 08:00:35,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 1549336576. Throughput: 0: 45882.2. Samples: 1550629000. Policy #0 lag: (min: 0.0, avg: 50.3, max: 130.0) [2024-03-21 08:00:35,522][03784] Avg episode reward: [(0, '1.612')] [2024-03-21 08:00:36,378][04017] Updated weights for policy 0, policy_version 47285 (0.0013) [2024-03-21 08:00:38,694][03995] Signal inference workers to stop experience collection... (31200 times) [2024-03-21 08:00:38,761][04017] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-03-21 08:00:38,970][03995] Signal inference workers to resume experience collection... 
(31200 times) [2024-03-21 08:00:38,971][04017] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-03-21 08:00:40,521][03784] Fps is (10 sec: 58981.5, 60 sec: 46967.3, 300 sec: 46208.4). Total num frames: 1549664256. Throughput: 0: 46086.7. Samples: 1550891500. Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 08:00:40,522][03784] Avg episode reward: [(0, '1.272')] [2024-03-21 08:00:45,521][03784] Fps is (10 sec: 36044.2, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 1549697024. Throughput: 0: 46362.2. Samples: 1551034000. Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 08:00:45,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 08:00:46,183][04017] Updated weights for policy 0, policy_version 47295 (0.0016) [2024-03-21 08:00:50,521][03784] Fps is (10 sec: 29491.8, 60 sec: 46421.4, 300 sec: 45097.7). Total num frames: 1549959168. Throughput: 0: 45813.4. Samples: 1551310100. Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 08:00:50,522][03784] Avg episode reward: [(0, '0.927')] [2024-03-21 08:00:51,641][04017] Updated weights for policy 0, policy_version 47305 (0.0018) [2024-03-21 08:00:55,521][03784] Fps is (10 sec: 55705.8, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1550254080. Throughput: 0: 46255.4. Samples: 1551577200. Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 08:00:55,522][03784] Avg episode reward: [(0, '1.586')] [2024-03-21 08:00:57,023][04017] Updated weights for policy 0, policy_version 47315 (0.0013) [2024-03-21 08:01:00,521][03784] Fps is (10 sec: 58982.4, 60 sec: 44783.1, 300 sec: 45653.1). Total num frames: 1550548992. Throughput: 0: 45386.7. Samples: 1551697200. 
Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 08:01:00,522][03784] Avg episode reward: [(0, '0.774')] [2024-03-21 08:01:03,648][04017] Updated weights for policy 0, policy_version 47325 (0.0011) [2024-03-21 08:01:05,521][03784] Fps is (10 sec: 62259.9, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 1550876672. Throughput: 0: 44851.1. Samples: 1551962500. Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 08:01:05,522][03784] Avg episode reward: [(0, '1.276')] [2024-03-21 08:01:10,521][03784] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 45653.0). Total num frames: 1550974976. Throughput: 0: 44913.2. Samples: 1552238800. Policy #0 lag: (min: 3.0, avg: 28.2, max: 58.0) [2024-03-21 08:01:10,522][03784] Avg episode reward: [(0, '1.162')] [2024-03-21 08:01:12,600][04017] Updated weights for policy 0, policy_version 47335 (0.0011) [2024-03-21 08:01:15,521][03784] Fps is (10 sec: 36045.0, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 1551237120. Throughput: 0: 44742.2. Samples: 1552367500. Policy #0 lag: (min: 0.0, avg: 44.8, max: 81.0) [2024-03-21 08:01:15,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 08:01:20,521][03784] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1551368192. Throughput: 0: 44468.8. Samples: 1552630100. Policy #0 lag: (min: 0.0, avg: 44.8, max: 81.0) [2024-03-21 08:01:20,522][03784] Avg episode reward: [(0, '1.446')] [2024-03-21 08:01:20,846][04017] Updated weights for policy 0, policy_version 47345 (0.0016) [2024-03-21 08:01:25,440][03995] Signal inference workers to stop experience collection... (31250 times) [2024-03-21 08:01:25,501][03995] Signal inference workers to resume experience collection... (31250 times) [2024-03-21 08:01:25,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 1551630336. Throughput: 0: 44702.3. Samples: 1552903100. 
Policy #0 lag: (min: 0.0, avg: 44.8, max: 81.0) [2024-03-21 08:01:25,523][04017] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-03-21 08:01:25,522][03784] Avg episode reward: [(0, '0.628')] [2024-03-21 08:01:25,592][04017] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-03-21 08:01:28,461][04017] Updated weights for policy 0, policy_version 47355 (0.0016) [2024-03-21 08:01:30,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1551892480. Throughput: 0: 44860.1. Samples: 1553052700. Policy #0 lag: (min: 0.0, avg: 44.8, max: 81.0) [2024-03-21 08:01:30,522][03784] Avg episode reward: [(0, '0.758')] [2024-03-21 08:01:35,521][03784] Fps is (10 sec: 32768.0, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1551958016. Throughput: 0: 45122.1. Samples: 1553340600. Policy #0 lag: (min: 0.0, avg: 44.8, max: 81.0) [2024-03-21 08:01:35,522][03784] Avg episode reward: [(0, '1.571')] [2024-03-21 08:01:36,269][04017] Updated weights for policy 0, policy_version 47365 (0.0014) [2024-03-21 08:01:40,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1552351232. Throughput: 0: 44744.5. Samples: 1553590700. Policy #0 lag: (min: 0.0, avg: 44.8, max: 81.0) [2024-03-21 08:01:40,522][03784] Avg episode reward: [(0, '1.571')] [2024-03-21 08:01:40,806][04017] Updated weights for policy 0, policy_version 47375 (0.0018) [2024-03-21 08:01:45,521][03784] Fps is (10 sec: 65535.6, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 1552613376. Throughput: 0: 44786.5. Samples: 1553712600. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 08:01:45,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 08:01:48,366][04017] Updated weights for policy 0, policy_version 47385 (0.0011) [2024-03-21 08:01:50,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46967.3, 300 sec: 45542.0). Total num frames: 1552777216. Throughput: 0: 45148.8. Samples: 1553994200. 
Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 08:01:50,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 08:01:55,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1552875520. Throughput: 0: 45786.8. Samples: 1554299200. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 08:01:55,522][03784] Avg episode reward: [(0, '0.773')] [2024-03-21 08:01:57,618][04017] Updated weights for policy 0, policy_version 47395 (0.0022) [2024-03-21 08:02:00,521][03784] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1553235968. Throughput: 0: 45897.7. Samples: 1554432900. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 08:02:00,522][03784] Avg episode reward: [(0, '1.324')] [2024-03-21 08:02:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047401_1553235968.pth... [2024-03-21 08:02:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047067_1542291456.pth [2024-03-21 08:02:03,570][04017] Updated weights for policy 0, policy_version 47405 (0.0011) [2024-03-21 08:02:05,521][03784] Fps is (10 sec: 62259.2, 60 sec: 43690.6, 300 sec: 45986.3). Total num frames: 1553498112. Throughput: 0: 46024.4. Samples: 1554701200. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 08:02:05,522][03784] Avg episode reward: [(0, '1.324')] [2024-03-21 08:02:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 45986.3). Total num frames: 1553661952. Throughput: 0: 45995.5. Samples: 1554972900. Policy #0 lag: (min: 2.0, avg: 38.6, max: 79.0) [2024-03-21 08:02:10,522][03784] Avg episode reward: [(0, '1.324')] [2024-03-21 08:02:11,824][04017] Updated weights for policy 0, policy_version 47415 (0.0012) [2024-03-21 08:02:15,521][03784] Fps is (10 sec: 39322.3, 60 sec: 44236.9, 300 sec: 46097.4). Total num frames: 1553891328. Throughput: 0: 45469.1. Samples: 1555098800. 
Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 08:02:15,521][03784] Avg episode reward: [(0, '0.799')] [2024-03-21 08:02:19,884][04017] Updated weights for policy 0, policy_version 47425 (0.0011) [2024-03-21 08:02:20,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1554055168. Throughput: 0: 45440.0. Samples: 1555385400. Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 08:02:20,522][03784] Avg episode reward: [(0, '1.208')] [2024-03-21 08:02:21,508][03995] Signal inference workers to stop experience collection... (31300 times) [2024-03-21 08:02:21,508][03995] Signal inference workers to resume experience collection... (31300 times) [2024-03-21 08:02:21,584][04017] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-03-21 08:02:21,585][04017] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-03-21 08:02:23,414][04017] Updated weights for policy 0, policy_version 47435 (0.0021) [2024-03-21 08:02:25,521][03784] Fps is (10 sec: 58981.8, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 1554481152. Throughput: 0: 45431.2. Samples: 1555635100. Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 08:02:25,522][03784] Avg episode reward: [(0, '1.018')] [2024-03-21 08:02:30,521][03784] Fps is (10 sec: 55705.5, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 1554612224. Throughput: 0: 45591.2. Samples: 1555764200. Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 08:02:30,522][03784] Avg episode reward: [(0, '1.134')] [2024-03-21 08:02:31,651][04017] Updated weights for policy 0, policy_version 47445 (0.0011) [2024-03-21 08:02:35,521][03784] Fps is (10 sec: 39321.3, 60 sec: 48605.8, 300 sec: 45653.0). Total num frames: 1554874368. Throughput: 0: 45268.9. Samples: 1556031300. 
Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 08:02:35,522][03784] Avg episode reward: [(0, '1.685')] [2024-03-21 08:02:39,499][04017] Updated weights for policy 0, policy_version 47455 (0.0016) [2024-03-21 08:02:40,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 1555070976. Throughput: 0: 44360.1. Samples: 1556295400. Policy #0 lag: (min: 0.0, avg: 30.3, max: 63.0) [2024-03-21 08:02:40,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 08:02:43,996][04017] Updated weights for policy 0, policy_version 47465 (0.0022) [2024-03-21 08:02:45,521][03784] Fps is (10 sec: 52428.2, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1555398656. Throughput: 0: 44075.4. Samples: 1556416300. Policy #0 lag: (min: 3.0, avg: 45.9, max: 89.0) [2024-03-21 08:02:45,522][03784] Avg episode reward: [(0, '0.574')] [2024-03-21 08:02:50,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1555529728. Throughput: 0: 44304.5. Samples: 1556694900. Policy #0 lag: (min: 3.0, avg: 45.9, max: 89.0) [2024-03-21 08:02:50,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 08:02:51,810][04017] Updated weights for policy 0, policy_version 47475 (0.0011) [2024-03-21 08:02:55,521][03784] Fps is (10 sec: 36045.4, 60 sec: 48059.8, 300 sec: 45764.1). Total num frames: 1555759104. Throughput: 0: 44575.6. Samples: 1556978800. Policy #0 lag: (min: 3.0, avg: 45.9, max: 89.0) [2024-03-21 08:02:55,522][03784] Avg episode reward: [(0, '1.308')] [2024-03-21 08:03:00,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 1555988480. Throughput: 0: 45019.9. Samples: 1557124700. 
Policy #0 lag: (min: 3.0, avg: 45.9, max: 89.0) [2024-03-21 08:03:00,522][03784] Avg episode reward: [(0, '0.931')] [2024-03-21 08:03:00,533][04017] Updated weights for policy 0, policy_version 47485 (0.0013) [2024-03-21 08:03:05,521][03784] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 45986.3). Total num frames: 1556152320. Throughput: 0: 44251.2. Samples: 1557376700. Policy #0 lag: (min: 3.0, avg: 45.9, max: 89.0) [2024-03-21 08:03:05,522][03784] Avg episode reward: [(0, '1.402')] [2024-03-21 08:03:09,108][04017] Updated weights for policy 0, policy_version 47495 (0.0015) [2024-03-21 08:03:10,521][03784] Fps is (10 sec: 36044.6, 60 sec: 44782.9, 300 sec: 46097.4). Total num frames: 1556348928. Throughput: 0: 44862.2. Samples: 1557653900. Policy #0 lag: (min: 3.0, avg: 45.9, max: 89.0) [2024-03-21 08:03:10,522][03784] Avg episode reward: [(0, '1.372')] [2024-03-21 08:03:15,521][03784] Fps is (10 sec: 26214.4, 60 sec: 42052.2, 300 sec: 45430.9). Total num frames: 1556414464. Throughput: 0: 45544.5. Samples: 1557813700. Policy #0 lag: (min: 0.0, avg: 46.5, max: 109.0) [2024-03-21 08:03:15,522][03784] Avg episode reward: [(0, '0.707')] [2024-03-21 08:03:20,223][04017] Updated weights for policy 0, policy_version 47505 (0.0017) [2024-03-21 08:03:20,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1556643840. Throughput: 0: 45700.0. Samples: 1558087800. Policy #0 lag: (min: 0.0, avg: 46.5, max: 109.0) [2024-03-21 08:03:20,522][03784] Avg episode reward: [(0, '1.201')] [2024-03-21 08:03:22,719][03995] Signal inference workers to stop experience collection... (31350 times) [2024-03-21 08:03:22,788][03995] Signal inference workers to resume experience collection... 
(31350 times) [2024-03-21 08:03:22,794][04017] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-03-21 08:03:22,838][04017] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-03-21 08:03:25,521][03784] Fps is (10 sec: 52428.9, 60 sec: 40960.0, 300 sec: 45653.1). Total num frames: 1556938752. Throughput: 0: 45031.1. Samples: 1558321800. Policy #0 lag: (min: 0.0, avg: 46.5, max: 109.0) [2024-03-21 08:03:25,522][03784] Avg episode reward: [(0, '0.592')] [2024-03-21 08:03:26,595][04017] Updated weights for policy 0, policy_version 47515 (0.0016) [2024-03-21 08:03:30,521][03784] Fps is (10 sec: 58982.6, 60 sec: 43690.7, 300 sec: 45764.5). Total num frames: 1557233664. Throughput: 0: 45311.3. Samples: 1558455300. Policy #0 lag: (min: 0.0, avg: 46.5, max: 109.0) [2024-03-21 08:03:30,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 08:03:31,656][04017] Updated weights for policy 0, policy_version 47525 (0.0011) [2024-03-21 08:03:35,521][03784] Fps is (10 sec: 49152.0, 60 sec: 42598.5, 300 sec: 45430.9). Total num frames: 1557430272. Throughput: 0: 44746.7. Samples: 1558708500. Policy #0 lag: (min: 0.0, avg: 46.5, max: 109.0) [2024-03-21 08:03:35,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 08:03:38,464][04017] Updated weights for policy 0, policy_version 47535 (0.0016) [2024-03-21 08:03:40,521][03784] Fps is (10 sec: 49152.3, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1557725184. Throughput: 0: 44535.6. Samples: 1558982900. Policy #0 lag: (min: 0.0, avg: 46.5, max: 109.0) [2024-03-21 08:03:40,522][03784] Avg episode reward: [(0, '0.704')] [2024-03-21 08:03:44,281][04017] Updated weights for policy 0, policy_version 47545 (0.0015) [2024-03-21 08:03:45,521][03784] Fps is (10 sec: 55705.6, 60 sec: 43144.7, 300 sec: 45430.9). Total num frames: 1557987328. Throughput: 0: 44324.5. Samples: 1559119300. 
Policy #0 lag: (min: 0.0, avg: 46.5, max: 109.0) [2024-03-21 08:03:45,522][03784] Avg episode reward: [(0, '1.545')] [2024-03-21 08:03:49,094][04017] Updated weights for policy 0, policy_version 47555 (0.0017) [2024-03-21 08:03:50,521][03784] Fps is (10 sec: 62258.4, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1558347776. Throughput: 0: 44531.0. Samples: 1559380600. Policy #0 lag: (min: 0.0, avg: 46.7, max: 115.0) [2024-03-21 08:03:50,522][03784] Avg episode reward: [(0, '1.419')] [2024-03-21 08:03:55,521][03784] Fps is (10 sec: 62258.9, 60 sec: 47513.6, 300 sec: 46097.3). Total num frames: 1558609920. Throughput: 0: 44348.9. Samples: 1559649600. Policy #0 lag: (min: 0.0, avg: 46.7, max: 115.0) [2024-03-21 08:03:55,522][03784] Avg episode reward: [(0, '1.482')] [2024-03-21 08:04:00,390][04017] Updated weights for policy 0, policy_version 47566 (0.0010) [2024-03-21 08:04:00,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.7, 300 sec: 45541.9). Total num frames: 1558642688. Throughput: 0: 44239.9. Samples: 1559804500. Policy #0 lag: (min: 0.0, avg: 46.7, max: 115.0) [2024-03-21 08:04:00,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 08:04:00,687][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047567_1558675456.pth... [2024-03-21 08:04:00,794][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047238_1547894784.pth [2024-03-21 08:04:05,521][03784] Fps is (10 sec: 13107.2, 60 sec: 43144.5, 300 sec: 45208.7). Total num frames: 1558740992. Throughput: 0: 44302.2. Samples: 1560081400. Policy #0 lag: (min: 0.0, avg: 46.7, max: 115.0) [2024-03-21 08:04:05,522][03784] Avg episode reward: [(0, '1.060')] [2024-03-21 08:04:10,521][03784] Fps is (10 sec: 22937.8, 60 sec: 42052.3, 300 sec: 44986.6). Total num frames: 1558872064. Throughput: 0: 45293.3. Samples: 1560360000. 
Policy #0 lag: (min: 0.0, avg: 46.7, max: 115.0) [2024-03-21 08:04:10,522][03784] Avg episode reward: [(0, '0.862')] [2024-03-21 08:04:13,041][04017] Updated weights for policy 0, policy_version 47576 (0.0015) [2024-03-21 08:04:15,521][03784] Fps is (10 sec: 36045.4, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1559101440. Throughput: 0: 45735.7. Samples: 1560513400. Policy #0 lag: (min: 0.0, avg: 46.7, max: 115.0) [2024-03-21 08:04:15,522][03784] Avg episode reward: [(0, '0.588')] [2024-03-21 08:04:16,543][03995] Signal inference workers to stop experience collection... (31400 times) [2024-03-21 08:04:16,613][04017] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-03-21 08:04:16,621][03995] Signal inference workers to resume experience collection... (31400 times) [2024-03-21 08:04:16,659][04017] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-03-21 08:04:17,281][04017] Updated weights for policy 0, policy_version 47586 (0.0015) [2024-03-21 08:04:20,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45875.3, 300 sec: 45208.7). Total num frames: 1559396352. Throughput: 0: 45817.8. Samples: 1560770300. Policy #0 lag: (min: 2.0, avg: 29.2, max: 72.0) [2024-03-21 08:04:20,522][03784] Avg episode reward: [(0, '0.765')] [2024-03-21 08:04:23,180][04017] Updated weights for policy 0, policy_version 47596 (0.0011) [2024-03-21 08:04:25,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1559691264. Throughput: 0: 45951.1. Samples: 1561050700. Policy #0 lag: (min: 2.0, avg: 29.2, max: 72.0) [2024-03-21 08:04:25,522][03784] Avg episode reward: [(0, '1.047')] [2024-03-21 08:04:28,119][04017] Updated weights for policy 0, policy_version 47606 (0.0013) [2024-03-21 08:04:30,521][03784] Fps is (10 sec: 68812.8, 60 sec: 47513.6, 300 sec: 45986.3). Total num frames: 1560084480. Throughput: 0: 45775.6. Samples: 1561179200. 
Policy #0 lag: (min: 2.0, avg: 29.2, max: 72.0) [2024-03-21 08:04:30,522][03784] Avg episode reward: [(0, '0.690')] [2024-03-21 08:04:34,186][04017] Updated weights for policy 0, policy_version 47616 (0.0015) [2024-03-21 08:04:35,521][03784] Fps is (10 sec: 68812.8, 60 sec: 49152.0, 300 sec: 45875.2). Total num frames: 1560379392. Throughput: 0: 46097.9. Samples: 1561455000. Policy #0 lag: (min: 2.0, avg: 29.2, max: 72.0) [2024-03-21 08:04:35,522][03784] Avg episode reward: [(0, '1.332')] [2024-03-21 08:04:40,521][03784] Fps is (10 sec: 49151.8, 60 sec: 47513.5, 300 sec: 45986.3). Total num frames: 1560576000. Throughput: 0: 46493.3. Samples: 1561741800. Policy #0 lag: (min: 2.0, avg: 29.2, max: 72.0) [2024-03-21 08:04:40,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 08:04:41,842][04017] Updated weights for policy 0, policy_version 47626 (0.0018) [2024-03-21 08:04:45,521][03784] Fps is (10 sec: 32767.9, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 1560707072. Throughput: 0: 46042.3. Samples: 1561876400. Policy #0 lag: (min: 2.0, avg: 29.2, max: 72.0) [2024-03-21 08:04:45,522][03784] Avg episode reward: [(0, '0.950')] [2024-03-21 08:04:50,521][03784] Fps is (10 sec: 22937.5, 60 sec: 40960.0, 300 sec: 45097.7). Total num frames: 1560805376. Throughput: 0: 46513.3. Samples: 1562174500. Policy #0 lag: (min: 0.0, avg: 39.6, max: 92.0) [2024-03-21 08:04:50,522][03784] Avg episode reward: [(0, '1.271')] [2024-03-21 08:04:53,362][04017] Updated weights for policy 0, policy_version 47636 (0.0009) [2024-03-21 08:04:55,521][03784] Fps is (10 sec: 32767.6, 60 sec: 40413.8, 300 sec: 44653.3). Total num frames: 1561034752. Throughput: 0: 46735.4. Samples: 1562463100. Policy #0 lag: (min: 0.0, avg: 39.6, max: 92.0) [2024-03-21 08:04:55,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 08:05:00,521][03784] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 44653.3). Total num frames: 1561198592. Throughput: 0: 46604.3. Samples: 1562610600. 
Policy #0 lag: (min: 0.0, avg: 39.6, max: 92.0) [2024-03-21 08:05:00,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 08:05:01,786][04017] Updated weights for policy 0, policy_version 47646 (0.0018) [2024-03-21 08:05:05,304][04017] Updated weights for policy 0, policy_version 47656 (0.0016) [2024-03-21 08:05:05,521][03784] Fps is (10 sec: 55705.6, 60 sec: 47513.5, 300 sec: 44764.4). Total num frames: 1561591808. Throughput: 0: 46942.0. Samples: 1562882700. Policy #0 lag: (min: 0.0, avg: 39.6, max: 92.0) [2024-03-21 08:05:05,522][03784] Avg episode reward: [(0, '0.672')] [2024-03-21 08:05:06,292][03995] Signal inference workers to stop experience collection... (31450 times) [2024-03-21 08:05:06,370][03995] Signal inference workers to resume experience collection... (31450 times) [2024-03-21 08:05:06,375][04017] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-03-21 08:05:06,418][04017] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-03-21 08:05:10,521][03784] Fps is (10 sec: 65535.4, 60 sec: 49698.0, 300 sec: 45208.7). Total num frames: 1561853952. Throughput: 0: 46637.6. Samples: 1563149400. Policy #0 lag: (min: 0.0, avg: 39.6, max: 92.0) [2024-03-21 08:05:10,522][03784] Avg episode reward: [(0, '1.228')] [2024-03-21 08:05:12,951][04017] Updated weights for policy 0, policy_version 47666 (0.0030) [2024-03-21 08:05:15,521][03784] Fps is (10 sec: 45875.5, 60 sec: 49151.8, 300 sec: 45097.6). Total num frames: 1562050560. Throughput: 0: 46799.9. Samples: 1563285200. Policy #0 lag: (min: 0.0, avg: 39.6, max: 92.0) [2024-03-21 08:05:15,522][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 08:05:17,922][04017] Updated weights for policy 0, policy_version 47676 (0.0011) [2024-03-21 08:05:20,521][03784] Fps is (10 sec: 58983.0, 60 sec: 50790.4, 300 sec: 46097.3). Total num frames: 1562443776. Throughput: 0: 46262.2. Samples: 1563536800. 
Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 08:05:20,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 08:05:22,795][04017] Updated weights for policy 0, policy_version 47686 (0.0011) [2024-03-21 08:05:25,521][03784] Fps is (10 sec: 62259.6, 60 sec: 49698.1, 300 sec: 46097.3). Total num frames: 1562673152. Throughput: 0: 46411.1. Samples: 1563830300. Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 08:05:25,522][03784] Avg episode reward: [(0, '0.846')] [2024-03-21 08:05:30,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 1562804224. Throughput: 0: 46664.5. Samples: 1563976300. Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 08:05:30,522][03784] Avg episode reward: [(0, '0.846')] [2024-03-21 08:05:31,258][04017] Updated weights for policy 0, policy_version 47696 (0.0018) [2024-03-21 08:05:35,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1563033600. Throughput: 0: 46269.0. Samples: 1564256600. Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 08:05:35,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 08:05:40,196][04017] Updated weights for policy 0, policy_version 47706 (0.0023) [2024-03-21 08:05:40,521][03784] Fps is (10 sec: 42597.4, 60 sec: 44236.7, 300 sec: 45875.2). Total num frames: 1563230208. Throughput: 0: 46506.6. Samples: 1564555900. Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 08:05:40,522][03784] Avg episode reward: [(0, '1.319')] [2024-03-21 08:05:45,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 1563426816. Throughput: 0: 46813.4. Samples: 1564717200. 
Policy #0 lag: (min: 0.0, avg: 33.6, max: 70.0) [2024-03-21 08:05:45,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 08:05:48,828][04017] Updated weights for policy 0, policy_version 47716 (0.0010) [2024-03-21 08:05:50,521][03784] Fps is (10 sec: 42599.2, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 1563656192. Throughput: 0: 46784.6. Samples: 1564988000. Policy #0 lag: (min: 0.0, avg: 47.0, max: 94.0) [2024-03-21 08:05:50,522][03784] Avg episode reward: [(0, '1.229')] [2024-03-21 08:05:55,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1563787264. Throughput: 0: 46864.6. Samples: 1565258300. Policy #0 lag: (min: 0.0, avg: 47.0, max: 94.0) [2024-03-21 08:05:55,522][03784] Avg episode reward: [(0, '1.592')] [2024-03-21 08:05:57,315][04017] Updated weights for policy 0, policy_version 47726 (0.0019) [2024-03-21 08:06:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 46967.4, 300 sec: 44542.3). Total num frames: 1564016640. Throughput: 0: 46773.3. Samples: 1565390000. Policy #0 lag: (min: 0.0, avg: 47.0, max: 94.0) [2024-03-21 08:06:00,522][03784] Avg episode reward: [(0, '0.827')] [2024-03-21 08:06:00,803][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047731_1564049408.pth... [2024-03-21 08:06:00,871][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047401_1553235968.pth [2024-03-21 08:06:01,863][03995] Signal inference workers to stop experience collection... (31500 times) [2024-03-21 08:06:01,931][04017] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-03-21 08:06:02,143][03995] Signal inference workers to resume experience collection... 
(31500 times) [2024-03-21 08:06:02,144][04017] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-03-21 08:06:02,453][04017] Updated weights for policy 0, policy_version 47736 (0.0012) [2024-03-21 08:06:05,521][03784] Fps is (10 sec: 65535.7, 60 sec: 47513.7, 300 sec: 45653.1). Total num frames: 1564442624. Throughput: 0: 46515.5. Samples: 1565630000. Policy #0 lag: (min: 0.0, avg: 47.0, max: 94.0) [2024-03-21 08:06:05,522][03784] Avg episode reward: [(0, '1.207')] [2024-03-21 08:06:06,506][04017] Updated weights for policy 0, policy_version 47746 (0.0016) [2024-03-21 08:06:10,521][03784] Fps is (10 sec: 78643.3, 60 sec: 49152.0, 300 sec: 45986.3). Total num frames: 1564803072. Throughput: 0: 45677.7. Samples: 1565885800. Policy #0 lag: (min: 0.0, avg: 47.0, max: 94.0) [2024-03-21 08:06:10,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 08:06:11,820][04017] Updated weights for policy 0, policy_version 47756 (0.0027) [2024-03-21 08:06:15,521][03784] Fps is (10 sec: 65536.4, 60 sec: 50790.5, 300 sec: 46541.7). Total num frames: 1565097984. Throughput: 0: 45766.6. Samples: 1566035800. Policy #0 lag: (min: 0.0, avg: 47.0, max: 94.0) [2024-03-21 08:06:15,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 08:06:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 1565163520. Throughput: 0: 45864.4. Samples: 1566320500. Policy #0 lag: (min: 0.0, avg: 39.0, max: 71.0) [2024-03-21 08:06:20,522][03784] Avg episode reward: [(0, '0.570')] [2024-03-21 08:06:22,083][04017] Updated weights for policy 0, policy_version 47766 (0.0016) [2024-03-21 08:06:25,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1565360128. Throughput: 0: 45411.3. Samples: 1566599400. 
Policy #0 lag: (min: 0.0, avg: 39.0, max: 71.0) [2024-03-21 08:06:25,522][03784] Avg episode reward: [(0, '1.489')] [2024-03-21 08:06:29,775][04017] Updated weights for policy 0, policy_version 47776 (0.0011) [2024-03-21 08:06:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.2, 300 sec: 46208.4). Total num frames: 1565589504. Throughput: 0: 44904.4. Samples: 1566737900. Policy #0 lag: (min: 0.0, avg: 39.0, max: 71.0) [2024-03-21 08:06:30,522][03784] Avg episode reward: [(0, '1.576')] [2024-03-21 08:06:35,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1565818880. Throughput: 0: 44615.5. Samples: 1566995700. Policy #0 lag: (min: 0.0, avg: 39.0, max: 71.0) [2024-03-21 08:06:35,522][03784] Avg episode reward: [(0, '0.778')] [2024-03-21 08:06:35,719][04017] Updated weights for policy 0, policy_version 47786 (0.0013) [2024-03-21 08:06:40,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 1566048256. Throughput: 0: 44582.2. Samples: 1567264500. Policy #0 lag: (min: 0.0, avg: 39.0, max: 71.0) [2024-03-21 08:06:40,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 08:06:45,521][03784] Fps is (10 sec: 32768.3, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1566146560. Throughput: 0: 44475.7. Samples: 1567391400. Policy #0 lag: (min: 0.0, avg: 39.0, max: 71.0) [2024-03-21 08:06:45,522][03784] Avg episode reward: [(0, '1.450')] [2024-03-21 08:06:45,818][04017] Updated weights for policy 0, policy_version 47796 (0.0011) [2024-03-21 08:06:50,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1566310400. Throughput: 0: 45148.9. Samples: 1567661700. 
Policy #0 lag: (min: 0.0, avg: 39.0, max: 71.0) [2024-03-21 08:06:50,522][03784] Avg episode reward: [(0, '1.349')] [2024-03-21 08:06:54,548][04017] Updated weights for policy 0, policy_version 47806 (0.0016) [2024-03-21 08:06:55,521][03784] Fps is (10 sec: 42598.7, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1566572544. Throughput: 0: 45522.4. Samples: 1567934300. Policy #0 lag: (min: 0.0, avg: 30.0, max: 70.0) [2024-03-21 08:06:55,522][03784] Avg episode reward: [(0, '1.613')] [2024-03-21 08:06:56,108][03995] Signal inference workers to stop experience collection... (31550 times) [2024-03-21 08:06:56,109][03995] Signal inference workers to resume experience collection... (31550 times) [2024-03-21 08:06:56,167][04017] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-03-21 08:06:56,167][04017] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-03-21 08:07:00,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1566801920. Throughput: 0: 45119.9. Samples: 1568066200. Policy #0 lag: (min: 0.0, avg: 30.0, max: 70.0) [2024-03-21 08:07:00,522][03784] Avg episode reward: [(0, '1.436')] [2024-03-21 08:07:00,885][04017] Updated weights for policy 0, policy_version 47816 (0.0017) [2024-03-21 08:07:05,521][03784] Fps is (10 sec: 32767.8, 60 sec: 40960.1, 300 sec: 44875.5). Total num frames: 1566900224. Throughput: 0: 45104.5. Samples: 1568350200. Policy #0 lag: (min: 0.0, avg: 30.0, max: 70.0) [2024-03-21 08:07:05,522][03784] Avg episode reward: [(0, '1.344')] [2024-03-21 08:07:08,606][04017] Updated weights for policy 0, policy_version 47826 (0.0014) [2024-03-21 08:07:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 42052.3, 300 sec: 45541.9). Total num frames: 1567326208. Throughput: 0: 44757.7. Samples: 1568613500. 
Policy #0 lag: (min: 0.0, avg: 30.0, max: 70.0) [2024-03-21 08:07:10,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 08:07:13,332][04017] Updated weights for policy 0, policy_version 47836 (0.0025) [2024-03-21 08:07:15,521][03784] Fps is (10 sec: 78642.6, 60 sec: 43144.5, 300 sec: 46208.4). Total num frames: 1567686656. Throughput: 0: 44853.4. Samples: 1568756300. Policy #0 lag: (min: 0.0, avg: 30.0, max: 70.0) [2024-03-21 08:07:15,522][03784] Avg episode reward: [(0, '1.252')] [2024-03-21 08:07:18,842][04017] Updated weights for policy 0, policy_version 47846 (0.0019) [2024-03-21 08:07:20,521][03784] Fps is (10 sec: 62259.2, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1567948800. Throughput: 0: 44288.9. Samples: 1568988700. Policy #0 lag: (min: 0.0, avg: 30.0, max: 70.0) [2024-03-21 08:07:20,522][03784] Avg episode reward: [(0, '0.981')] [2024-03-21 08:07:25,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 1568112640. Throughput: 0: 44086.8. Samples: 1569248400. Policy #0 lag: (min: 0.0, avg: 51.5, max: 121.0) [2024-03-21 08:07:25,522][03784] Avg episode reward: [(0, '1.690')] [2024-03-21 08:07:25,538][04017] Updated weights for policy 0, policy_version 47856 (0.0012) [2024-03-21 08:07:30,521][03784] Fps is (10 sec: 49152.5, 60 sec: 47513.7, 300 sec: 45986.3). Total num frames: 1568440320. Throughput: 0: 44111.1. Samples: 1569376400. Policy #0 lag: (min: 0.0, avg: 51.5, max: 121.0) [2024-03-21 08:07:30,522][03784] Avg episode reward: [(0, '0.907')] [2024-03-21 08:07:34,936][04017] Updated weights for policy 0, policy_version 47866 (0.0011) [2024-03-21 08:07:35,521][03784] Fps is (10 sec: 39321.6, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 1568505856. Throughput: 0: 44515.7. Samples: 1569664900. 
Policy #0 lag: (min: 0.0, avg: 51.5, max: 121.0) [2024-03-21 08:07:35,522][03784] Avg episode reward: [(0, '1.342')] [2024-03-21 08:07:40,521][03784] Fps is (10 sec: 19660.8, 60 sec: 43144.6, 300 sec: 44875.5). Total num frames: 1568636928. Throughput: 0: 44448.8. Samples: 1569934500. Policy #0 lag: (min: 0.0, avg: 51.5, max: 121.0) [2024-03-21 08:07:40,522][03784] Avg episode reward: [(0, '0.778')] [2024-03-21 08:07:44,830][04017] Updated weights for policy 0, policy_version 47876 (0.0011) [2024-03-21 08:07:45,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1568800768. Throughput: 0: 44533.4. Samples: 1570070200. Policy #0 lag: (min: 0.0, avg: 51.5, max: 121.0) [2024-03-21 08:07:45,522][03784] Avg episode reward: [(0, '0.878')] [2024-03-21 08:07:49,971][03995] Signal inference workers to stop experience collection... (31600 times) [2024-03-21 08:07:50,040][04017] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-03-21 08:07:50,209][03995] Signal inference workers to resume experience collection... (31600 times) [2024-03-21 08:07:50,209][04017] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-03-21 08:07:50,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1569095680. Throughput: 0: 43891.1. Samples: 1570325300. Policy #0 lag: (min: 0.0, avg: 51.5, max: 121.0) [2024-03-21 08:07:50,522][03784] Avg episode reward: [(0, '1.408')] [2024-03-21 08:07:50,832][04017] Updated weights for policy 0, policy_version 47886 (0.0013) [2024-03-21 08:07:55,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45328.9, 300 sec: 45097.6). Total num frames: 1569292288. Throughput: 0: 43855.5. Samples: 1570587000. 
Policy #0 lag: (min: 0.0, avg: 42.8, max: 115.0) [2024-03-21 08:07:55,522][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 08:07:59,203][04017] Updated weights for policy 0, policy_version 47896 (0.0013) [2024-03-21 08:08:00,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1569554432. Throughput: 0: 43953.4. Samples: 1570734200. Policy #0 lag: (min: 0.0, avg: 42.8, max: 115.0) [2024-03-21 08:08:00,522][03784] Avg episode reward: [(0, '1.201')] [2024-03-21 08:08:00,928][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047901_1569619968.pth... [2024-03-21 08:08:01,051][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047567_1558675456.pth [2024-03-21 08:08:05,521][03784] Fps is (10 sec: 42598.7, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 1569718272. Throughput: 0: 44308.9. Samples: 1570982600. Policy #0 lag: (min: 0.0, avg: 42.8, max: 115.0) [2024-03-21 08:08:05,522][03784] Avg episode reward: [(0, '1.123')] [2024-03-21 08:08:06,449][04017] Updated weights for policy 0, policy_version 47906 (0.0015) [2024-03-21 08:08:10,521][03784] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1570078720. Throughput: 0: 44337.7. Samples: 1571243600. Policy #0 lag: (min: 0.0, avg: 42.8, max: 115.0) [2024-03-21 08:08:10,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 08:08:10,647][04017] Updated weights for policy 0, policy_version 47916 (0.0015) [2024-03-21 08:08:15,521][03784] Fps is (10 sec: 62259.7, 60 sec: 44236.9, 300 sec: 46430.6). Total num frames: 1570340864. Throughput: 0: 44655.6. Samples: 1571385900. Policy #0 lag: (min: 0.0, avg: 42.8, max: 115.0) [2024-03-21 08:08:15,522][03784] Avg episode reward: [(0, '0.790')] [2024-03-21 08:08:18,318][04017] Updated weights for policy 0, policy_version 47926 (0.0010) [2024-03-21 08:08:20,521][03784] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 45986.3). 
Total num frames: 1570504704. Throughput: 0: 44362.2. Samples: 1571661200. Policy #0 lag: (min: 0.0, avg: 42.8, max: 115.0) [2024-03-21 08:08:20,522][03784] Avg episode reward: [(0, '1.457')] [2024-03-21 08:08:25,521][03784] Fps is (10 sec: 26214.3, 60 sec: 41506.1, 300 sec: 45319.8). Total num frames: 1570603008. Throughput: 0: 44862.2. Samples: 1571953300. Policy #0 lag: (min: 0.0, avg: 42.8, max: 115.0) [2024-03-21 08:08:25,522][03784] Avg episode reward: [(0, '1.095')] [2024-03-21 08:08:30,086][04017] Updated weights for policy 0, policy_version 47936 (0.0015) [2024-03-21 08:08:30,521][03784] Fps is (10 sec: 29491.3, 60 sec: 39321.6, 300 sec: 45319.8). Total num frames: 1570799616. Throughput: 0: 44915.6. Samples: 1572091400. Policy #0 lag: (min: 0.0, avg: 41.7, max: 109.0) [2024-03-21 08:08:30,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 08:08:35,521][03784] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 45208.7). Total num frames: 1571061760. Throughput: 0: 45342.2. Samples: 1572365700. Policy #0 lag: (min: 0.0, avg: 41.7, max: 109.0) [2024-03-21 08:08:35,522][03784] Avg episode reward: [(0, '0.885')] [2024-03-21 08:08:36,355][04017] Updated weights for policy 0, policy_version 47946 (0.0016) [2024-03-21 08:08:40,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 44875.5). Total num frames: 1571225600. Throughput: 0: 45797.9. Samples: 1572647900. Policy #0 lag: (min: 0.0, avg: 41.7, max: 109.0) [2024-03-21 08:08:40,522][03784] Avg episode reward: [(0, '0.583')] [2024-03-21 08:08:43,182][04017] Updated weights for policy 0, policy_version 47956 (0.0010) [2024-03-21 08:08:44,580][03995] Signal inference workers to stop experience collection... (31650 times) [2024-03-21 08:08:44,671][04017] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-03-21 08:08:44,830][03995] Signal inference workers to resume experience collection... 
(31650 times) [2024-03-21 08:08:44,831][04017] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-03-21 08:08:45,521][03784] Fps is (10 sec: 49151.4, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 1571553280. Throughput: 0: 45331.0. Samples: 1572774100. Policy #0 lag: (min: 0.0, avg: 41.7, max: 109.0) [2024-03-21 08:08:45,522][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 08:08:50,144][04017] Updated weights for policy 0, policy_version 47966 (0.0018) [2024-03-21 08:08:50,521][03784] Fps is (10 sec: 55705.7, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 1571782656. Throughput: 0: 46142.3. Samples: 1573059000. Policy #0 lag: (min: 0.0, avg: 41.7, max: 109.0) [2024-03-21 08:08:50,522][03784] Avg episode reward: [(0, '1.643')] [2024-03-21 08:08:55,521][03784] Fps is (10 sec: 49152.4, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 1572044800. Throughput: 0: 46364.5. Samples: 1573330000. Policy #0 lag: (min: 0.0, avg: 41.7, max: 109.0) [2024-03-21 08:08:55,522][03784] Avg episode reward: [(0, '1.643')] [2024-03-21 08:08:55,758][04017] Updated weights for policy 0, policy_version 47976 (0.0016) [2024-03-21 08:09:00,162][04017] Updated weights for policy 0, policy_version 47986 (0.0015) [2024-03-21 08:09:00,521][03784] Fps is (10 sec: 62259.2, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 1572405248. Throughput: 0: 46415.5. Samples: 1573474600. Policy #0 lag: (min: 0.0, avg: 38.5, max: 82.0) [2024-03-21 08:09:00,522][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 08:09:05,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48605.8, 300 sec: 46652.7). Total num frames: 1572634624. Throughput: 0: 46519.9. Samples: 1573754600. 
Policy #0 lag: (min: 0.0, avg: 38.5, max: 82.0) [2024-03-21 08:09:05,522][03784] Avg episode reward: [(0, '1.009')] [2024-03-21 08:09:07,799][04017] Updated weights for policy 0, policy_version 47996 (0.0024) [2024-03-21 08:09:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 1572896768. Throughput: 0: 46373.3. Samples: 1574040100. Policy #0 lag: (min: 0.0, avg: 38.5, max: 82.0) [2024-03-21 08:09:10,522][03784] Avg episode reward: [(0, '1.009')] [2024-03-21 08:09:15,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44782.9, 300 sec: 46208.4). Total num frames: 1573027840. Throughput: 0: 46766.7. Samples: 1574195900. Policy #0 lag: (min: 0.0, avg: 38.5, max: 82.0) [2024-03-21 08:09:15,522][03784] Avg episode reward: [(0, '0.585')] [2024-03-21 08:09:15,619][04017] Updated weights for policy 0, policy_version 48006 (0.0023) [2024-03-21 08:09:20,521][03784] Fps is (10 sec: 26214.3, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 1573158912. Throughput: 0: 47279.9. Samples: 1574493300. Policy #0 lag: (min: 0.0, avg: 38.5, max: 82.0) [2024-03-21 08:09:20,522][03784] Avg episode reward: [(0, '1.655')] [2024-03-21 08:09:24,617][04017] Updated weights for policy 0, policy_version 48016 (0.0022) [2024-03-21 08:09:25,521][03784] Fps is (10 sec: 39321.4, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 1573421056. Throughput: 0: 47482.2. Samples: 1574784600. Policy #0 lag: (min: 0.0, avg: 38.5, max: 82.0) [2024-03-21 08:09:25,522][03784] Avg episode reward: [(0, '1.655')] [2024-03-21 08:09:30,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1573584896. Throughput: 0: 47780.1. Samples: 1574924200. 
Policy #0 lag: (min: 0.0, avg: 28.3, max: 69.0) [2024-03-21 08:09:30,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 08:09:33,863][04017] Updated weights for policy 0, policy_version 48026 (0.0021) [2024-03-21 08:09:35,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 1573748736. Throughput: 0: 47495.6. Samples: 1575196300. Policy #0 lag: (min: 0.0, avg: 28.3, max: 69.0) [2024-03-21 08:09:35,522][03784] Avg episode reward: [(0, '0.754')] [2024-03-21 08:09:37,768][03995] Signal inference workers to stop experience collection... (31700 times) [2024-03-21 08:09:37,826][03995] Signal inference workers to resume experience collection... (31700 times) [2024-03-21 08:09:37,857][04017] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-03-21 08:09:37,900][04017] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-03-21 08:09:39,832][04017] Updated weights for policy 0, policy_version 48036 (0.0012) [2024-03-21 08:09:40,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48059.7, 300 sec: 45430.9). Total num frames: 1574109184. Throughput: 0: 47157.7. Samples: 1575452100. Policy #0 lag: (min: 0.0, avg: 28.3, max: 69.0) [2024-03-21 08:09:40,522][03784] Avg episode reward: [(0, '1.516')] [2024-03-21 08:09:45,521][03784] Fps is (10 sec: 55705.4, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 1574305792. Throughput: 0: 47153.3. Samples: 1575596500. Policy #0 lag: (min: 0.0, avg: 28.3, max: 69.0) [2024-03-21 08:09:45,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 08:09:45,932][04017] Updated weights for policy 0, policy_version 48046 (0.0014) [2024-03-21 08:09:50,521][03784] Fps is (10 sec: 55706.1, 60 sec: 48059.7, 300 sec: 46208.5). Total num frames: 1574666240. Throughput: 0: 47197.9. Samples: 1575878500. 
Policy #0 lag: (min: 0.0, avg: 28.3, max: 69.0) [2024-03-21 08:09:50,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 08:09:50,910][04017] Updated weights for policy 0, policy_version 48056 (0.0013) [2024-03-21 08:09:55,521][03784] Fps is (10 sec: 62258.3, 60 sec: 48059.6, 300 sec: 46541.6). Total num frames: 1574928384. Throughput: 0: 47130.9. Samples: 1576161000. Policy #0 lag: (min: 0.0, avg: 28.3, max: 69.0) [2024-03-21 08:09:55,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 08:09:56,740][04017] Updated weights for policy 0, policy_version 48066 (0.0011) [2024-03-21 08:10:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 45875.2, 300 sec: 45986.3). Total num frames: 1575157760. Throughput: 0: 46613.3. Samples: 1576293500. Policy #0 lag: (min: 0.0, avg: 48.6, max: 98.0) [2024-03-21 08:10:00,522][03784] Avg episode reward: [(0, '1.179')] [2024-03-21 08:10:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048070_1575157760.pth... [2024-03-21 08:10:00,701][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047731_1564049408.pth [2024-03-21 08:10:05,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1575321600. Throughput: 0: 46397.7. Samples: 1576581200. Policy #0 lag: (min: 0.0, avg: 48.6, max: 98.0) [2024-03-21 08:10:05,522][03784] Avg episode reward: [(0, '1.287')] [2024-03-21 08:10:05,812][04017] Updated weights for policy 0, policy_version 48076 (0.0012) [2024-03-21 08:10:10,521][03784] Fps is (10 sec: 36044.9, 60 sec: 43690.7, 300 sec: 45653.1). Total num frames: 1575518208. Throughput: 0: 46237.8. Samples: 1576865300. Policy #0 lag: (min: 0.0, avg: 48.6, max: 98.0) [2024-03-21 08:10:10,522][03784] Avg episode reward: [(0, '0.842')] [2024-03-21 08:10:15,433][04017] Updated weights for policy 0, policy_version 48086 (0.0011) [2024-03-21 08:10:15,521][03784] Fps is (10 sec: 36044.4, 60 sec: 44236.6, 300 sec: 44875.5). 
Total num frames: 1575682048. Throughput: 0: 46195.3. Samples: 1577003000. Policy #0 lag: (min: 0.0, avg: 48.6, max: 98.0) [2024-03-21 08:10:15,523][03784] Avg episode reward: [(0, '1.381')] [2024-03-21 08:10:19,729][04017] Updated weights for policy 0, policy_version 48096 (0.0012) [2024-03-21 08:10:20,521][03784] Fps is (10 sec: 49152.0, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1576009728. Throughput: 0: 45904.4. Samples: 1577262000. Policy #0 lag: (min: 0.0, avg: 48.6, max: 98.0) [2024-03-21 08:10:20,522][03784] Avg episode reward: [(0, '1.219')] [2024-03-21 08:10:24,262][03995] Signal inference workers to stop experience collection... (31750 times) [2024-03-21 08:10:24,262][03995] Signal inference workers to resume experience collection... (31750 times) [2024-03-21 08:10:24,332][04017] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-03-21 08:10:24,333][04017] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-03-21 08:10:25,521][03784] Fps is (10 sec: 52430.0, 60 sec: 46421.4, 300 sec: 45430.9). Total num frames: 1576206336. Throughput: 0: 46524.5. Samples: 1577545700. Policy #0 lag: (min: 0.0, avg: 48.6, max: 98.0) [2024-03-21 08:10:25,522][03784] Avg episode reward: [(0, '0.865')] [2024-03-21 08:10:30,521][03784] Fps is (10 sec: 26214.4, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1576271872. Throughput: 0: 46340.0. Samples: 1577681800. Policy #0 lag: (min: 0.0, avg: 28.1, max: 84.0) [2024-03-21 08:10:30,522][03784] Avg episode reward: [(0, '0.504')] [2024-03-21 08:10:31,031][04017] Updated weights for policy 0, policy_version 48106 (0.0021) [2024-03-21 08:10:35,521][03784] Fps is (10 sec: 32767.9, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1576534016. Throughput: 0: 46177.7. Samples: 1577956500. 
Policy #0 lag: (min: 0.0, avg: 28.1, max: 84.0) [2024-03-21 08:10:35,522][03784] Avg episode reward: [(0, '0.989')] [2024-03-21 08:10:38,981][04017] Updated weights for policy 0, policy_version 48116 (0.0018) [2024-03-21 08:10:40,521][03784] Fps is (10 sec: 52428.5, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1576796160. Throughput: 0: 45509.0. Samples: 1578208900. Policy #0 lag: (min: 0.0, avg: 28.1, max: 84.0) [2024-03-21 08:10:40,522][03784] Avg episode reward: [(0, '1.538')] [2024-03-21 08:10:43,498][04017] Updated weights for policy 0, policy_version 48126 (0.0019) [2024-03-21 08:10:45,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46421.3, 300 sec: 45541.9). Total num frames: 1577091072. Throughput: 0: 45571.0. Samples: 1578344200. Policy #0 lag: (min: 0.0, avg: 28.1, max: 84.0) [2024-03-21 08:10:45,522][03784] Avg episode reward: [(0, '1.334')] [2024-03-21 08:10:49,757][04017] Updated weights for policy 0, policy_version 48136 (0.0016) [2024-03-21 08:10:50,521][03784] Fps is (10 sec: 55705.9, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 1577353216. Throughput: 0: 44780.1. Samples: 1578596300. Policy #0 lag: (min: 0.0, avg: 28.1, max: 84.0) [2024-03-21 08:10:50,522][03784] Avg episode reward: [(0, '0.922')] [2024-03-21 08:10:54,955][04017] Updated weights for policy 0, policy_version 48146 (0.0016) [2024-03-21 08:10:55,521][03784] Fps is (10 sec: 55706.4, 60 sec: 45329.2, 300 sec: 46208.5). Total num frames: 1577648128. Throughput: 0: 44146.7. Samples: 1578851900. Policy #0 lag: (min: 0.0, avg: 28.1, max: 84.0) [2024-03-21 08:10:55,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 08:11:00,521][03784] Fps is (10 sec: 52429.2, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 1577877504. Throughput: 0: 43940.3. Samples: 1578980300. 
Policy #0 lag: (min: 0.0, avg: 28.1, max: 84.0) [2024-03-21 08:11:00,522][03784] Avg episode reward: [(0, '0.926')] [2024-03-21 08:11:01,608][04017] Updated weights for policy 0, policy_version 48156 (0.0011) [2024-03-21 08:11:05,521][03784] Fps is (10 sec: 32767.5, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1577975808. Throughput: 0: 44444.3. Samples: 1579262000. Policy #0 lag: (min: 0.0, avg: 36.7, max: 72.0) [2024-03-21 08:11:05,523][03784] Avg episode reward: [(0, '1.271')] [2024-03-21 08:11:10,521][03784] Fps is (10 sec: 22937.5, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 1578106880. Throughput: 0: 44311.2. Samples: 1579539700. Policy #0 lag: (min: 0.0, avg: 36.7, max: 72.0) [2024-03-21 08:11:10,522][03784] Avg episode reward: [(0, '0.990')] [2024-03-21 08:11:15,485][04017] Updated weights for policy 0, policy_version 48166 (0.0019) [2024-03-21 08:11:15,521][03784] Fps is (10 sec: 32767.8, 60 sec: 43690.7, 300 sec: 44542.2). Total num frames: 1578303488. Throughput: 0: 43946.5. Samples: 1579659400. Policy #0 lag: (min: 0.0, avg: 36.7, max: 72.0) [2024-03-21 08:11:15,522][03784] Avg episode reward: [(0, '0.773')] [2024-03-21 08:11:20,521][03784] Fps is (10 sec: 32767.8, 60 sec: 40413.9, 300 sec: 44320.1). Total num frames: 1578434560. Throughput: 0: 44426.7. Samples: 1579955700. Policy #0 lag: (min: 0.0, avg: 36.7, max: 72.0) [2024-03-21 08:11:20,522][03784] Avg episode reward: [(0, '0.773')] [2024-03-21 08:11:23,052][04017] Updated weights for policy 0, policy_version 48176 (0.0017) [2024-03-21 08:11:24,085][03995] Signal inference workers to stop experience collection... (31800 times) [2024-03-21 08:11:24,158][03995] Signal inference workers to resume experience collection... 
(31800 times) [2024-03-21 08:11:24,167][04017] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-03-21 08:11:24,227][04017] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-03-21 08:11:25,521][03784] Fps is (10 sec: 45876.3, 60 sec: 42598.5, 300 sec: 44653.4). Total num frames: 1578762240. Throughput: 0: 44351.2. Samples: 1580204700. Policy #0 lag: (min: 0.0, avg: 36.7, max: 72.0) [2024-03-21 08:11:25,522][03784] Avg episode reward: [(0, '1.391')] [2024-03-21 08:11:28,154][04017] Updated weights for policy 0, policy_version 48186 (0.0011) [2024-03-21 08:11:30,521][03784] Fps is (10 sec: 62259.3, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 1579057152. Throughput: 0: 44137.9. Samples: 1580330400. Policy #0 lag: (min: 0.0, avg: 36.7, max: 72.0) [2024-03-21 08:11:30,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 08:11:35,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45329.2, 300 sec: 44764.4). Total num frames: 1579253760. Throughput: 0: 44900.1. Samples: 1580616800. Policy #0 lag: (min: 0.0, avg: 28.1, max: 66.0) [2024-03-21 08:11:35,522][03784] Avg episode reward: [(0, '1.347')] [2024-03-21 08:11:36,984][04017] Updated weights for policy 0, policy_version 48196 (0.0017) [2024-03-21 08:11:40,469][04017] Updated weights for policy 0, policy_version 48206 (0.0015) [2024-03-21 08:11:40,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1579614208. Throughput: 0: 45362.1. Samples: 1580893200. Policy #0 lag: (min: 0.0, avg: 28.1, max: 66.0) [2024-03-21 08:11:40,522][03784] Avg episode reward: [(0, '1.347')] [2024-03-21 08:11:45,521][03784] Fps is (10 sec: 62258.8, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 1579876352. Throughput: 0: 45431.0. Samples: 1581024700. 
Policy #0 lag: (min: 0.0, avg: 28.1, max: 66.0) [2024-03-21 08:11:45,522][03784] Avg episode reward: [(0, '0.688')] [2024-03-21 08:11:48,253][04017] Updated weights for policy 0, policy_version 48216 (0.0022) [2024-03-21 08:11:50,521][03784] Fps is (10 sec: 49152.0, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 1580105728. Throughput: 0: 45457.9. Samples: 1581307600. Policy #0 lag: (min: 0.0, avg: 28.1, max: 66.0) [2024-03-21 08:11:50,522][03784] Avg episode reward: [(0, '1.565')] [2024-03-21 08:11:55,521][03784] Fps is (10 sec: 29491.0, 60 sec: 42052.2, 300 sec: 45319.8). Total num frames: 1580171264. Throughput: 0: 45351.0. Samples: 1581580500. Policy #0 lag: (min: 0.0, avg: 28.1, max: 66.0) [2024-03-21 08:11:55,522][03784] Avg episode reward: [(0, '0.888')] [2024-03-21 08:11:59,038][04017] Updated weights for policy 0, policy_version 48226 (0.0012) [2024-03-21 08:12:00,521][03784] Fps is (10 sec: 22937.7, 60 sec: 40960.0, 300 sec: 45542.0). Total num frames: 1580335104. Throughput: 0: 45686.9. Samples: 1581715300. Policy #0 lag: (min: 0.0, avg: 28.1, max: 66.0) [2024-03-21 08:12:00,522][03784] Avg episode reward: [(0, '0.650')] [2024-03-21 08:12:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048229_1580367872.pth... [2024-03-21 08:12:00,655][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000047901_1569619968.pth [2024-03-21 08:12:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 44875.5). Total num frames: 1580564480. Throughput: 0: 44808.8. Samples: 1581972100. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 08:12:05,522][03784] Avg episode reward: [(0, '1.041')] [2024-03-21 08:12:05,695][04017] Updated weights for policy 0, policy_version 48236 (0.0024) [2024-03-21 08:12:10,521][03784] Fps is (10 sec: 52428.0, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 1580859392. Throughput: 0: 45046.5. Samples: 1582231800. 
Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 08:12:10,524][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 08:12:11,172][04017] Updated weights for policy 0, policy_version 48246 (0.0012) [2024-03-21 08:12:11,885][03995] Signal inference workers to stop experience collection... (31850 times) [2024-03-21 08:12:11,886][03995] Signal inference workers to resume experience collection... (31850 times) [2024-03-21 08:12:11,928][04017] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-03-21 08:12:11,929][04017] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-03-21 08:12:15,521][03784] Fps is (10 sec: 62260.1, 60 sec: 48059.9, 300 sec: 44875.5). Total num frames: 1581187072. Throughput: 0: 45091.2. Samples: 1582359500. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 08:12:15,522][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 08:12:16,487][04017] Updated weights for policy 0, policy_version 48256 (0.0013) [2024-03-21 08:12:20,521][03784] Fps is (10 sec: 39321.9, 60 sec: 46967.4, 300 sec: 44542.2). Total num frames: 1581252608. Throughput: 0: 45773.2. Samples: 1582676600. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 08:12:20,522][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 08:12:25,521][03784] Fps is (10 sec: 29490.5, 60 sec: 45328.9, 300 sec: 44209.0). Total num frames: 1581481984. Throughput: 0: 45815.4. Samples: 1582954900. Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 08:12:25,523][03784] Avg episode reward: [(0, '1.158')] [2024-03-21 08:12:26,614][04017] Updated weights for policy 0, policy_version 48266 (0.0012) [2024-03-21 08:12:30,526][03784] Fps is (10 sec: 58955.0, 60 sec: 46417.7, 300 sec: 45208.0). Total num frames: 1581842432. Throughput: 0: 45710.8. Samples: 1583081900. 
Policy #0 lag: (min: 0.0, avg: 35.8, max: 75.0) [2024-03-21 08:12:30,526][03784] Avg episode reward: [(0, '1.158')] [2024-03-21 08:12:30,954][04017] Updated weights for policy 0, policy_version 48276 (0.0020) [2024-03-21 08:12:35,521][03784] Fps is (10 sec: 58983.0, 60 sec: 46967.3, 300 sec: 45541.9). Total num frames: 1582071808. Throughput: 0: 45306.6. Samples: 1583346400. Policy #0 lag: (min: 0.0, avg: 34.1, max: 83.0) [2024-03-21 08:12:35,522][03784] Avg episode reward: [(0, '1.158')] [2024-03-21 08:12:39,481][04017] Updated weights for policy 0, policy_version 48286 (0.0011) [2024-03-21 08:12:40,521][03784] Fps is (10 sec: 45896.9, 60 sec: 44783.0, 300 sec: 45764.1). Total num frames: 1582301184. Throughput: 0: 45244.5. Samples: 1583616500. Policy #0 lag: (min: 0.0, avg: 34.1, max: 83.0) [2024-03-21 08:12:40,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 08:12:45,521][03784] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1582465024. Throughput: 0: 45071.1. Samples: 1583743500. Policy #0 lag: (min: 0.0, avg: 34.1, max: 83.0) [2024-03-21 08:12:45,522][03784] Avg episode reward: [(0, '0.715')] [2024-03-21 08:12:46,439][04017] Updated weights for policy 0, policy_version 48296 (0.0019) [2024-03-21 08:12:50,521][03784] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 45430.9). Total num frames: 1582694400. Throughput: 0: 45417.9. Samples: 1584015900. Policy #0 lag: (min: 0.0, avg: 34.1, max: 83.0) [2024-03-21 08:12:50,522][03784] Avg episode reward: [(0, '0.735')] [2024-03-21 08:12:54,651][04017] Updated weights for policy 0, policy_version 48306 (0.0011) [2024-03-21 08:12:55,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45329.2, 300 sec: 45208.7). Total num frames: 1582891008. Throughput: 0: 45351.3. Samples: 1584272600. 
Policy #0 lag: (min: 0.0, avg: 34.1, max: 83.0) [2024-03-21 08:12:55,522][03784] Avg episode reward: [(0, '1.340')] [2024-03-21 08:13:00,521][03784] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1583087616. Throughput: 0: 45704.4. Samples: 1584416200. Policy #0 lag: (min: 0.0, avg: 34.1, max: 83.0) [2024-03-21 08:13:00,522][03784] Avg episode reward: [(0, '1.655')] [2024-03-21 08:13:02,672][04017] Updated weights for policy 0, policy_version 48316 (0.0019) [2024-03-21 08:13:05,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1583382528. Throughput: 0: 44755.6. Samples: 1584690600. Policy #0 lag: (min: 0.0, avg: 34.1, max: 83.0) [2024-03-21 08:13:05,522][03784] Avg episode reward: [(0, '0.574')] [2024-03-21 08:13:08,584][03995] Signal inference workers to stop experience collection... (31900 times) [2024-03-21 08:13:08,644][04017] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-03-21 08:13:08,909][03995] Signal inference workers to resume experience collection... (31900 times) [2024-03-21 08:13:08,909][04017] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-03-21 08:13:10,413][04017] Updated weights for policy 0, policy_version 48326 (0.0013) [2024-03-21 08:13:10,521][03784] Fps is (10 sec: 45875.6, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 1583546368. Throughput: 0: 44644.7. Samples: 1584963900. Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 08:13:10,522][03784] Avg episode reward: [(0, '0.535')] [2024-03-21 08:13:15,210][04017] Updated weights for policy 0, policy_version 48336 (0.0012) [2024-03-21 08:13:15,521][03784] Fps is (10 sec: 52429.0, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1583906816. Throughput: 0: 44944.7. Samples: 1585104200. 
Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 08:13:15,522][03784] Avg episode reward: [(0, '1.326')] [2024-03-21 08:13:19,179][04017] Updated weights for policy 0, policy_version 48346 (0.0013) [2024-03-21 08:13:20,521][03784] Fps is (10 sec: 72088.8, 60 sec: 50244.3, 300 sec: 46319.5). Total num frames: 1584267264. Throughput: 0: 44306.7. Samples: 1585340200. Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 08:13:20,522][03784] Avg episode reward: [(0, '1.497')] [2024-03-21 08:13:25,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.2, 300 sec: 46208.4). Total num frames: 1584431104. Throughput: 0: 44795.6. Samples: 1585632300. Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 08:13:25,522][03784] Avg episode reward: [(0, '1.246')] [2024-03-21 08:13:27,203][04017] Updated weights for policy 0, policy_version 48356 (0.0014) [2024-03-21 08:13:30,521][03784] Fps is (10 sec: 29491.4, 60 sec: 45332.6, 300 sec: 45764.1). Total num frames: 1584562176. Throughput: 0: 44984.5. Samples: 1585767800. Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 08:13:30,522][03784] Avg episode reward: [(0, '1.428')] [2024-03-21 08:13:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1584791552. Throughput: 0: 45357.8. Samples: 1586057000. Policy #0 lag: (min: 0.0, avg: 31.7, max: 71.0) [2024-03-21 08:13:35,522][03784] Avg episode reward: [(0, '0.738')] [2024-03-21 08:13:37,397][04017] Updated weights for policy 0, policy_version 48366 (0.0019) [2024-03-21 08:13:40,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 45653.1). Total num frames: 1585020928. Throughput: 0: 45722.1. Samples: 1586330100. 
Policy #0 lag: (min: 1.0, avg: 34.7, max: 78.0) [2024-03-21 08:13:40,522][03784] Avg episode reward: [(0, '1.520')] [2024-03-21 08:13:44,060][04017] Updated weights for policy 0, policy_version 48376 (0.0011) [2024-03-21 08:13:45,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1585184768. Throughput: 0: 45820.1. Samples: 1586478100. Policy #0 lag: (min: 1.0, avg: 34.7, max: 78.0) [2024-03-21 08:13:45,522][03784] Avg episode reward: [(0, '1.252')] [2024-03-21 08:13:50,521][03784] Fps is (10 sec: 26214.3, 60 sec: 43144.5, 300 sec: 44875.5). Total num frames: 1585283072. Throughput: 0: 46035.5. Samples: 1586762200. Policy #0 lag: (min: 1.0, avg: 34.7, max: 78.0) [2024-03-21 08:13:50,522][03784] Avg episode reward: [(0, '1.431')] [2024-03-21 08:13:55,521][03784] Fps is (10 sec: 26214.1, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 1585446912. Throughput: 0: 45604.3. Samples: 1587016100. Policy #0 lag: (min: 1.0, avg: 34.7, max: 78.0) [2024-03-21 08:13:55,522][03784] Avg episode reward: [(0, '1.560')] [2024-03-21 08:13:57,472][04017] Updated weights for policy 0, policy_version 48386 (0.0016) [2024-03-21 08:14:00,084][03995] Signal inference workers to stop experience collection... (31950 times) [2024-03-21 08:14:00,195][04017] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-03-21 08:14:00,319][03995] Signal inference workers to resume experience collection... (31950 times) [2024-03-21 08:14:00,320][04017] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-03-21 08:14:00,521][03784] Fps is (10 sec: 49152.4, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 1585774592. Throughput: 0: 45182.2. Samples: 1587137400. Policy #0 lag: (min: 1.0, avg: 34.7, max: 78.0) [2024-03-21 08:14:00,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 08:14:00,943][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048396_1585840128.pth... 
[2024-03-21 08:14:00,990][04017] Updated weights for policy 0, policy_version 48396 (0.0025) [2024-03-21 08:14:01,066][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048070_1575157760.pth [2024-03-21 08:14:05,521][03784] Fps is (10 sec: 58982.4, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1586036736. Throughput: 0: 45282.2. Samples: 1587377900. Policy #0 lag: (min: 1.0, avg: 34.7, max: 78.0) [2024-03-21 08:14:05,522][03784] Avg episode reward: [(0, '1.084')] [2024-03-21 08:14:06,814][04017] Updated weights for policy 0, policy_version 48406 (0.0011) [2024-03-21 08:14:10,521][03784] Fps is (10 sec: 68812.5, 60 sec: 48605.8, 300 sec: 45542.0). Total num frames: 1586462720. Throughput: 0: 44622.2. Samples: 1587640300. Policy #0 lag: (min: 1.0, avg: 44.3, max: 93.0) [2024-03-21 08:14:10,522][03784] Avg episode reward: [(0, '0.698')] [2024-03-21 08:14:10,982][04017] Updated weights for policy 0, policy_version 48416 (0.0016) [2024-03-21 08:14:15,521][03784] Fps is (10 sec: 52429.1, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1586561024. Throughput: 0: 44928.9. Samples: 1587789600. Policy #0 lag: (min: 1.0, avg: 44.3, max: 93.0) [2024-03-21 08:14:15,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 08:14:20,521][03784] Fps is (10 sec: 29491.0, 60 sec: 41506.1, 300 sec: 45208.7). Total num frames: 1586757632. Throughput: 0: 45082.1. Samples: 1588085700. Policy #0 lag: (min: 1.0, avg: 44.3, max: 93.0) [2024-03-21 08:14:20,522][03784] Avg episode reward: [(0, '0.766')] [2024-03-21 08:14:21,130][04017] Updated weights for policy 0, policy_version 48426 (0.0010) [2024-03-21 08:14:25,521][03784] Fps is (10 sec: 55705.3, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1587118080. Throughput: 0: 45033.3. Samples: 1588356600. 
Policy #0 lag: (min: 1.0, avg: 44.3, max: 93.0) [2024-03-21 08:14:25,522][03784] Avg episode reward: [(0, '0.766')] [2024-03-21 08:14:25,752][04017] Updated weights for policy 0, policy_version 48436 (0.0012) [2024-03-21 08:14:30,521][03784] Fps is (10 sec: 68812.2, 60 sec: 48059.6, 300 sec: 46430.6). Total num frames: 1587445760. Throughput: 0: 44393.1. Samples: 1588475800. Policy #0 lag: (min: 1.0, avg: 44.3, max: 93.0) [2024-03-21 08:14:30,522][03784] Avg episode reward: [(0, '1.215')] [2024-03-21 08:14:30,814][04017] Updated weights for policy 0, policy_version 48446 (0.0011) [2024-03-21 08:14:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 47513.5, 300 sec: 45875.2). Total num frames: 1587642368. Throughput: 0: 43928.9. Samples: 1588739000. Policy #0 lag: (min: 1.0, avg: 44.3, max: 93.0) [2024-03-21 08:14:35,522][03784] Avg episode reward: [(0, '0.650')] [2024-03-21 08:14:40,521][03784] Fps is (10 sec: 26214.9, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 1587707904. Throughput: 0: 45095.6. Samples: 1589045400. Policy #0 lag: (min: 1.0, avg: 44.3, max: 93.0) [2024-03-21 08:14:40,522][03784] Avg episode reward: [(0, '1.333')] [2024-03-21 08:14:41,994][04017] Updated weights for policy 0, policy_version 48456 (0.0010) [2024-03-21 08:14:45,521][03784] Fps is (10 sec: 19660.9, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 1587838976. Throughput: 0: 45673.3. Samples: 1589192700. Policy #0 lag: (min: 0.0, avg: 30.5, max: 86.0) [2024-03-21 08:14:45,522][03784] Avg episode reward: [(0, '0.883')] [2024-03-21 08:14:47,667][03995] Signal inference workers to stop experience collection... (32000 times) [2024-03-21 08:14:47,773][04017] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-03-21 08:14:47,856][03995] Signal inference workers to resume experience collection... 
(32000 times) [2024-03-21 08:14:47,856][04017] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-03-21 08:14:49,441][04017] Updated weights for policy 0, policy_version 48466 (0.0018) [2024-03-21 08:14:50,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.8, 300 sec: 44875.5). Total num frames: 1588166656. Throughput: 0: 46793.4. Samples: 1589483600. Policy #0 lag: (min: 0.0, avg: 30.5, max: 86.0) [2024-03-21 08:14:50,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 08:14:55,521][03784] Fps is (10 sec: 49151.9, 60 sec: 48059.7, 300 sec: 44653.3). Total num frames: 1588330496. Throughput: 0: 47240.0. Samples: 1589766100. Policy #0 lag: (min: 0.0, avg: 30.5, max: 86.0) [2024-03-21 08:14:55,522][03784] Avg episode reward: [(0, '1.171')] [2024-03-21 08:14:56,923][04017] Updated weights for policy 0, policy_version 48476 (0.0011) [2024-03-21 08:15:00,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48059.7, 300 sec: 45208.7). Total num frames: 1588658176. Throughput: 0: 46808.9. Samples: 1589896000. Policy #0 lag: (min: 0.0, avg: 30.5, max: 86.0) [2024-03-21 08:15:00,522][03784] Avg episode reward: [(0, '0.747')] [2024-03-21 08:15:03,499][04017] Updated weights for policy 0, policy_version 48486 (0.0010) [2024-03-21 08:15:05,521][03784] Fps is (10 sec: 55705.7, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 1588887552. Throughput: 0: 45864.5. Samples: 1590149600. Policy #0 lag: (min: 0.0, avg: 30.5, max: 86.0) [2024-03-21 08:15:05,522][03784] Avg episode reward: [(0, '1.755')] [2024-03-21 08:15:09,314][04017] Updated weights for policy 0, policy_version 48496 (0.0021) [2024-03-21 08:15:10,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1589116928. Throughput: 0: 46331.2. Samples: 1590441500. 
Policy #0 lag: (min: 0.0, avg: 30.5, max: 86.0) [2024-03-21 08:15:10,522][03784] Avg episode reward: [(0, '1.755')] [2024-03-21 08:15:15,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1589379072. Throughput: 0: 46529.1. Samples: 1590569600. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 08:15:15,522][03784] Avg episode reward: [(0, '1.301')] [2024-03-21 08:15:16,987][04017] Updated weights for policy 0, policy_version 48506 (0.0014) [2024-03-21 08:15:20,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48059.8, 300 sec: 45542.0). Total num frames: 1589641216. Throughput: 0: 46868.9. Samples: 1590848100. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 08:15:20,522][03784] Avg episode reward: [(0, '1.789')] [2024-03-21 08:15:23,928][04017] Updated weights for policy 0, policy_version 48516 (0.0012) [2024-03-21 08:15:25,521][03784] Fps is (10 sec: 45874.9, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1589837824. Throughput: 0: 45748.8. Samples: 1591104100. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 08:15:25,522][03784] Avg episode reward: [(0, '0.866')] [2024-03-21 08:15:30,521][03784] Fps is (10 sec: 36045.1, 60 sec: 42598.5, 300 sec: 45653.1). Total num frames: 1590001664. Throughput: 0: 45591.2. Samples: 1591244300. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 08:15:30,522][03784] Avg episode reward: [(0, '0.734')] [2024-03-21 08:15:35,299][04017] Updated weights for policy 0, policy_version 48526 (0.0019) [2024-03-21 08:15:35,521][03784] Fps is (10 sec: 26214.4, 60 sec: 40960.0, 300 sec: 45097.7). Total num frames: 1590099968. Throughput: 0: 45586.7. Samples: 1591535000. Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 08:15:35,522][03784] Avg episode reward: [(0, '0.759')] [2024-03-21 08:15:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1590362112. Throughput: 0: 45353.5. Samples: 1591807000. 
Policy #0 lag: (min: 0.0, avg: 40.9, max: 83.0) [2024-03-21 08:15:40,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 08:15:41,302][04017] Updated weights for policy 0, policy_version 48536 (0.0017) [2024-03-21 08:15:42,359][03995] Signal inference workers to stop experience collection... (32050 times) [2024-03-21 08:15:42,420][04017] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-03-21 08:15:42,626][03995] Signal inference workers to resume experience collection... (32050 times) [2024-03-21 08:15:42,626][04017] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-03-21 08:15:45,521][03784] Fps is (10 sec: 62258.8, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1590722560. Throughput: 0: 44979.9. Samples: 1591920100. Policy #0 lag: (min: 2.0, avg: 55.5, max: 108.0) [2024-03-21 08:15:45,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 08:15:45,852][04017] Updated weights for policy 0, policy_version 48546 (0.0011) [2024-03-21 08:15:50,521][03784] Fps is (10 sec: 65535.4, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 1591017472. Throughput: 0: 45384.5. Samples: 1592191900. Policy #0 lag: (min: 2.0, avg: 55.5, max: 108.0) [2024-03-21 08:15:50,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 08:15:51,671][04017] Updated weights for policy 0, policy_version 48556 (0.0011) [2024-03-21 08:15:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48059.7, 300 sec: 45208.7). Total num frames: 1591214080. Throughput: 0: 45279.9. Samples: 1592479100. Policy #0 lag: (min: 2.0, avg: 55.5, max: 108.0) [2024-03-21 08:15:55,522][03784] Avg episode reward: [(0, '1.538')] [2024-03-21 08:16:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1591377920. Throughput: 0: 45493.2. Samples: 1592616800. 
Policy #0 lag: (min: 2.0, avg: 55.5, max: 108.0) [2024-03-21 08:16:00,523][03784] Avg episode reward: [(0, '1.538')] [2024-03-21 08:16:00,698][04017] Updated weights for policy 0, policy_version 48566 (0.0012) [2024-03-21 08:16:01,024][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048567_1591443456.pth... [2024-03-21 08:16:01,157][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048229_1580367872.pth [2024-03-21 08:16:05,521][03784] Fps is (10 sec: 32768.5, 60 sec: 44236.9, 300 sec: 45542.0). Total num frames: 1591541760. Throughput: 0: 45149.0. Samples: 1592879800. Policy #0 lag: (min: 2.0, avg: 55.5, max: 108.0) [2024-03-21 08:16:05,522][03784] Avg episode reward: [(0, '0.603')] [2024-03-21 08:16:09,580][04017] Updated weights for policy 0, policy_version 48576 (0.0017) [2024-03-21 08:16:10,521][03784] Fps is (10 sec: 42598.8, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1591803904. Throughput: 0: 45057.8. Samples: 1593131700. Policy #0 lag: (min: 2.0, avg: 55.5, max: 108.0) [2024-03-21 08:16:10,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 08:16:15,521][03784] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 45875.2). Total num frames: 1591967744. Throughput: 0: 45195.5. Samples: 1593278100. Policy #0 lag: (min: 0.0, avg: 35.1, max: 67.0) [2024-03-21 08:16:15,522][03784] Avg episode reward: [(0, '0.512')] [2024-03-21 08:16:16,651][04017] Updated weights for policy 0, policy_version 48586 (0.0015) [2024-03-21 08:16:20,521][03784] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 45541.9). Total num frames: 1592197120. Throughput: 0: 44444.4. Samples: 1593535000. Policy #0 lag: (min: 0.0, avg: 35.1, max: 67.0) [2024-03-21 08:16:20,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 08:16:24,567][04017] Updated weights for policy 0, policy_version 48596 (0.0020) [2024-03-21 08:16:25,521][03784] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 45208.7). 
Total num frames: 1592393728. Throughput: 0: 44633.3. Samples: 1593815500. Policy #0 lag: (min: 0.0, avg: 35.1, max: 67.0) [2024-03-21 08:16:25,522][03784] Avg episode reward: [(0, '1.443')] [2024-03-21 08:16:29,766][04017] Updated weights for policy 0, policy_version 48606 (0.0010) [2024-03-21 08:16:30,521][03784] Fps is (10 sec: 55705.8, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1592754176. Throughput: 0: 45293.4. Samples: 1593958300. Policy #0 lag: (min: 0.0, avg: 35.1, max: 67.0) [2024-03-21 08:16:30,523][03784] Avg episode reward: [(0, '0.861')] [2024-03-21 08:16:35,521][03784] Fps is (10 sec: 62258.8, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 1593016320. Throughput: 0: 45284.4. Samples: 1594229700. Policy #0 lag: (min: 0.0, avg: 35.1, max: 67.0) [2024-03-21 08:16:35,522][03784] Avg episode reward: [(0, '1.034')] [2024-03-21 08:16:36,855][04017] Updated weights for policy 0, policy_version 48616 (0.0015) [2024-03-21 08:16:40,521][03784] Fps is (10 sec: 42598.6, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 1593180160. Throughput: 0: 45466.8. Samples: 1594525100. Policy #0 lag: (min: 0.0, avg: 35.1, max: 67.0) [2024-03-21 08:16:40,522][03784] Avg episode reward: [(0, '1.034')] [2024-03-21 08:16:41,683][03995] Signal inference workers to stop experience collection... (32100 times) [2024-03-21 08:16:41,805][03995] Signal inference workers to resume experience collection... (32100 times) [2024-03-21 08:16:41,848][04017] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-03-21 08:16:41,902][04017] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-03-21 08:16:43,132][04017] Updated weights for policy 0, policy_version 48626 (0.0013) [2024-03-21 08:16:45,521][03784] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1593442304. Throughput: 0: 45191.2. Samples: 1594650400. 
Policy #0 lag: (min: 0.0, avg: 35.1, max: 67.0) [2024-03-21 08:16:45,522][03784] Avg episode reward: [(0, '1.604')] [2024-03-21 08:16:50,523][03784] Fps is (10 sec: 39313.1, 60 sec: 42596.9, 300 sec: 45430.6). Total num frames: 1593573376. Throughput: 0: 45386.7. Samples: 1594922300. Policy #0 lag: (min: 0.0, avg: 37.0, max: 74.0) [2024-03-21 08:16:50,524][03784] Avg episode reward: [(0, '1.604')] [2024-03-21 08:16:55,272][04017] Updated weights for policy 0, policy_version 48636 (0.0016) [2024-03-21 08:16:55,521][03784] Fps is (10 sec: 26214.1, 60 sec: 41506.1, 300 sec: 45319.8). Total num frames: 1593704448. Throughput: 0: 45622.1. Samples: 1595184700. Policy #0 lag: (min: 0.0, avg: 37.0, max: 74.0) [2024-03-21 08:16:55,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 08:17:00,521][03784] Fps is (10 sec: 39329.6, 60 sec: 43144.6, 300 sec: 45430.9). Total num frames: 1593966592. Throughput: 0: 44826.6. Samples: 1595295300. Policy #0 lag: (min: 0.0, avg: 37.0, max: 74.0) [2024-03-21 08:17:00,522][03784] Avg episode reward: [(0, '1.284')] [2024-03-21 08:17:01,212][04017] Updated weights for policy 0, policy_version 48646 (0.0012) [2024-03-21 08:17:05,521][03784] Fps is (10 sec: 45874.6, 60 sec: 43690.4, 300 sec: 45097.6). Total num frames: 1594163200. Throughput: 0: 44586.5. Samples: 1595541400. Policy #0 lag: (min: 0.0, avg: 37.0, max: 74.0) [2024-03-21 08:17:05,522][03784] Avg episode reward: [(0, '1.151')] [2024-03-21 08:17:08,503][04017] Updated weights for policy 0, policy_version 48656 (0.0011) [2024-03-21 08:17:10,521][03784] Fps is (10 sec: 55705.2, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 1594523648. Throughput: 0: 44668.8. Samples: 1595825600. 
Policy #0 lag: (min: 0.0, avg: 37.0, max: 74.0) [2024-03-21 08:17:10,523][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 08:17:12,713][04017] Updated weights for policy 0, policy_version 48666 (0.0016) [2024-03-21 08:17:15,521][03784] Fps is (10 sec: 62260.6, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 1594785792. Throughput: 0: 44542.2. Samples: 1595962700. Policy #0 lag: (min: 0.0, avg: 37.0, max: 74.0) [2024-03-21 08:17:15,522][03784] Avg episode reward: [(0, '1.456')] [2024-03-21 08:17:20,521][03784] Fps is (10 sec: 39322.5, 60 sec: 45329.2, 300 sec: 45542.0). Total num frames: 1594916864. Throughput: 0: 44769.1. Samples: 1596244300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 82.0) [2024-03-21 08:17:20,522][03784] Avg episode reward: [(0, '1.456')] [2024-03-21 08:17:22,604][04017] Updated weights for policy 0, policy_version 48676 (0.0016) [2024-03-21 08:17:25,521][03784] Fps is (10 sec: 29491.2, 60 sec: 44782.9, 300 sec: 44876.2). Total num frames: 1595080704. Throughput: 0: 43842.2. Samples: 1596498000. Policy #0 lag: (min: 0.0, avg: 40.4, max: 82.0) [2024-03-21 08:17:25,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 08:17:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 1595277312. Throughput: 0: 43760.0. Samples: 1596619600. Policy #0 lag: (min: 0.0, avg: 40.4, max: 82.0) [2024-03-21 08:17:30,522][03784] Avg episode reward: [(0, '0.638')] [2024-03-21 08:17:31,551][04017] Updated weights for policy 0, policy_version 48686 (0.0012) [2024-03-21 08:17:35,521][03784] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 44764.4). Total num frames: 1595506688. Throughput: 0: 43462.1. Samples: 1596878000. 
Policy #0 lag: (min: 0.0, avg: 40.4, max: 82.0) [2024-03-21 08:17:35,522][03784] Avg episode reward: [(0, '1.442')] [2024-03-21 08:17:39,320][04017] Updated weights for policy 0, policy_version 48696 (0.0012) [2024-03-21 08:17:40,521][03784] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 44875.5). Total num frames: 1595703296. Throughput: 0: 43657.8. Samples: 1597149300. Policy #0 lag: (min: 0.0, avg: 40.4, max: 82.0) [2024-03-21 08:17:40,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-21 08:17:45,431][04017] Updated weights for policy 0, policy_version 48706 (0.0015) [2024-03-21 08:17:45,521][03784] Fps is (10 sec: 49151.4, 60 sec: 42598.3, 300 sec: 45097.6). Total num frames: 1595998208. Throughput: 0: 44148.9. Samples: 1597282000. Policy #0 lag: (min: 0.0, avg: 40.4, max: 82.0) [2024-03-21 08:17:45,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 08:17:49,455][03995] Signal inference workers to stop experience collection... (32150 times) [2024-03-21 08:17:49,455][03995] Signal inference workers to resume experience collection... (32150 times) [2024-03-21 08:17:49,539][04017] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-03-21 08:17:49,539][04017] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-03-21 08:17:50,521][03784] Fps is (10 sec: 58982.5, 60 sec: 45330.6, 300 sec: 45430.9). Total num frames: 1596293120. Throughput: 0: 44351.3. Samples: 1597537200. Policy #0 lag: (min: 1.0, avg: 65.5, max: 118.0) [2024-03-21 08:17:50,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 08:17:50,890][04017] Updated weights for policy 0, policy_version 48716 (0.0021) [2024-03-21 08:17:55,361][04017] Updated weights for policy 0, policy_version 48726 (0.0011) [2024-03-21 08:17:55,521][03784] Fps is (10 sec: 65537.0, 60 sec: 49152.1, 300 sec: 45986.3). Total num frames: 1596653568. Throughput: 0: 43862.4. Samples: 1597799400. 
Policy #0 lag: (min: 1.0, avg: 65.5, max: 118.0) [2024-03-21 08:17:55,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 08:18:00,521][03784] Fps is (10 sec: 45875.5, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1596751872. Throughput: 0: 44097.8. Samples: 1597947100. Policy #0 lag: (min: 1.0, avg: 65.5, max: 118.0) [2024-03-21 08:18:00,522][03784] Avg episode reward: [(0, '1.043')] [2024-03-21 08:18:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048729_1596751872.pth... [2024-03-21 08:18:00,677][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048396_1585840128.pth [2024-03-21 08:18:05,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46421.6, 300 sec: 45430.9). Total num frames: 1596948480. Throughput: 0: 44164.4. Samples: 1598231700. Policy #0 lag: (min: 1.0, avg: 65.5, max: 118.0) [2024-03-21 08:18:05,522][03784] Avg episode reward: [(0, '1.126')] [2024-03-21 08:18:05,685][04017] Updated weights for policy 0, policy_version 48736 (0.0015) [2024-03-21 08:18:10,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44783.0, 300 sec: 45097.6). Total num frames: 1597210624. Throughput: 0: 44831.1. Samples: 1598515400. Policy #0 lag: (min: 1.0, avg: 65.5, max: 118.0) [2024-03-21 08:18:10,531][03784] Avg episode reward: [(0, '1.126')] [2024-03-21 08:18:15,129][04017] Updated weights for policy 0, policy_version 48746 (0.0012) [2024-03-21 08:18:15,521][03784] Fps is (10 sec: 39320.5, 60 sec: 42598.3, 300 sec: 44320.1). Total num frames: 1597341696. Throughput: 0: 45677.5. Samples: 1598675100. Policy #0 lag: (min: 1.0, avg: 65.5, max: 118.0) [2024-03-21 08:18:15,523][03784] Avg episode reward: [(0, '0.929')] [2024-03-21 08:18:19,138][04017] Updated weights for policy 0, policy_version 48756 (0.0013) [2024-03-21 08:18:20,521][03784] Fps is (10 sec: 49151.8, 60 sec: 46421.2, 300 sec: 44986.6). Total num frames: 1597702144. Throughput: 0: 45524.3. Samples: 1598926600. 
Policy #0 lag: (min: 1.0, avg: 65.5, max: 118.0) [2024-03-21 08:18:20,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 08:18:24,194][04017] Updated weights for policy 0, policy_version 48766 (0.0031) [2024-03-21 08:18:25,521][03784] Fps is (10 sec: 68814.6, 60 sec: 49152.1, 300 sec: 45653.0). Total num frames: 1598029824. Throughput: 0: 44966.8. Samples: 1599172800. Policy #0 lag: (min: 1.0, avg: 52.1, max: 101.0) [2024-03-21 08:18:25,522][03784] Avg episode reward: [(0, '1.035')] [2024-03-21 08:18:30,521][03784] Fps is (10 sec: 42598.7, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1598128128. Throughput: 0: 45460.1. Samples: 1599327700. Policy #0 lag: (min: 1.0, avg: 52.1, max: 101.0) [2024-03-21 08:18:30,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 08:18:32,890][04017] Updated weights for policy 0, policy_version 48776 (0.0017) [2024-03-21 08:18:35,521][03784] Fps is (10 sec: 32767.8, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1598357504. Throughput: 0: 46511.2. Samples: 1599630200. Policy #0 lag: (min: 1.0, avg: 52.1, max: 101.0) [2024-03-21 08:18:35,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 08:18:40,239][03995] Signal inference workers to stop experience collection... (32200 times) [2024-03-21 08:18:40,240][03995] Signal inference workers to resume experience collection... (32200 times) [2024-03-21 08:18:40,304][04017] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-03-21 08:18:40,311][04017] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-03-21 08:18:40,521][03784] Fps is (10 sec: 39321.6, 60 sec: 46967.6, 300 sec: 45208.7). Total num frames: 1598521344. Throughput: 0: 47184.4. Samples: 1599922700. 
Policy #0 lag: (min: 1.0, avg: 52.1, max: 101.0) [2024-03-21 08:18:40,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 08:18:41,415][04017] Updated weights for policy 0, policy_version 48786 (0.0016) [2024-03-21 08:18:45,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.6, 300 sec: 45875.2). Total num frames: 1598816256. Throughput: 0: 47111.1. Samples: 1600067100. Policy #0 lag: (min: 1.0, avg: 52.1, max: 101.0) [2024-03-21 08:18:45,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 08:18:47,992][04017] Updated weights for policy 0, policy_version 48796 (0.0016) [2024-03-21 08:18:50,521][03784] Fps is (10 sec: 55704.7, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 1599078400. Throughput: 0: 46928.6. Samples: 1600343500. Policy #0 lag: (min: 1.0, avg: 52.1, max: 101.0) [2024-03-21 08:18:50,522][03784] Avg episode reward: [(0, '0.479')] [2024-03-21 08:18:55,521][03784] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 45542.0). Total num frames: 1599209472. Throughput: 0: 47097.8. Samples: 1600634800. Policy #0 lag: (min: 0.0, avg: 47.5, max: 99.0) [2024-03-21 08:18:55,522][03784] Avg episode reward: [(0, '1.188')] [2024-03-21 08:18:56,406][04017] Updated weights for policy 0, policy_version 48806 (0.0011) [2024-03-21 08:19:00,521][03784] Fps is (10 sec: 49152.6, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 1599569920. Throughput: 0: 46575.7. Samples: 1600771000. Policy #0 lag: (min: 0.0, avg: 47.5, max: 99.0) [2024-03-21 08:19:00,522][03784] Avg episode reward: [(0, '1.188')] [2024-03-21 08:19:01,509][04017] Updated weights for policy 0, policy_version 48816 (0.0012) [2024-03-21 08:19:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 1599733760. Throughput: 0: 46777.9. Samples: 1601031600. 
Policy #0 lag: (min: 0.0, avg: 47.5, max: 99.0) [2024-03-21 08:19:05,522][03784] Avg episode reward: [(0, '0.609')] [2024-03-21 08:19:08,437][04017] Updated weights for policy 0, policy_version 48826 (0.0015) [2024-03-21 08:19:10,521][03784] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1599995904. Throughput: 0: 47422.1. Samples: 1601306800. Policy #0 lag: (min: 0.0, avg: 47.5, max: 99.0) [2024-03-21 08:19:10,522][03784] Avg episode reward: [(0, '0.794')] [2024-03-21 08:19:15,521][03784] Fps is (10 sec: 49152.0, 60 sec: 48059.9, 300 sec: 45653.1). Total num frames: 1600225280. Throughput: 0: 47235.6. Samples: 1601453300. Policy #0 lag: (min: 0.0, avg: 47.5, max: 99.0) [2024-03-21 08:19:15,522][03784] Avg episode reward: [(0, '1.191')] [2024-03-21 08:19:16,072][04017] Updated weights for policy 0, policy_version 48836 (0.0012) [2024-03-21 08:19:20,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1600487424. Throughput: 0: 46488.8. Samples: 1601722200. Policy #0 lag: (min: 0.0, avg: 47.5, max: 99.0) [2024-03-21 08:19:20,522][03784] Avg episode reward: [(0, '1.202')] [2024-03-21 08:19:22,788][04017] Updated weights for policy 0, policy_version 48846 (0.0014) [2024-03-21 08:19:25,521][03784] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1600651264. Throughput: 0: 46691.1. Samples: 1602023800. Policy #0 lag: (min: 0.0, avg: 51.3, max: 110.0) [2024-03-21 08:19:25,522][03784] Avg episode reward: [(0, '1.232')] [2024-03-21 08:19:28,176][04017] Updated weights for policy 0, policy_version 48856 (0.0012) [2024-03-21 08:19:29,400][03995] Signal inference workers to stop experience collection... (32250 times) [2024-03-21 08:19:29,400][03995] Signal inference workers to resume experience collection... 
(32250 times) [2024-03-21 08:19:29,447][04017] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-03-21 08:19:29,447][04017] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-03-21 08:19:30,521][03784] Fps is (10 sec: 55704.9, 60 sec: 48605.7, 300 sec: 45430.9). Total num frames: 1601044480. Throughput: 0: 46322.0. Samples: 1602151600. Policy #0 lag: (min: 0.0, avg: 51.3, max: 110.0) [2024-03-21 08:19:30,522][03784] Avg episode reward: [(0, '0.884')] [2024-03-21 08:19:32,943][04017] Updated weights for policy 0, policy_version 48866 (0.0020) [2024-03-21 08:19:35,521][03784] Fps is (10 sec: 72089.7, 60 sec: 50244.3, 300 sec: 46319.5). Total num frames: 1601372160. Throughput: 0: 45755.7. Samples: 1602402500. Policy #0 lag: (min: 0.0, avg: 51.3, max: 110.0) [2024-03-21 08:19:35,522][03784] Avg episode reward: [(0, '1.510')] [2024-03-21 08:19:40,521][03784] Fps is (10 sec: 39322.3, 60 sec: 48605.8, 300 sec: 46097.4). Total num frames: 1601437696. Throughput: 0: 46324.4. Samples: 1602719400. Policy #0 lag: (min: 0.0, avg: 51.3, max: 110.0) [2024-03-21 08:19:40,522][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 08:19:45,374][04017] Updated weights for policy 0, policy_version 48876 (0.0012) [2024-03-21 08:19:45,521][03784] Fps is (10 sec: 19660.6, 60 sec: 45875.1, 300 sec: 45430.9). Total num frames: 1601568768. Throughput: 0: 46884.4. Samples: 1602880800. Policy #0 lag: (min: 0.0, avg: 51.3, max: 110.0) [2024-03-21 08:19:45,523][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 08:19:50,521][03784] Fps is (10 sec: 29491.3, 60 sec: 44236.9, 300 sec: 45430.9). Total num frames: 1601732608. Throughput: 0: 47477.8. Samples: 1603168100. 
Policy #0 lag: (min: 0.0, avg: 51.3, max: 110.0) [2024-03-21 08:19:50,522][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 08:19:53,814][04017] Updated weights for policy 0, policy_version 48886 (0.0018) [2024-03-21 08:19:55,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1602027520. Throughput: 0: 46957.9. Samples: 1603419900. Policy #0 lag: (min: 0.0, avg: 26.1, max: 65.0) [2024-03-21 08:19:55,522][03784] Avg episode reward: [(0, '0.583')] [2024-03-21 08:20:00,225][04017] Updated weights for policy 0, policy_version 48896 (0.0017) [2024-03-21 08:20:00,521][03784] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1602224128. Throughput: 0: 47146.6. Samples: 1603574900. Policy #0 lag: (min: 0.0, avg: 26.1, max: 65.0) [2024-03-21 08:20:00,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 08:20:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048896_1602224128.pth... [2024-03-21 08:20:00,659][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048567_1591443456.pth [2024-03-21 08:20:05,521][03784] Fps is (10 sec: 49151.4, 60 sec: 46421.2, 300 sec: 45430.9). Total num frames: 1602519040. Throughput: 0: 47833.3. Samples: 1603874700. Policy #0 lag: (min: 0.0, avg: 26.1, max: 65.0) [2024-03-21 08:20:05,523][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 08:20:05,805][04017] Updated weights for policy 0, policy_version 48906 (0.0012) [2024-03-21 08:20:10,521][03784] Fps is (10 sec: 58982.0, 60 sec: 46967.4, 300 sec: 45541.9). Total num frames: 1602813952. Throughput: 0: 46433.2. Samples: 1604113300. Policy #0 lag: (min: 0.0, avg: 26.1, max: 65.0) [2024-03-21 08:20:10,522][03784] Avg episode reward: [(0, '1.530')] [2024-03-21 08:20:12,197][04017] Updated weights for policy 0, policy_version 48916 (0.0013) [2024-03-21 08:20:15,521][03784] Fps is (10 sec: 58983.2, 60 sec: 48059.7, 300 sec: 45653.1). 
Total num frames: 1603108864. Throughput: 0: 46815.8. Samples: 1604258300. Policy #0 lag: (min: 0.0, avg: 26.1, max: 65.0) [2024-03-21 08:20:15,522][03784] Avg episode reward: [(0, '1.031')] [2024-03-21 08:20:17,080][04017] Updated weights for policy 0, policy_version 48926 (0.0015) [2024-03-21 08:20:20,521][03784] Fps is (10 sec: 58983.4, 60 sec: 48606.0, 300 sec: 45986.3). Total num frames: 1603403776. Throughput: 0: 47017.8. Samples: 1604518300. Policy #0 lag: (min: 0.0, avg: 26.1, max: 65.0) [2024-03-21 08:20:20,522][03784] Avg episode reward: [(0, '1.705')] [2024-03-21 08:20:21,640][03995] Signal inference workers to stop experience collection... (32300 times) [2024-03-21 08:20:21,717][04017] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-03-21 08:20:21,948][03995] Signal inference workers to resume experience collection... (32300 times) [2024-03-21 08:20:21,949][04017] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-03-21 08:20:25,289][04017] Updated weights for policy 0, policy_version 48936 (0.0012) [2024-03-21 08:20:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 1603534848. Throughput: 0: 45500.1. Samples: 1604766900. Policy #0 lag: (min: 0.0, avg: 40.8, max: 101.0) [2024-03-21 08:20:25,522][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 08:20:30,521][03784] Fps is (10 sec: 26213.8, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 1603665920. Throughput: 0: 44879.9. Samples: 1604900400. Policy #0 lag: (min: 0.0, avg: 40.8, max: 101.0) [2024-03-21 08:20:30,522][03784] Avg episode reward: [(0, '1.733')] [2024-03-21 08:20:35,521][03784] Fps is (10 sec: 29490.9, 60 sec: 40960.0, 300 sec: 45653.0). Total num frames: 1603829760. Throughput: 0: 44846.6. Samples: 1605186200. 
Policy #0 lag: (min: 0.0, avg: 40.8, max: 101.0) [2024-03-21 08:20:35,522][03784] Avg episode reward: [(0, '0.729')] [2024-03-21 08:20:37,211][04017] Updated weights for policy 0, policy_version 48946 (0.0012) [2024-03-21 08:20:40,521][03784] Fps is (10 sec: 32768.5, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1603993600. Throughput: 0: 45351.1. Samples: 1605460700. Policy #0 lag: (min: 0.0, avg: 40.8, max: 101.0) [2024-03-21 08:20:40,522][03784] Avg episode reward: [(0, '1.386')] [2024-03-21 08:20:42,651][04017] Updated weights for policy 0, policy_version 48956 (0.0010) [2024-03-21 08:20:45,521][03784] Fps is (10 sec: 39322.4, 60 sec: 44237.0, 300 sec: 44764.5). Total num frames: 1604222976. Throughput: 0: 44784.6. Samples: 1605590200. Policy #0 lag: (min: 0.0, avg: 40.8, max: 101.0) [2024-03-21 08:20:45,521][03784] Avg episode reward: [(0, '1.198')] [2024-03-21 08:20:50,056][04017] Updated weights for policy 0, policy_version 48966 (0.0017) [2024-03-21 08:20:50,521][03784] Fps is (10 sec: 55705.3, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 1604550656. Throughput: 0: 43906.7. Samples: 1605850500. Policy #0 lag: (min: 0.0, avg: 40.8, max: 101.0) [2024-03-21 08:20:50,522][03784] Avg episode reward: [(0, '1.353')] [2024-03-21 08:20:55,058][04017] Updated weights for policy 0, policy_version 48976 (0.0019) [2024-03-21 08:20:55,521][03784] Fps is (10 sec: 62258.4, 60 sec: 46967.5, 300 sec: 45653.1). Total num frames: 1604845568. Throughput: 0: 44409.0. Samples: 1606111700. Policy #0 lag: (min: 0.0, avg: 40.8, max: 101.0) [2024-03-21 08:20:55,522][03784] Avg episode reward: [(0, '1.509')] [2024-03-21 08:21:00,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1605009408. Throughput: 0: 44184.4. Samples: 1606246600. 
Policy #0 lag: (min: 0.0, avg: 37.8, max: 82.0) [2024-03-21 08:21:00,522][03784] Avg episode reward: [(0, '1.731')] [2024-03-21 08:21:04,368][04017] Updated weights for policy 0, policy_version 48986 (0.0011) [2024-03-21 08:21:05,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44783.1, 300 sec: 45430.9). Total num frames: 1605206016. Throughput: 0: 44664.4. Samples: 1606528200. Policy #0 lag: (min: 0.0, avg: 37.8, max: 82.0) [2024-03-21 08:21:05,522][03784] Avg episode reward: [(0, '0.513')] [2024-03-21 08:21:10,521][03784] Fps is (10 sec: 36044.8, 60 sec: 42598.5, 300 sec: 45430.9). Total num frames: 1605369856. Throughput: 0: 45026.6. Samples: 1606793100. Policy #0 lag: (min: 0.0, avg: 37.8, max: 82.0) [2024-03-21 08:21:10,522][03784] Avg episode reward: [(0, '1.156')] [2024-03-21 08:21:11,506][04017] Updated weights for policy 0, policy_version 48996 (0.0015) [2024-03-21 08:21:15,521][03784] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 1605730304. Throughput: 0: 44751.2. Samples: 1606914200. Policy #0 lag: (min: 0.0, avg: 37.8, max: 82.0) [2024-03-21 08:21:15,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 08:21:19,783][04017] Updated weights for policy 0, policy_version 49006 (0.0027) [2024-03-21 08:21:20,521][03784] Fps is (10 sec: 49152.2, 60 sec: 40960.0, 300 sec: 45653.1). Total num frames: 1605861376. Throughput: 0: 44569.0. Samples: 1607191800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 82.0) [2024-03-21 08:21:20,521][03784] Avg episode reward: [(0, '0.502')] [2024-03-21 08:21:24,204][03995] Signal inference workers to stop experience collection... (32350 times) [2024-03-21 08:21:24,277][03995] Signal inference workers to resume experience collection... 
(32350 times) [2024-03-21 08:21:24,307][04017] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-03-21 08:21:24,358][04017] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-03-21 08:21:25,521][03784] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1606123520. Throughput: 0: 44291.2. Samples: 1607453800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 82.0) [2024-03-21 08:21:25,522][03784] Avg episode reward: [(0, '0.930')] [2024-03-21 08:21:25,625][04017] Updated weights for policy 0, policy_version 49016 (0.0012) [2024-03-21 08:21:30,521][03784] Fps is (10 sec: 52428.1, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1606385664. Throughput: 0: 44217.5. Samples: 1607580000. Policy #0 lag: (min: 0.0, avg: 39.4, max: 104.0) [2024-03-21 08:21:30,522][03784] Avg episode reward: [(0, '1.500')] [2024-03-21 08:21:31,959][04017] Updated weights for policy 0, policy_version 49026 (0.0011) [2024-03-21 08:21:35,521][03784] Fps is (10 sec: 65535.6, 60 sec: 49152.0, 300 sec: 46097.3). Total num frames: 1606778880. Throughput: 0: 44406.7. Samples: 1607848800. Policy #0 lag: (min: 0.0, avg: 39.4, max: 104.0) [2024-03-21 08:21:35,522][03784] Avg episode reward: [(0, '1.081')] [2024-03-21 08:21:39,067][04017] Updated weights for policy 0, policy_version 49036 (0.0017) [2024-03-21 08:21:40,521][03784] Fps is (10 sec: 55706.1, 60 sec: 49152.0, 300 sec: 45764.1). Total num frames: 1606942720. Throughput: 0: 44828.9. Samples: 1608129000. Policy #0 lag: (min: 0.0, avg: 39.4, max: 104.0) [2024-03-21 08:21:40,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 08:21:45,521][03784] Fps is (10 sec: 26214.5, 60 sec: 46967.4, 300 sec: 45653.4). Total num frames: 1607041024. Throughput: 0: 45064.5. Samples: 1608274500. 
Policy #0 lag: (min: 0.0, avg: 39.4, max: 104.0) [2024-03-21 08:21:45,522][03784] Avg episode reward: [(0, '1.236')] [2024-03-21 08:21:47,988][04017] Updated weights for policy 0, policy_version 49046 (0.0012) [2024-03-21 08:21:50,521][03784] Fps is (10 sec: 32768.0, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1607270400. Throughput: 0: 45035.5. Samples: 1608554800. Policy #0 lag: (min: 0.0, avg: 39.4, max: 104.0) [2024-03-21 08:21:50,522][03784] Avg episode reward: [(0, '1.471')] [2024-03-21 08:21:55,298][04017] Updated weights for policy 0, policy_version 49056 (0.0011) [2024-03-21 08:21:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 45764.1). Total num frames: 1607467008. Throughput: 0: 45002.2. Samples: 1608818200. Policy #0 lag: (min: 0.0, avg: 39.4, max: 104.0) [2024-03-21 08:21:55,522][03784] Avg episode reward: [(0, '1.069')] [2024-03-21 08:22:00,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 45764.2). Total num frames: 1607663616. Throughput: 0: 45448.9. Samples: 1608959400. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 08:22:00,522][03784] Avg episode reward: [(0, '1.094')] [2024-03-21 08:22:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049062_1607663616.pth... [2024-03-21 08:22:00,700][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048729_1596751872.pth [2024-03-21 08:22:04,121][04017] Updated weights for policy 0, policy_version 49066 (0.0015) [2024-03-21 08:22:05,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43690.5, 300 sec: 45097.7). Total num frames: 1607827456. Throughput: 0: 45593.2. Samples: 1609243500. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 08:22:05,522][03784] Avg episode reward: [(0, '1.040')] [2024-03-21 08:22:10,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1608089600. Throughput: 0: 45555.5. Samples: 1609503800. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 08:22:10,522][03784] Avg episode reward: [(0, '1.331')] [2024-03-21 08:22:11,276][04017] Updated weights for policy 0, policy_version 49076 (0.0011) [2024-03-21 08:22:15,521][03784] Fps is (10 sec: 58983.3, 60 sec: 44783.0, 300 sec: 45764.1). Total num frames: 1608417280. Throughput: 0: 45475.7. Samples: 1609626400. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 08:22:15,522][03784] Avg episode reward: [(0, '1.786')] [2024-03-21 08:22:15,685][04017] Updated weights for policy 0, policy_version 49086 (0.0011) [2024-03-21 08:22:16,732][03995] Signal inference workers to stop experience collection... (32400 times) [2024-03-21 08:22:16,788][04017] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-03-21 08:22:16,809][03995] Signal inference workers to resume experience collection... (32400 times) [2024-03-21 08:22:16,838][04017] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-03-21 08:22:20,521][03784] Fps is (10 sec: 58982.5, 60 sec: 46967.5, 300 sec: 46097.4). Total num frames: 1608679424. Throughput: 0: 45051.2. Samples: 1609876100. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 08:22:20,522][03784] Avg episode reward: [(0, '0.836')] [2024-03-21 08:22:22,504][04017] Updated weights for policy 0, policy_version 49096 (0.0016) [2024-03-21 08:22:25,521][03784] Fps is (10 sec: 45874.5, 60 sec: 45875.1, 300 sec: 46097.3). Total num frames: 1608876032. Throughput: 0: 44606.6. Samples: 1610136300. Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 08:22:25,522][03784] Avg episode reward: [(0, '0.764')] [2024-03-21 08:22:30,521][03784] Fps is (10 sec: 26214.2, 60 sec: 42598.5, 300 sec: 45542.0). Total num frames: 1608941568. Throughput: 0: 44466.6. Samples: 1610275500. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 75.0) [2024-03-21 08:22:30,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 08:22:33,955][04017] Updated weights for policy 0, policy_version 49106 (0.0011) [2024-03-21 08:22:35,521][03784] Fps is (10 sec: 32768.3, 60 sec: 40413.9, 300 sec: 45764.1). Total num frames: 1609203712. Throughput: 0: 44540.0. Samples: 1610559100. Policy #0 lag: (min: 0.0, avg: 33.1, max: 73.0) [2024-03-21 08:22:35,522][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 08:22:39,913][04017] Updated weights for policy 0, policy_version 49116 (0.0013) [2024-03-21 08:22:40,521][03784] Fps is (10 sec: 52428.8, 60 sec: 42052.3, 300 sec: 45653.1). Total num frames: 1609465856. Throughput: 0: 44817.8. Samples: 1610835000. Policy #0 lag: (min: 0.0, avg: 33.1, max: 73.0) [2024-03-21 08:22:40,522][03784] Avg episode reward: [(0, '1.569')] [2024-03-21 08:22:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1609662464. Throughput: 0: 44808.8. Samples: 1610975800. Policy #0 lag: (min: 0.0, avg: 33.1, max: 73.0) [2024-03-21 08:22:45,522][03784] Avg episode reward: [(0, '1.369')] [2024-03-21 08:22:50,075][04017] Updated weights for policy 0, policy_version 49126 (0.0016) [2024-03-21 08:22:50,521][03784] Fps is (10 sec: 32768.1, 60 sec: 42052.3, 300 sec: 44542.3). Total num frames: 1609793536. Throughput: 0: 44729.0. Samples: 1611256300. Policy #0 lag: (min: 0.0, avg: 33.1, max: 73.0) [2024-03-21 08:22:50,522][03784] Avg episode reward: [(0, '1.390')] [2024-03-21 08:22:55,048][04017] Updated weights for policy 0, policy_version 49136 (0.0024) [2024-03-21 08:22:55,521][03784] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1610121216. Throughput: 0: 44677.7. Samples: 1611514300. 
Policy #0 lag: (min: 0.0, avg: 33.1, max: 73.0) [2024-03-21 08:22:55,522][03784] Avg episode reward: [(0, '1.452')] [2024-03-21 08:22:58,657][04017] Updated weights for policy 0, policy_version 49146 (0.0021) [2024-03-21 08:23:00,521][03784] Fps is (10 sec: 75366.2, 60 sec: 48059.7, 300 sec: 46097.3). Total num frames: 1610547200. Throughput: 0: 44666.6. Samples: 1611636400. Policy #0 lag: (min: 0.0, avg: 33.1, max: 73.0) [2024-03-21 08:23:00,522][03784] Avg episode reward: [(0, '1.303')] [2024-03-21 08:23:02,746][04017] Updated weights for policy 0, policy_version 49156 (0.0009) [2024-03-21 08:23:05,521][03784] Fps is (10 sec: 65536.3, 60 sec: 49152.1, 300 sec: 45986.3). Total num frames: 1610776576. Throughput: 0: 45279.9. Samples: 1611913700. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 08:23:05,522][03784] Avg episode reward: [(0, '0.579')] [2024-03-21 08:23:08,377][03995] Signal inference workers to stop experience collection... (32450 times) [2024-03-21 08:23:08,378][03995] Signal inference workers to resume experience collection... (32450 times) [2024-03-21 08:23:08,477][04017] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-03-21 08:23:08,478][04017] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-03-21 08:23:10,521][03784] Fps is (10 sec: 39322.0, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 1610940416. Throughput: 0: 45871.3. Samples: 1612200500. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 08:23:10,522][03784] Avg episode reward: [(0, '1.205')] [2024-03-21 08:23:11,841][04017] Updated weights for policy 0, policy_version 49166 (0.0012) [2024-03-21 08:23:15,521][03784] Fps is (10 sec: 52429.3, 60 sec: 48059.7, 300 sec: 46097.4). Total num frames: 1611300864. Throughput: 0: 45633.4. Samples: 1612329000. 
Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 08:23:15,522][03784] Avg episode reward: [(0, '1.462')] [2024-03-21 08:23:20,209][04017] Updated weights for policy 0, policy_version 49176 (0.0018) [2024-03-21 08:23:20,521][03784] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1611399168. Throughput: 0: 46100.0. Samples: 1612633600. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 08:23:20,522][03784] Avg episode reward: [(0, '1.098')] [2024-03-21 08:23:25,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45329.2, 300 sec: 45653.0). Total num frames: 1611595776. Throughput: 0: 46433.4. Samples: 1612924500. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 08:23:25,522][03784] Avg episode reward: [(0, '1.098')] [2024-03-21 08:23:29,065][04017] Updated weights for policy 0, policy_version 49186 (0.0011) [2024-03-21 08:23:30,521][03784] Fps is (10 sec: 39321.2, 60 sec: 47513.5, 300 sec: 45541.9). Total num frames: 1611792384. Throughput: 0: 46764.4. Samples: 1613080200. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 08:23:30,522][03784] Avg episode reward: [(0, '1.587')] [2024-03-21 08:23:35,434][04017] Updated weights for policy 0, policy_version 49196 (0.0016) [2024-03-21 08:23:35,521][03784] Fps is (10 sec: 45875.3, 60 sec: 47513.7, 300 sec: 45875.2). Total num frames: 1612054528. Throughput: 0: 46786.7. Samples: 1613361700. Policy #0 lag: (min: 0.0, avg: 43.0, max: 91.0) [2024-03-21 08:23:35,522][03784] Avg episode reward: [(0, '0.621')] [2024-03-21 08:23:40,521][03784] Fps is (10 sec: 42598.9, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1612218368. Throughput: 0: 46773.4. Samples: 1613619100. 
Policy #0 lag: (min: 0.0, avg: 28.7, max: 81.0) [2024-03-21 08:23:40,522][03784] Avg episode reward: [(0, '0.936')] [2024-03-21 08:23:44,432][04017] Updated weights for policy 0, policy_version 49206 (0.0015) [2024-03-21 08:23:45,521][03784] Fps is (10 sec: 36044.2, 60 sec: 45875.1, 300 sec: 45208.7). Total num frames: 1612414976. Throughput: 0: 47293.2. Samples: 1613764600. Policy #0 lag: (min: 0.0, avg: 28.7, max: 81.0) [2024-03-21 08:23:45,523][03784] Avg episode reward: [(0, '1.460')] [2024-03-21 08:23:50,521][03784] Fps is (10 sec: 32768.2, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1612546048. Throughput: 0: 47044.6. Samples: 1614030700. Policy #0 lag: (min: 0.0, avg: 28.7, max: 81.0) [2024-03-21 08:23:50,521][03784] Avg episode reward: [(0, '1.184')] [2024-03-21 08:23:52,687][04017] Updated weights for policy 0, policy_version 49216 (0.0015) [2024-03-21 08:23:55,521][03784] Fps is (10 sec: 52428.0, 60 sec: 46967.3, 300 sec: 45319.8). Total num frames: 1612939264. Throughput: 0: 46221.8. Samples: 1614280500. Policy #0 lag: (min: 0.0, avg: 28.7, max: 81.0) [2024-03-21 08:23:55,523][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 08:23:56,544][04017] Updated weights for policy 0, policy_version 49226 (0.0015) [2024-03-21 08:24:00,521][03784] Fps is (10 sec: 75365.3, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 1613299712. Throughput: 0: 46188.7. Samples: 1614407500. Policy #0 lag: (min: 0.0, avg: 28.7, max: 81.0) [2024-03-21 08:24:00,522][03784] Avg episode reward: [(0, '1.101')] [2024-03-21 08:24:00,693][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049235_1613332480.pth... [2024-03-21 08:24:00,815][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000048896_1602224128.pth [2024-03-21 08:24:02,192][03995] Signal inference workers to stop experience collection... (32500 times) [2024-03-21 08:24:02,192][03995] Signal inference workers to resume experience collection... 
(32500 times) [2024-03-21 08:24:02,196][04017] Updated weights for policy 0, policy_version 49236 (0.0020) [2024-03-21 08:24:02,262][04017] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-03-21 08:24:02,263][04017] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-03-21 08:24:05,521][03784] Fps is (10 sec: 62261.5, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 1613561856. Throughput: 0: 45622.3. Samples: 1614686600. Policy #0 lag: (min: 0.0, avg: 28.7, max: 81.0) [2024-03-21 08:24:05,522][03784] Avg episode reward: [(0, '1.599')] [2024-03-21 08:24:09,191][04017] Updated weights for policy 0, policy_version 49246 (0.0011) [2024-03-21 08:24:10,521][03784] Fps is (10 sec: 39321.9, 60 sec: 45875.1, 300 sec: 45653.0). Total num frames: 1613692928. Throughput: 0: 45668.9. Samples: 1614979600. Policy #0 lag: (min: 1.0, avg: 44.8, max: 82.0) [2024-03-21 08:24:10,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 08:24:15,521][03784] Fps is (10 sec: 36044.2, 60 sec: 43690.6, 300 sec: 45542.0). Total num frames: 1613922304. Throughput: 0: 45380.0. Samples: 1615122300. Policy #0 lag: (min: 1.0, avg: 44.8, max: 82.0) [2024-03-21 08:24:15,522][03784] Avg episode reward: [(0, '1.677')] [2024-03-21 08:24:16,549][04017] Updated weights for policy 0, policy_version 49256 (0.0012) [2024-03-21 08:24:20,521][03784] Fps is (10 sec: 52428.9, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 1614217216. Throughput: 0: 45484.4. Samples: 1615408500. Policy #0 lag: (min: 1.0, avg: 44.8, max: 82.0) [2024-03-21 08:24:20,522][03784] Avg episode reward: [(0, '1.513')] [2024-03-21 08:24:23,636][04017] Updated weights for policy 0, policy_version 49266 (0.0022) [2024-03-21 08:24:25,521][03784] Fps is (10 sec: 58982.0, 60 sec: 48605.7, 300 sec: 45653.1). Total num frames: 1614512128. Throughput: 0: 45908.7. Samples: 1615685000. 
Policy #0 lag: (min: 1.0, avg: 44.8, max: 82.0) [2024-03-21 08:24:25,522][03784] Avg episode reward: [(0, '1.513')] [2024-03-21 08:24:30,067][04017] Updated weights for policy 0, policy_version 49276 (0.0017) [2024-03-21 08:24:30,521][03784] Fps is (10 sec: 45875.1, 60 sec: 48059.8, 300 sec: 45097.7). Total num frames: 1614675968. Throughput: 0: 45602.3. Samples: 1615816700. Policy #0 lag: (min: 1.0, avg: 44.8, max: 82.0) [2024-03-21 08:24:30,522][03784] Avg episode reward: [(0, '0.814')] [2024-03-21 08:24:35,521][03784] Fps is (10 sec: 22937.9, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 1614741504. Throughput: 0: 45948.8. Samples: 1616098400. Policy #0 lag: (min: 1.0, avg: 44.8, max: 82.0) [2024-03-21 08:24:35,522][03784] Avg episode reward: [(0, '1.516')] [2024-03-21 08:24:40,521][03784] Fps is (10 sec: 19661.0, 60 sec: 44236.9, 300 sec: 45097.7). Total num frames: 1614872576. Throughput: 0: 46858.2. Samples: 1616389100. Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:24:40,521][03784] Avg episode reward: [(0, '1.068')] [2024-03-21 08:24:42,434][04017] Updated weights for policy 0, policy_version 49286 (0.0010) [2024-03-21 08:24:45,521][03784] Fps is (10 sec: 49152.0, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 1615233024. Throughput: 0: 46728.9. Samples: 1616510300. Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:24:45,522][03784] Avg episode reward: [(0, '0.919')] [2024-03-21 08:24:46,404][04017] Updated weights for policy 0, policy_version 49296 (0.0016) [2024-03-21 08:24:50,521][03784] Fps is (10 sec: 62258.3, 60 sec: 49151.9, 300 sec: 45653.0). Total num frames: 1615495168. Throughput: 0: 46408.7. Samples: 1616775000. 
Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:24:50,522][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 08:24:52,276][04017] Updated weights for policy 0, policy_version 49306 (0.0021) [2024-03-21 08:24:55,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.4, 300 sec: 45653.1). Total num frames: 1615691776. Throughput: 0: 46406.7. Samples: 1617067900. Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:24:55,522][03784] Avg episode reward: [(0, '1.220')] [2024-03-21 08:25:00,521][03784] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1615921152. Throughput: 0: 46191.1. Samples: 1617200900. Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:25:00,522][03784] Avg episode reward: [(0, '1.414')] [2024-03-21 08:25:01,540][04017] Updated weights for policy 0, policy_version 49316 (0.0012) [2024-03-21 08:25:02,143][03995] Signal inference workers to stop experience collection... (32550 times) [2024-03-21 08:25:02,206][04017] InferenceWorker_p0-w0: stopping experience collection (32550 times) [2024-03-21 08:25:02,218][03995] Signal inference workers to resume experience collection... (32550 times) [2024-03-21 08:25:02,257][04017] InferenceWorker_p0-w0: resuming experience collection (32550 times) [2024-03-21 08:25:05,521][03784] Fps is (10 sec: 55705.9, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1616248832. Throughput: 0: 46055.6. Samples: 1617481000. Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:25:05,522][03784] Avg episode reward: [(0, '1.439')] [2024-03-21 08:25:05,888][04017] Updated weights for policy 0, policy_version 49326 (0.0015) [2024-03-21 08:25:10,121][04017] Updated weights for policy 0, policy_version 49336 (0.0018) [2024-03-21 08:25:10,521][03784] Fps is (10 sec: 75366.9, 60 sec: 49698.1, 300 sec: 45986.3). Total num frames: 1616674816. Throughput: 0: 44671.3. Samples: 1617695200. 
Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:25:10,522][03784] Avg episode reward: [(0, '0.947')] [2024-03-21 08:25:15,521][03784] Fps is (10 sec: 42598.3, 60 sec: 45875.3, 300 sec: 44986.6). Total num frames: 1616674816. Throughput: 0: 45331.2. Samples: 1617856600. Policy #0 lag: (min: 0.0, avg: 36.1, max: 116.0) [2024-03-21 08:25:15,522][03784] Avg episode reward: [(0, '0.947')] [2024-03-21 08:25:20,521][03784] Fps is (10 sec: 19660.7, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1616871424. Throughput: 0: 45468.9. Samples: 1618144500. Policy #0 lag: (min: 0.0, avg: 43.7, max: 91.0) [2024-03-21 08:25:20,522][03784] Avg episode reward: [(0, '0.978')] [2024-03-21 08:25:22,844][04017] Updated weights for policy 0, policy_version 49346 (0.0011) [2024-03-21 08:25:25,522][03784] Fps is (10 sec: 32764.0, 60 sec: 41505.4, 300 sec: 45208.6). Total num frames: 1617002496. Throughput: 0: 44978.7. Samples: 1618413200. Policy #0 lag: (min: 0.0, avg: 43.7, max: 91.0) [2024-03-21 08:25:25,523][03784] Avg episode reward: [(0, '1.384')] [2024-03-21 08:25:30,521][03784] Fps is (10 sec: 36044.5, 60 sec: 42598.3, 300 sec: 45430.9). Total num frames: 1617231872. Throughput: 0: 44982.1. Samples: 1618534500. Policy #0 lag: (min: 0.0, avg: 43.7, max: 91.0) [2024-03-21 08:25:30,522][03784] Avg episode reward: [(0, '0.736')] [2024-03-21 08:25:31,148][04017] Updated weights for policy 0, policy_version 49356 (0.0011) [2024-03-21 08:25:35,521][03784] Fps is (10 sec: 45880.3, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1617461248. Throughput: 0: 45775.5. Samples: 1618834900. Policy #0 lag: (min: 0.0, avg: 43.7, max: 91.0) [2024-03-21 08:25:35,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 08:25:37,112][04017] Updated weights for policy 0, policy_version 49366 (0.0013) [2024-03-21 08:25:40,521][03784] Fps is (10 sec: 55706.3, 60 sec: 48605.8, 300 sec: 45986.3). Total num frames: 1617788928. Throughput: 0: 45215.5. Samples: 1619102600. 
Policy #0 lag: (min: 0.0, avg: 43.7, max: 91.0) [2024-03-21 08:25:40,522][03784] Avg episode reward: [(0, '0.769')] [2024-03-21 08:25:42,856][04017] Updated weights for policy 0, policy_version 49376 (0.0011) [2024-03-21 08:25:45,521][03784] Fps is (10 sec: 68813.3, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 1618149376. Throughput: 0: 45382.3. Samples: 1619243100. Policy #0 lag: (min: 0.0, avg: 43.7, max: 91.0) [2024-03-21 08:25:45,522][03784] Avg episode reward: [(0, '0.945')] [2024-03-21 08:25:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 45430.9). Total num frames: 1618247680. Throughput: 0: 44882.1. Samples: 1619500700. Policy #0 lag: (min: 1.0, avg: 45.8, max: 104.0) [2024-03-21 08:25:50,522][03784] Avg episode reward: [(0, '1.052')] [2024-03-21 08:25:51,520][03995] Signal inference workers to stop experience collection... (32600 times) [2024-03-21 08:25:51,521][03995] Signal inference workers to resume experience collection... (32600 times) [2024-03-21 08:25:51,525][04017] Updated weights for policy 0, policy_version 49386 (0.0023) [2024-03-21 08:25:51,565][04017] InferenceWorker_p0-w0: stopping experience collection (32600 times) [2024-03-21 08:25:51,571][04017] InferenceWorker_p0-w0: resuming experience collection (32600 times) [2024-03-21 08:25:55,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 1618509824. Throughput: 0: 46204.5. Samples: 1619774400. Policy #0 lag: (min: 1.0, avg: 45.8, max: 104.0) [2024-03-21 08:25:55,522][03784] Avg episode reward: [(0, '0.697')] [2024-03-21 08:25:56,554][04017] Updated weights for policy 0, policy_version 49396 (0.0015) [2024-03-21 08:26:00,521][03784] Fps is (10 sec: 36044.5, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1618608128. Throughput: 0: 45670.9. Samples: 1619911800. 
Policy #0 lag: (min: 1.0, avg: 45.8, max: 104.0) [2024-03-21 08:26:00,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 08:26:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049396_1618608128.pth... [2024-03-21 08:26:00,722][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049062_1607663616.pth [2024-03-21 08:26:05,521][03784] Fps is (10 sec: 36044.8, 60 sec: 43690.6, 300 sec: 45764.1). Total num frames: 1618870272. Throughput: 0: 45344.5. Samples: 1620185000. Policy #0 lag: (min: 1.0, avg: 45.8, max: 104.0) [2024-03-21 08:26:05,522][03784] Avg episode reward: [(0, '0.749')] [2024-03-21 08:26:07,905][04017] Updated weights for policy 0, policy_version 49406 (0.0015) [2024-03-21 08:26:10,521][03784] Fps is (10 sec: 45875.8, 60 sec: 39867.7, 300 sec: 45208.7). Total num frames: 1619066880. Throughput: 0: 45465.6. Samples: 1620459100. Policy #0 lag: (min: 1.0, avg: 45.8, max: 104.0) [2024-03-21 08:26:10,522][03784] Avg episode reward: [(0, '1.497')] [2024-03-21 08:26:15,193][04017] Updated weights for policy 0, policy_version 49416 (0.0011) [2024-03-21 08:26:15,521][03784] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 45542.0). Total num frames: 1619296256. Throughput: 0: 45882.3. Samples: 1620599200. Policy #0 lag: (min: 1.0, avg: 45.8, max: 104.0) [2024-03-21 08:26:15,522][03784] Avg episode reward: [(0, '1.393')] [2024-03-21 08:26:20,521][03784] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1619492864. Throughput: 0: 45057.7. Samples: 1620862500. Policy #0 lag: (min: 2.0, avg: 32.0, max: 76.0) [2024-03-21 08:26:20,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 08:26:21,681][04017] Updated weights for policy 0, policy_version 49426 (0.0017) [2024-03-21 08:26:25,521][03784] Fps is (10 sec: 58982.4, 60 sec: 48060.7, 300 sec: 45764.1). Total num frames: 1619886080. Throughput: 0: 44868.9. Samples: 1621121700. 
Policy #0 lag: (min: 2.0, avg: 32.0, max: 76.0) [2024-03-21 08:26:25,522][03784] Avg episode reward: [(0, '1.109')] [2024-03-21 08:26:25,832][04017] Updated weights for policy 0, policy_version 49436 (0.0025) [2024-03-21 08:26:30,521][03784] Fps is (10 sec: 62260.8, 60 sec: 48059.9, 300 sec: 45208.7). Total num frames: 1620115456. Throughput: 0: 44955.7. Samples: 1621266100. Policy #0 lag: (min: 2.0, avg: 32.0, max: 76.0) [2024-03-21 08:26:30,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 08:26:32,110][04017] Updated weights for policy 0, policy_version 49446 (0.0013) [2024-03-21 08:26:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 45653.0). Total num frames: 1620410368. Throughput: 0: 45593.4. Samples: 1621552400. Policy #0 lag: (min: 2.0, avg: 32.0, max: 76.0) [2024-03-21 08:26:35,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 08:26:40,521][03784] Fps is (10 sec: 42597.9, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1620541440. Throughput: 0: 45926.6. Samples: 1621841100. Policy #0 lag: (min: 2.0, avg: 32.0, max: 76.0) [2024-03-21 08:26:40,522][03784] Avg episode reward: [(0, '1.693')] [2024-03-21 08:26:40,546][04017] Updated weights for policy 0, policy_version 49456 (0.0015) [2024-03-21 08:26:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 45764.1). Total num frames: 1620770816. Throughput: 0: 46068.9. Samples: 1621984900. Policy #0 lag: (min: 2.0, avg: 32.0, max: 76.0) [2024-03-21 08:26:45,522][03784] Avg episode reward: [(0, '0.866')] [2024-03-21 08:26:47,328][04017] Updated weights for policy 0, policy_version 49466 (0.0013) [2024-03-21 08:26:47,586][03995] Signal inference workers to stop experience collection... (32650 times) [2024-03-21 08:26:47,651][03995] Signal inference workers to resume experience collection... 
(32650 times) [2024-03-21 08:26:47,652][04017] InferenceWorker_p0-w0: stopping experience collection (32650 times) [2024-03-21 08:26:47,701][04017] InferenceWorker_p0-w0: resuming experience collection (32650 times) [2024-03-21 08:26:50,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1621032960. Throughput: 0: 45902.1. Samples: 1622250600. Policy #0 lag: (min: 1.0, avg: 60.8, max: 118.0) [2024-03-21 08:26:50,522][03784] Avg episode reward: [(0, '0.966')] [2024-03-21 08:26:54,592][04017] Updated weights for policy 0, policy_version 49476 (0.0011) [2024-03-21 08:26:55,521][03784] Fps is (10 sec: 45874.8, 60 sec: 45328.9, 300 sec: 45986.2). Total num frames: 1621229568. Throughput: 0: 45617.6. Samples: 1622511900. Policy #0 lag: (min: 1.0, avg: 60.8, max: 118.0) [2024-03-21 08:26:55,523][03784] Avg episode reward: [(0, '1.696')] [2024-03-21 08:27:00,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.3, 300 sec: 45875.2). Total num frames: 1621360640. Throughput: 0: 45642.2. Samples: 1622653100. Policy #0 lag: (min: 1.0, avg: 60.8, max: 118.0) [2024-03-21 08:27:00,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 08:27:05,521][03784] Fps is (10 sec: 29492.1, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1621524480. Throughput: 0: 45246.9. Samples: 1622898600. Policy #0 lag: (min: 1.0, avg: 60.8, max: 118.0) [2024-03-21 08:27:05,522][03784] Avg episode reward: [(0, '0.668')] [2024-03-21 08:27:06,897][04017] Updated weights for policy 0, policy_version 49486 (0.0016) [2024-03-21 08:27:10,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1621753856. Throughput: 0: 45064.4. Samples: 1623149600. 
Policy #0 lag: (min: 1.0, avg: 60.8, max: 118.0) [2024-03-21 08:27:10,522][03784] Avg episode reward: [(0, '1.244')] [2024-03-21 08:27:15,390][04017] Updated weights for policy 0, policy_version 49496 (0.0016) [2024-03-21 08:27:15,521][03784] Fps is (10 sec: 36044.6, 60 sec: 43144.6, 300 sec: 44764.4). Total num frames: 1621884928. Throughput: 0: 44873.3. Samples: 1623285400. Policy #0 lag: (min: 1.0, avg: 60.8, max: 118.0) [2024-03-21 08:27:15,522][03784] Avg episode reward: [(0, '0.644')] [2024-03-21 08:27:20,521][03784] Fps is (10 sec: 32768.2, 60 sec: 43144.7, 300 sec: 44764.4). Total num frames: 1622081536. Throughput: 0: 44402.3. Samples: 1623550500. Policy #0 lag: (min: 1.0, avg: 60.8, max: 118.0) [2024-03-21 08:27:20,522][03784] Avg episode reward: [(0, '1.030')] [2024-03-21 08:27:22,028][04017] Updated weights for policy 0, policy_version 49506 (0.0017) [2024-03-21 08:27:25,521][03784] Fps is (10 sec: 55705.0, 60 sec: 42598.4, 300 sec: 45764.1). Total num frames: 1622441984. Throughput: 0: 43620.0. Samples: 1623804000. Policy #0 lag: (min: 1.0, avg: 25.6, max: 51.0) [2024-03-21 08:27:25,522][03784] Avg episode reward: [(0, '1.430')] [2024-03-21 08:27:28,241][04017] Updated weights for policy 0, policy_version 49516 (0.0023) [2024-03-21 08:27:30,521][03784] Fps is (10 sec: 62258.7, 60 sec: 43144.4, 300 sec: 45764.1). Total num frames: 1622704128. Throughput: 0: 43922.3. Samples: 1623961400. Policy #0 lag: (min: 1.0, avg: 25.6, max: 51.0) [2024-03-21 08:27:30,522][03784] Avg episode reward: [(0, '1.359')] [2024-03-21 08:27:34,171][04017] Updated weights for policy 0, policy_version 49526 (0.0010) [2024-03-21 08:27:35,521][03784] Fps is (10 sec: 45875.2, 60 sec: 41506.1, 300 sec: 45542.0). Total num frames: 1622900736. Throughput: 0: 44095.6. Samples: 1624234900. 
Policy #0 lag: (min: 1.0, avg: 25.6, max: 51.0) [2024-03-21 08:27:35,522][03784] Avg episode reward: [(0, '1.359')] [2024-03-21 08:27:39,118][04017] Updated weights for policy 0, policy_version 49536 (0.0020) [2024-03-21 08:27:40,521][03784] Fps is (10 sec: 62258.7, 60 sec: 46421.2, 300 sec: 46319.5). Total num frames: 1623326720. Throughput: 0: 43655.6. Samples: 1624476400. Policy #0 lag: (min: 1.0, avg: 25.6, max: 51.0) [2024-03-21 08:27:40,523][03784] Avg episode reward: [(0, '0.803')] [2024-03-21 08:27:42,482][03995] Signal inference workers to stop experience collection... (32700 times) [2024-03-21 08:27:42,536][04017] InferenceWorker_p0-w0: stopping experience collection (32700 times) [2024-03-21 08:27:42,782][03995] Signal inference workers to resume experience collection... (32700 times) [2024-03-21 08:27:42,782][04017] InferenceWorker_p0-w0: resuming experience collection (32700 times) [2024-03-21 08:27:44,393][04017] Updated weights for policy 0, policy_version 49546 (0.0015) [2024-03-21 08:27:45,521][03784] Fps is (10 sec: 68813.2, 60 sec: 46967.6, 300 sec: 46763.8). Total num frames: 1623588864. Throughput: 0: 43420.1. Samples: 1624607000. Policy #0 lag: (min: 1.0, avg: 25.6, max: 51.0) [2024-03-21 08:27:45,522][03784] Avg episode reward: [(0, '0.754')] [2024-03-21 08:27:50,521][03784] Fps is (10 sec: 45875.7, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1623785472. Throughput: 0: 43873.2. Samples: 1624872900. Policy #0 lag: (min: 1.0, avg: 25.6, max: 51.0) [2024-03-21 08:27:50,522][03784] Avg episode reward: [(0, '1.437')] [2024-03-21 08:27:51,411][04017] Updated weights for policy 0, policy_version 49556 (0.0016) [2024-03-21 08:27:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45329.2, 300 sec: 45430.9). Total num frames: 1623949312. Throughput: 0: 44844.5. Samples: 1625167600. 
Policy #0 lag: (min: 0.0, avg: 38.4, max: 74.0) [2024-03-21 08:27:55,522][03784] Avg episode reward: [(0, '1.437')] [2024-03-21 08:28:00,521][03784] Fps is (10 sec: 32768.1, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1624113152. Throughput: 0: 44899.9. Samples: 1625305900. Policy #0 lag: (min: 0.0, avg: 38.4, max: 74.0) [2024-03-21 08:28:00,522][03784] Avg episode reward: [(0, '1.635')] [2024-03-21 08:28:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049564_1624113152.pth... [2024-03-21 08:28:00,651][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049235_1613332480.pth [2024-03-21 08:28:04,320][04017] Updated weights for policy 0, policy_version 49566 (0.0016) [2024-03-21 08:28:05,521][03784] Fps is (10 sec: 29491.1, 60 sec: 45329.0, 300 sec: 45097.6). Total num frames: 1624244224. Throughput: 0: 45420.0. Samples: 1625594400. Policy #0 lag: (min: 0.0, avg: 38.4, max: 74.0) [2024-03-21 08:28:05,523][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 08:28:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1624408064. Throughput: 0: 45882.2. Samples: 1625868700. Policy #0 lag: (min: 0.0, avg: 38.4, max: 74.0) [2024-03-21 08:28:10,522][03784] Avg episode reward: [(0, '0.719')] [2024-03-21 08:28:11,272][04017] Updated weights for policy 0, policy_version 49576 (0.0021) [2024-03-21 08:28:15,521][03784] Fps is (10 sec: 45875.3, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 1624702976. Throughput: 0: 44980.1. Samples: 1625985500. Policy #0 lag: (min: 0.0, avg: 38.4, max: 74.0) [2024-03-21 08:28:15,522][03784] Avg episode reward: [(0, '1.234')] [2024-03-21 08:28:17,372][04017] Updated weights for policy 0, policy_version 49586 (0.0011) [2024-03-21 08:28:20,521][03784] Fps is (10 sec: 62259.5, 60 sec: 49152.0, 300 sec: 45542.0). Total num frames: 1625030656. Throughput: 0: 44957.8. Samples: 1626258000. 
Policy #0 lag: (min: 0.0, avg: 38.4, max: 74.0) [2024-03-21 08:28:20,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 08:28:25,250][04017] Updated weights for policy 0, policy_version 49596 (0.0021) [2024-03-21 08:28:25,521][03784] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1625161728. Throughput: 0: 46229.1. Samples: 1626556700. Policy #0 lag: (min: 1.0, avg: 28.6, max: 59.0) [2024-03-21 08:28:25,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 08:28:30,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1625423872. Throughput: 0: 46073.3. Samples: 1626680300. Policy #0 lag: (min: 1.0, avg: 28.6, max: 59.0) [2024-03-21 08:28:30,522][03784] Avg episode reward: [(0, '1.517')] [2024-03-21 08:28:31,669][04017] Updated weights for policy 0, policy_version 49606 (0.0012) [2024-03-21 08:28:35,521][03784] Fps is (10 sec: 58982.1, 60 sec: 47513.6, 300 sec: 45875.2). Total num frames: 1625751552. Throughput: 0: 45644.5. Samples: 1626926900. Policy #0 lag: (min: 1.0, avg: 28.6, max: 59.0) [2024-03-21 08:28:35,522][03784] Avg episode reward: [(0, '0.444')] [2024-03-21 08:28:37,597][04017] Updated weights for policy 0, policy_version 49616 (0.0012) [2024-03-21 08:28:40,521][03784] Fps is (10 sec: 45876.4, 60 sec: 42598.6, 300 sec: 45653.1). Total num frames: 1625882624. Throughput: 0: 45433.5. Samples: 1627212100. Policy #0 lag: (min: 1.0, avg: 28.6, max: 59.0) [2024-03-21 08:28:40,521][03784] Avg episode reward: [(0, '0.441')] [2024-03-21 08:28:45,521][03784] Fps is (10 sec: 26214.2, 60 sec: 40413.8, 300 sec: 45653.0). Total num frames: 1626013696. Throughput: 0: 45517.7. Samples: 1627354200. Policy #0 lag: (min: 1.0, avg: 28.6, max: 59.0) [2024-03-21 08:28:45,522][03784] Avg episode reward: [(0, '1.077')] [2024-03-21 08:28:46,121][03995] Signal inference workers to stop experience collection... 
(32750 times) [2024-03-21 08:28:46,250][04017] InferenceWorker_p0-w0: stopping experience collection (32750 times) [2024-03-21 08:28:46,335][03995] Signal inference workers to resume experience collection... (32750 times) [2024-03-21 08:28:46,336][04017] InferenceWorker_p0-w0: resuming experience collection (32750 times) [2024-03-21 08:28:46,701][04017] Updated weights for policy 0, policy_version 49626 (0.0016) [2024-03-21 08:28:50,521][03784] Fps is (10 sec: 32767.4, 60 sec: 40413.9, 300 sec: 44986.6). Total num frames: 1626210304. Throughput: 0: 45715.6. Samples: 1627651600. Policy #0 lag: (min: 1.0, avg: 28.6, max: 59.0) [2024-03-21 08:28:50,522][03784] Avg episode reward: [(0, '1.223')] [2024-03-21 08:28:55,170][04017] Updated weights for policy 0, policy_version 49636 (0.0018) [2024-03-21 08:28:55,521][03784] Fps is (10 sec: 49152.3, 60 sec: 42598.4, 300 sec: 44764.4). Total num frames: 1626505216. Throughput: 0: 45373.4. Samples: 1627910500. Policy #0 lag: (min: 1.0, avg: 43.6, max: 87.0) [2024-03-21 08:28:55,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 08:28:58,436][04017] Updated weights for policy 0, policy_version 49646 (0.0018) [2024-03-21 08:29:00,521][03784] Fps is (10 sec: 72089.7, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1626931200. Throughput: 0: 45515.5. Samples: 1628033700. Policy #0 lag: (min: 1.0, avg: 43.6, max: 87.0) [2024-03-21 08:29:00,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 08:29:03,073][04017] Updated weights for policy 0, policy_version 49656 (0.0015) [2024-03-21 08:29:05,521][03784] Fps is (10 sec: 68812.9, 60 sec: 49152.0, 300 sec: 45764.1). Total num frames: 1627193344. Throughput: 0: 45533.3. Samples: 1628307000. Policy #0 lag: (min: 1.0, avg: 43.6, max: 87.0) [2024-03-21 08:29:05,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 08:29:10,521][03784] Fps is (10 sec: 39321.3, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 1627324416. Throughput: 0: 45402.1. 
Samples: 1628599800. Policy #0 lag: (min: 1.0, avg: 43.6, max: 87.0) [2024-03-21 08:29:10,522][03784] Avg episode reward: [(0, '0.935')] [2024-03-21 08:29:12,602][04017] Updated weights for policy 0, policy_version 49666 (0.0011) [2024-03-21 08:29:15,521][03784] Fps is (10 sec: 36044.3, 60 sec: 47513.5, 300 sec: 45208.7). Total num frames: 1627553792. Throughput: 0: 45782.2. Samples: 1628740500. Policy #0 lag: (min: 1.0, avg: 43.6, max: 87.0) [2024-03-21 08:29:15,522][03784] Avg episode reward: [(0, '1.699')] [2024-03-21 08:29:19,577][04017] Updated weights for policy 0, policy_version 49676 (0.0016) [2024-03-21 08:29:20,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1627783168. Throughput: 0: 46877.8. Samples: 1629036400. Policy #0 lag: (min: 1.0, avg: 43.6, max: 87.0) [2024-03-21 08:29:20,522][03784] Avg episode reward: [(0, '1.699')] [2024-03-21 08:29:25,521][03784] Fps is (10 sec: 49152.6, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1628045312. Throughput: 0: 46553.1. Samples: 1629307000. Policy #0 lag: (min: 1.0, avg: 43.6, max: 87.0) [2024-03-21 08:29:25,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 08:29:26,380][04017] Updated weights for policy 0, policy_version 49686 (0.0017) [2024-03-21 08:29:28,077][03995] Signal inference workers to stop experience collection... (32800 times) [2024-03-21 08:29:28,118][04017] InferenceWorker_p0-w0: stopping experience collection (32800 times) [2024-03-21 08:29:28,335][03995] Signal inference workers to resume experience collection... (32800 times) [2024-03-21 08:29:28,335][04017] InferenceWorker_p0-w0: resuming experience collection (32800 times) [2024-03-21 08:29:30,521][03784] Fps is (10 sec: 55705.3, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 1628340224. Throughput: 0: 46368.9. Samples: 1629440800. 
Policy #0 lag: (min: 0.0, avg: 42.9, max: 118.0) [2024-03-21 08:29:30,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 08:29:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1628405760. Throughput: 0: 46411.1. Samples: 1629740100. Policy #0 lag: (min: 0.0, avg: 42.9, max: 118.0) [2024-03-21 08:29:35,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 08:29:35,793][04017] Updated weights for policy 0, policy_version 49696 (0.0015) [2024-03-21 08:29:40,521][03784] Fps is (10 sec: 26214.5, 60 sec: 45328.9, 300 sec: 45319.8). Total num frames: 1628602368. Throughput: 0: 47091.1. Samples: 1630029600. Policy #0 lag: (min: 0.0, avg: 42.9, max: 118.0) [2024-03-21 08:29:40,522][03784] Avg episode reward: [(0, '0.994')] [2024-03-21 08:29:42,678][04017] Updated weights for policy 0, policy_version 49706 (0.0016) [2024-03-21 08:29:45,521][03784] Fps is (10 sec: 52428.9, 60 sec: 48605.9, 300 sec: 45542.0). Total num frames: 1628930048. Throughput: 0: 47348.9. Samples: 1630164400. Policy #0 lag: (min: 0.0, avg: 42.9, max: 118.0) [2024-03-21 08:29:45,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 08:29:49,712][04017] Updated weights for policy 0, policy_version 49716 (0.0017) [2024-03-21 08:29:50,521][03784] Fps is (10 sec: 55705.9, 60 sec: 49152.0, 300 sec: 45653.0). Total num frames: 1629159424. Throughput: 0: 47264.5. Samples: 1630433900. Policy #0 lag: (min: 0.0, avg: 42.9, max: 118.0) [2024-03-21 08:29:50,522][03784] Avg episode reward: [(0, '1.041')] [2024-03-21 08:29:55,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.4, 300 sec: 45430.9). Total num frames: 1629323264. Throughput: 0: 47153.3. Samples: 1630721700. 
Policy #0 lag: (min: 0.0, avg: 42.9, max: 118.0) [2024-03-21 08:29:55,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 08:29:59,759][04017] Updated weights for policy 0, policy_version 49726 (0.0014) [2024-03-21 08:30:00,521][03784] Fps is (10 sec: 32767.6, 60 sec: 42598.3, 300 sec: 44875.5). Total num frames: 1629487104. Throughput: 0: 47577.8. Samples: 1630881500. Policy #0 lag: (min: 0.0, avg: 41.7, max: 102.0) [2024-03-21 08:30:00,522][03784] Avg episode reward: [(0, '0.904')] [2024-03-21 08:30:00,808][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049729_1629519872.pth... [2024-03-21 08:30:00,868][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049396_1618608128.pth [2024-03-21 08:30:03,185][04017] Updated weights for policy 0, policy_version 49736 (0.0010) [2024-03-21 08:30:05,521][03784] Fps is (10 sec: 55706.2, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1629880320. Throughput: 0: 46086.7. Samples: 1631110300. Policy #0 lag: (min: 0.0, avg: 41.7, max: 102.0) [2024-03-21 08:30:05,522][03784] Avg episode reward: [(0, '1.532')] [2024-03-21 08:30:09,726][04017] Updated weights for policy 0, policy_version 49746 (0.0016) [2024-03-21 08:30:10,521][03784] Fps is (10 sec: 65536.6, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1630142464. Throughput: 0: 46393.3. Samples: 1631394700. Policy #0 lag: (min: 0.0, avg: 41.7, max: 102.0) [2024-03-21 08:30:10,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 08:30:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.5, 300 sec: 45653.1). Total num frames: 1630339072. Throughput: 0: 46360.1. Samples: 1631527000. Policy #0 lag: (min: 0.0, avg: 41.7, max: 102.0) [2024-03-21 08:30:15,522][03784] Avg episode reward: [(0, '0.677')] [2024-03-21 08:30:15,979][04017] Updated weights for policy 0, policy_version 49756 (0.0029) [2024-03-21 08:30:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 46097.5). 
Total num frames: 1630601216. Throughput: 0: 45666.7. Samples: 1631795100. Policy #0 lag: (min: 0.0, avg: 41.7, max: 102.0) [2024-03-21 08:30:20,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 08:30:24,461][04017] Updated weights for policy 0, policy_version 49766 (0.0012) [2024-03-21 08:30:24,502][03995] Signal inference workers to stop experience collection... (32850 times) [2024-03-21 08:30:24,503][03995] Signal inference workers to resume experience collection... (32850 times) [2024-03-21 08:30:24,542][04017] InferenceWorker_p0-w0: stopping experience collection (32850 times) [2024-03-21 08:30:24,543][04017] InferenceWorker_p0-w0: resuming experience collection (32850 times) [2024-03-21 08:30:25,521][03784] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 1630765056. Throughput: 0: 45588.9. Samples: 1632081100. Policy #0 lag: (min: 0.0, avg: 41.7, max: 102.0) [2024-03-21 08:30:25,522][03784] Avg episode reward: [(0, '1.098')] [2024-03-21 08:30:30,521][03784] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 45986.3). Total num frames: 1631027200. Throughput: 0: 45817.9. Samples: 1632226200. Policy #0 lag: (min: 1.0, avg: 41.7, max: 103.0) [2024-03-21 08:30:30,522][03784] Avg episode reward: [(0, '1.352')] [2024-03-21 08:30:31,301][04017] Updated weights for policy 0, policy_version 49776 (0.0027) [2024-03-21 08:30:35,521][03784] Fps is (10 sec: 52428.3, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 1631289344. Throughput: 0: 45933.2. Samples: 1632500900. Policy #0 lag: (min: 1.0, avg: 41.7, max: 103.0) [2024-03-21 08:30:35,522][03784] Avg episode reward: [(0, '0.604')] [2024-03-21 08:30:37,094][04017] Updated weights for policy 0, policy_version 49786 (0.0022) [2024-03-21 08:30:40,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 1631420416. Throughput: 0: 44897.9. Samples: 1632742100. 
Policy #0 lag: (min: 1.0, avg: 41.7, max: 103.0) [2024-03-21 08:30:40,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 08:30:45,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1631617024. Throughput: 0: 44135.7. Samples: 1632867600. Policy #0 lag: (min: 1.0, avg: 41.7, max: 103.0) [2024-03-21 08:30:45,522][03784] Avg episode reward: [(0, '1.250')] [2024-03-21 08:30:48,668][04017] Updated weights for policy 0, policy_version 49796 (0.0014) [2024-03-21 08:30:50,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 1631846400. Throughput: 0: 45088.8. Samples: 1633139300. Policy #0 lag: (min: 1.0, avg: 41.7, max: 103.0) [2024-03-21 08:30:50,522][03784] Avg episode reward: [(0, '1.023')] [2024-03-21 08:30:55,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44783.1, 300 sec: 45430.9). Total num frames: 1632010240. Throughput: 0: 45453.4. Samples: 1633440100. Policy #0 lag: (min: 1.0, avg: 41.7, max: 103.0) [2024-03-21 08:30:55,521][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 08:30:55,575][04017] Updated weights for policy 0, policy_version 49806 (0.0021) [2024-03-21 08:31:00,521][03784] Fps is (10 sec: 42597.8, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 1632272384. Throughput: 0: 45304.2. Samples: 1633565700. Policy #0 lag: (min: 1.0, avg: 41.7, max: 103.0) [2024-03-21 08:31:00,522][03784] Avg episode reward: [(0, '1.705')] [2024-03-21 08:31:02,979][04017] Updated weights for policy 0, policy_version 49816 (0.0015) [2024-03-21 08:31:05,521][03784] Fps is (10 sec: 52428.1, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 1632534528. Throughput: 0: 45168.9. Samples: 1633827700. 
Policy #0 lag: (min: 0.0, avg: 51.0, max: 119.0) [2024-03-21 08:31:05,522][03784] Avg episode reward: [(0, '1.235')] [2024-03-21 08:31:08,738][04017] Updated weights for policy 0, policy_version 49826 (0.0011) [2024-03-21 08:31:10,521][03784] Fps is (10 sec: 55706.8, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 1632829440. Throughput: 0: 44335.6. Samples: 1634076200. Policy #0 lag: (min: 0.0, avg: 51.0, max: 119.0) [2024-03-21 08:31:10,522][03784] Avg episode reward: [(0, '0.874')] [2024-03-21 08:31:15,474][04017] Updated weights for policy 0, policy_version 49836 (0.0012) [2024-03-21 08:31:15,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1633026048. Throughput: 0: 44257.7. Samples: 1634217800. Policy #0 lag: (min: 0.0, avg: 51.0, max: 119.0) [2024-03-21 08:31:15,522][03784] Avg episode reward: [(0, '0.794')] [2024-03-21 08:31:20,521][03784] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1633288192. Throughput: 0: 43757.8. Samples: 1634470000. Policy #0 lag: (min: 0.0, avg: 51.0, max: 119.0) [2024-03-21 08:31:20,522][03784] Avg episode reward: [(0, '1.369')] [2024-03-21 08:31:21,597][04017] Updated weights for policy 0, policy_version 49846 (0.0011) [2024-03-21 08:31:22,914][03995] Signal inference workers to stop experience collection... (32900 times) [2024-03-21 08:31:22,986][03995] Signal inference workers to resume experience collection... (32900 times) [2024-03-21 08:31:23,000][04017] InferenceWorker_p0-w0: stopping experience collection (32900 times) [2024-03-21 08:31:23,045][04017] InferenceWorker_p0-w0: resuming experience collection (32900 times) [2024-03-21 08:31:25,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.7, 300 sec: 45097.6). Total num frames: 1633419264. Throughput: 0: 44308.8. Samples: 1634736000. 
Policy #0 lag: (min: 0.0, avg: 51.0, max: 119.0) [2024-03-21 08:31:25,522][03784] Avg episode reward: [(0, '0.977')] [2024-03-21 08:31:30,521][03784] Fps is (10 sec: 32768.0, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 1633615872. Throughput: 0: 44295.5. Samples: 1634860900. Policy #0 lag: (min: 0.0, avg: 51.0, max: 119.0) [2024-03-21 08:31:30,522][03784] Avg episode reward: [(0, '1.466')] [2024-03-21 08:31:33,770][04017] Updated weights for policy 0, policy_version 49856 (0.0012) [2024-03-21 08:31:35,521][03784] Fps is (10 sec: 29491.2, 60 sec: 40413.9, 300 sec: 44653.3). Total num frames: 1633714176. Throughput: 0: 44555.5. Samples: 1635144300. Policy #0 lag: (min: 0.0, avg: 37.7, max: 103.0) [2024-03-21 08:31:35,522][03784] Avg episode reward: [(0, '0.727')] [2024-03-21 08:31:38,949][04017] Updated weights for policy 0, policy_version 49866 (0.0017) [2024-03-21 08:31:40,521][03784] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1634140160. Throughput: 0: 43742.1. Samples: 1635408500. Policy #0 lag: (min: 0.0, avg: 37.7, max: 103.0) [2024-03-21 08:31:40,522][03784] Avg episode reward: [(0, '0.852')] [2024-03-21 08:31:42,317][04017] Updated weights for policy 0, policy_version 49876 (0.0012) [2024-03-21 08:31:45,521][03784] Fps is (10 sec: 68813.4, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1634402304. Throughput: 0: 43722.4. Samples: 1635533200. Policy #0 lag: (min: 0.0, avg: 37.7, max: 103.0) [2024-03-21 08:31:45,522][03784] Avg episode reward: [(0, '1.339')] [2024-03-21 08:31:50,521][03784] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 1634631680. Throughput: 0: 44128.8. Samples: 1635813500. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 103.0) [2024-03-21 08:31:50,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 08:31:50,953][04017] Updated weights for policy 0, policy_version 49886 (0.0011) [2024-03-21 08:31:55,521][03784] Fps is (10 sec: 36044.4, 60 sec: 45875.0, 300 sec: 45430.9). Total num frames: 1634762752. Throughput: 0: 44491.0. Samples: 1636078300. Policy #0 lag: (min: 0.0, avg: 37.7, max: 103.0) [2024-03-21 08:31:55,522][03784] Avg episode reward: [(0, '1.355')] [2024-03-21 08:31:59,620][04017] Updated weights for policy 0, policy_version 49896 (0.0014) [2024-03-21 08:32:00,521][03784] Fps is (10 sec: 39321.3, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1635024896. Throughput: 0: 44059.8. Samples: 1636200500. Policy #0 lag: (min: 0.0, avg: 37.7, max: 103.0) [2024-03-21 08:32:00,522][03784] Avg episode reward: [(0, '1.680')] [2024-03-21 08:32:00,876][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049898_1635057664.pth... [2024-03-21 08:32:01,014][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049564_1624113152.pth [2024-03-21 08:32:05,521][03784] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1635221504. Throughput: 0: 44231.1. Samples: 1636460400. Policy #0 lag: (min: 0.0, avg: 37.7, max: 103.0) [2024-03-21 08:32:05,522][03784] Avg episode reward: [(0, '1.092')] [2024-03-21 08:32:10,145][04017] Updated weights for policy 0, policy_version 49906 (0.0015) [2024-03-21 08:32:10,521][03784] Fps is (10 sec: 29491.1, 60 sec: 41506.0, 300 sec: 45541.9). Total num frames: 1635319808. Throughput: 0: 43922.1. Samples: 1636712500. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 08:32:10,522][03784] Avg episode reward: [(0, '0.976')] [2024-03-21 08:32:15,521][03784] Fps is (10 sec: 22937.6, 60 sec: 40413.8, 300 sec: 45319.8). Total num frames: 1635450880. Throughput: 0: 44260.0. Samples: 1636852600. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 08:32:15,522][03784] Avg episode reward: [(0, '1.332')] [2024-03-21 08:32:19,840][04017] Updated weights for policy 0, policy_version 49916 (0.0012) [2024-03-21 08:32:20,521][03784] Fps is (10 sec: 39322.2, 60 sec: 40413.9, 300 sec: 44986.6). Total num frames: 1635713024. Throughput: 0: 44260.0. Samples: 1637136000. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 08:32:20,522][03784] Avg episode reward: [(0, '1.146')] [2024-03-21 08:32:24,883][04017] Updated weights for policy 0, policy_version 49926 (0.0012) [2024-03-21 08:32:25,521][03784] Fps is (10 sec: 52428.4, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 1635975168. Throughput: 0: 44275.5. Samples: 1637400900. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 08:32:25,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 08:32:25,979][03995] Signal inference workers to stop experience collection... (32950 times) [2024-03-21 08:32:25,980][03995] Signal inference workers to resume experience collection... (32950 times) [2024-03-21 08:32:26,028][04017] InferenceWorker_p0-w0: stopping experience collection (32950 times) [2024-03-21 08:32:26,035][04017] InferenceWorker_p0-w0: resuming experience collection (32950 times) [2024-03-21 08:32:30,521][03784] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1636237312. Throughput: 0: 44497.8. Samples: 1637535600. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 08:32:30,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 08:32:31,351][04017] Updated weights for policy 0, policy_version 49936 (0.0011) [2024-03-21 08:32:35,521][03784] Fps is (10 sec: 52429.9, 60 sec: 46421.5, 300 sec: 44653.4). Total num frames: 1636499456. Throughput: 0: 44531.3. Samples: 1637817400. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 08:32:35,522][03784] Avg episode reward: [(0, '0.731')] [2024-03-21 08:32:38,547][04017] Updated weights for policy 0, policy_version 49947 (0.0016) [2024-03-21 08:32:40,521][03784] Fps is (10 sec: 55704.4, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 1636794368. Throughput: 0: 45082.1. Samples: 1638107000. Policy #0 lag: (min: 0.0, avg: 29.4, max: 72.0) [2024-03-21 08:32:40,524][03784] Avg episode reward: [(0, '1.409')] [2024-03-21 08:32:42,538][04017] Updated weights for policy 0, policy_version 49957 (0.0024) [2024-03-21 08:32:45,521][03784] Fps is (10 sec: 75365.9, 60 sec: 47513.6, 300 sec: 45653.1). Total num frames: 1637253120. Throughput: 0: 45033.5. Samples: 1638227000. Policy #0 lag: (min: 1.0, avg: 39.7, max: 77.0) [2024-03-21 08:32:45,522][03784] Avg episode reward: [(0, '1.525')] [2024-03-21 08:32:47,063][04017] Updated weights for policy 0, policy_version 49967 (0.0021) [2024-03-21 08:32:50,521][03784] Fps is (10 sec: 62260.7, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 1637416960. Throughput: 0: 45686.7. Samples: 1638516300. Policy #0 lag: (min: 1.0, avg: 39.7, max: 77.0) [2024-03-21 08:32:50,522][03784] Avg episode reward: [(0, '1.525')] [2024-03-21 08:32:55,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46967.6, 300 sec: 45653.1). Total num frames: 1637580800. Throughput: 0: 46655.8. Samples: 1638812000. Policy #0 lag: (min: 1.0, avg: 39.7, max: 77.0) [2024-03-21 08:32:55,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 08:32:57,672][04017] Updated weights for policy 0, policy_version 49977 (0.0012) [2024-03-21 08:33:00,521][03784] Fps is (10 sec: 32767.4, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1637744640. Throughput: 0: 46768.8. Samples: 1638957200. 
Policy #0 lag: (min: 1.0, avg: 39.7, max: 77.0) [2024-03-21 08:33:00,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 08:33:05,353][04017] Updated weights for policy 0, policy_version 49987 (0.0019) [2024-03-21 08:33:05,521][03784] Fps is (10 sec: 39320.8, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 1637974016. Throughput: 0: 47059.9. Samples: 1639253700. Policy #0 lag: (min: 1.0, avg: 39.7, max: 77.0) [2024-03-21 08:33:05,522][03784] Avg episode reward: [(0, '1.399')] [2024-03-21 08:33:10,521][03784] Fps is (10 sec: 42598.7, 60 sec: 47513.7, 300 sec: 45653.0). Total num frames: 1638170624. Throughput: 0: 47024.5. Samples: 1639517000. Policy #0 lag: (min: 1.0, avg: 39.7, max: 77.0) [2024-03-21 08:33:10,522][03784] Avg episode reward: [(0, '1.399')] [2024-03-21 08:33:12,791][04017] Updated weights for policy 0, policy_version 49997 (0.0010) [2024-03-21 08:33:15,521][03784] Fps is (10 sec: 42598.8, 60 sec: 49152.0, 300 sec: 45319.8). Total num frames: 1638400000. Throughput: 0: 46975.5. Samples: 1639649500. Policy #0 lag: (min: 0.0, avg: 32.9, max: 80.0) [2024-03-21 08:33:15,522][03784] Avg episode reward: [(0, '1.278')] [2024-03-21 08:33:16,120][03995] Signal inference workers to stop experience collection... (33000 times) [2024-03-21 08:33:16,195][03995] Signal inference workers to resume experience collection... (33000 times) [2024-03-21 08:33:16,233][04017] InferenceWorker_p0-w0: stopping experience collection (33000 times) [2024-03-21 08:33:16,292][04017] InferenceWorker_p0-w0: resuming experience collection (33000 times) [2024-03-21 08:33:18,115][04017] Updated weights for policy 0, policy_version 50007 (0.0015) [2024-03-21 08:33:20,521][03784] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 45875.2). Total num frames: 1638694912. Throughput: 0: 47062.1. Samples: 1639935200. 
Policy #0 lag: (min: 0.0, avg: 32.9, max: 80.0) [2024-03-21 08:33:20,522][03784] Avg episode reward: [(0, '1.212')] [2024-03-21 08:33:25,521][03784] Fps is (10 sec: 39321.0, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 1638793216. Throughput: 0: 46940.0. Samples: 1640219300. Policy #0 lag: (min: 0.0, avg: 32.9, max: 80.0) [2024-03-21 08:33:25,522][03784] Avg episode reward: [(0, '1.599')] [2024-03-21 08:33:27,036][04017] Updated weights for policy 0, policy_version 50017 (0.0014) [2024-03-21 08:33:30,521][03784] Fps is (10 sec: 42597.6, 60 sec: 48059.6, 300 sec: 45319.8). Total num frames: 1639120896. Throughput: 0: 47148.7. Samples: 1640348700. Policy #0 lag: (min: 0.0, avg: 32.9, max: 80.0) [2024-03-21 08:33:30,522][03784] Avg episode reward: [(0, '0.980')] [2024-03-21 08:33:35,383][04017] Updated weights for policy 0, policy_version 50027 (0.0015) [2024-03-21 08:33:35,521][03784] Fps is (10 sec: 49152.7, 60 sec: 46421.2, 300 sec: 45430.9). Total num frames: 1639284736. Throughput: 0: 47175.5. Samples: 1640639200. Policy #0 lag: (min: 0.0, avg: 32.9, max: 80.0) [2024-03-21 08:33:35,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 08:33:40,521][03784] Fps is (10 sec: 29491.3, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1639415808. Throughput: 0: 47022.0. Samples: 1640928000. Policy #0 lag: (min: 0.0, avg: 32.9, max: 80.0) [2024-03-21 08:33:40,522][03784] Avg episode reward: [(0, '0.532')] [2024-03-21 08:33:43,323][04017] Updated weights for policy 0, policy_version 50037 (0.0018) [2024-03-21 08:33:45,521][03784] Fps is (10 sec: 42598.3, 60 sec: 40959.9, 300 sec: 45764.1). Total num frames: 1639710720. Throughput: 0: 46826.7. Samples: 1641064400. 
Policy #0 lag: (min: 5.0, avg: 38.5, max: 95.0) [2024-03-21 08:33:45,522][03784] Avg episode reward: [(0, '1.582')] [2024-03-21 08:33:48,428][04017] Updated weights for policy 0, policy_version 50047 (0.0013) [2024-03-21 08:33:50,521][03784] Fps is (10 sec: 72089.6, 60 sec: 45329.0, 300 sec: 46208.4). Total num frames: 1640136704. Throughput: 0: 46068.9. Samples: 1641326800. Policy #0 lag: (min: 5.0, avg: 38.5, max: 95.0) [2024-03-21 08:33:50,522][03784] Avg episode reward: [(0, '0.886')] [2024-03-21 08:33:52,467][04017] Updated weights for policy 0, policy_version 50057 (0.0013) [2024-03-21 08:33:55,521][03784] Fps is (10 sec: 65536.5, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1640366080. Throughput: 0: 46286.8. Samples: 1641599900. Policy #0 lag: (min: 5.0, avg: 38.5, max: 95.0) [2024-03-21 08:33:55,522][03784] Avg episode reward: [(0, '1.277')] [2024-03-21 08:33:59,987][04017] Updated weights for policy 0, policy_version 50067 (0.0010) [2024-03-21 08:34:00,521][03784] Fps is (10 sec: 49152.6, 60 sec: 48059.9, 300 sec: 45542.0). Total num frames: 1640628224. Throughput: 0: 46322.3. Samples: 1641734000. Policy #0 lag: (min: 5.0, avg: 38.5, max: 95.0) [2024-03-21 08:34:00,522][03784] Avg episode reward: [(0, '1.637')] [2024-03-21 08:34:00,780][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050069_1640660992.pth... [2024-03-21 08:34:00,915][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049729_1629519872.pth [2024-03-21 08:34:05,521][03784] Fps is (10 sec: 49152.7, 60 sec: 48060.0, 300 sec: 45875.2). Total num frames: 1640857600. Throughput: 0: 46600.1. Samples: 1642032200. Policy #0 lag: (min: 5.0, avg: 38.5, max: 95.0) [2024-03-21 08:34:05,522][03784] Avg episode reward: [(0, '1.637')] [2024-03-21 08:34:08,457][04017] Updated weights for policy 0, policy_version 50077 (0.0010) [2024-03-21 08:34:08,454][03995] Signal inference workers to stop experience collection... 
(33050 times) [2024-03-21 08:34:08,458][03995] Signal inference workers to resume experience collection... (33050 times) [2024-03-21 08:34:08,529][04017] InferenceWorker_p0-w0: stopping experience collection (33050 times) [2024-03-21 08:34:08,529][04017] InferenceWorker_p0-w0: resuming experience collection (33050 times) [2024-03-21 08:34:10,521][03784] Fps is (10 sec: 42597.9, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 1641054208. Throughput: 0: 46542.3. Samples: 1642313700. Policy #0 lag: (min: 5.0, avg: 38.5, max: 95.0) [2024-03-21 08:34:10,522][03784] Avg episode reward: [(0, '1.341')] [2024-03-21 08:34:12,546][04017] Updated weights for policy 0, policy_version 50087 (0.0011) [2024-03-21 08:34:15,521][03784] Fps is (10 sec: 45874.7, 60 sec: 48605.9, 300 sec: 45875.2). Total num frames: 1641316352. Throughput: 0: 46615.8. Samples: 1642446400. Policy #0 lag: (min: 5.0, avg: 38.5, max: 95.0) [2024-03-21 08:34:15,522][03784] Avg episode reward: [(0, '1.514')] [2024-03-21 08:34:20,521][03784] Fps is (10 sec: 49152.1, 60 sec: 47513.5, 300 sec: 45764.1). Total num frames: 1641545728. Throughput: 0: 46371.1. Samples: 1642725900. Policy #0 lag: (min: 0.0, avg: 36.1, max: 85.0) [2024-03-21 08:34:20,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 08:34:20,697][04017] Updated weights for policy 0, policy_version 50097 (0.0013) [2024-03-21 08:34:25,521][03784] Fps is (10 sec: 39321.5, 60 sec: 48606.1, 300 sec: 45319.8). Total num frames: 1641709568. Throughput: 0: 45766.8. Samples: 1642987500. Policy #0 lag: (min: 0.0, avg: 36.1, max: 85.0) [2024-03-21 08:34:25,522][03784] Avg episode reward: [(0, '1.138')] [2024-03-21 08:34:30,521][03784] Fps is (10 sec: 26214.6, 60 sec: 44783.1, 300 sec: 45430.9). Total num frames: 1641807872. Throughput: 0: 45697.8. Samples: 1643120800. 
Policy #0 lag: (min: 0.0, avg: 36.1, max: 85.0) [2024-03-21 08:34:30,522][03784] Avg episode reward: [(0, '1.457')] [2024-03-21 08:34:33,370][04017] Updated weights for policy 0, policy_version 50107 (0.0010) [2024-03-21 08:34:35,521][03784] Fps is (10 sec: 26214.1, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1641971712. Throughput: 0: 46455.6. Samples: 1643417300. Policy #0 lag: (min: 0.0, avg: 36.1, max: 85.0) [2024-03-21 08:34:35,522][03784] Avg episode reward: [(0, '1.352')] [2024-03-21 08:34:40,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1642168320. Throughput: 0: 46548.8. Samples: 1643694600. Policy #0 lag: (min: 0.0, avg: 36.1, max: 85.0) [2024-03-21 08:34:40,522][03784] Avg episode reward: [(0, '1.485')] [2024-03-21 08:34:42,217][04017] Updated weights for policy 0, policy_version 50117 (0.0023) [2024-03-21 08:34:45,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1642364928. Throughput: 0: 46691.0. Samples: 1643835100. Policy #0 lag: (min: 0.0, avg: 36.1, max: 85.0) [2024-03-21 08:34:45,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 08:34:48,077][04017] Updated weights for policy 0, policy_version 50127 (0.0015) [2024-03-21 08:34:50,521][03784] Fps is (10 sec: 58982.9, 60 sec: 43690.8, 300 sec: 45542.0). Total num frames: 1642758144. Throughput: 0: 45682.1. Samples: 1644087900. Policy #0 lag: (min: 3.0, avg: 37.2, max: 70.0) [2024-03-21 08:34:50,522][03784] Avg episode reward: [(0, '1.571')] [2024-03-21 08:34:51,557][04017] Updated weights for policy 0, policy_version 50137 (0.0019) [2024-03-21 08:34:53,878][03995] Signal inference workers to stop experience collection... (33100 times) [2024-03-21 08:34:53,952][03995] Signal inference workers to resume experience collection... 
(33100 times) [2024-03-21 08:34:53,968][04017] InferenceWorker_p0-w0: stopping experience collection (33100 times) [2024-03-21 08:34:54,014][04017] InferenceWorker_p0-w0: resuming experience collection (33100 times) [2024-03-21 08:34:55,521][03784] Fps is (10 sec: 78643.9, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 1643151360. Throughput: 0: 44733.5. Samples: 1644326700. Policy #0 lag: (min: 3.0, avg: 37.2, max: 70.0) [2024-03-21 08:34:55,522][03784] Avg episode reward: [(0, '1.398')] [2024-03-21 08:34:58,592][04017] Updated weights for policy 0, policy_version 50147 (0.0016) [2024-03-21 08:35:00,521][03784] Fps is (10 sec: 65535.0, 60 sec: 46421.2, 300 sec: 45875.2). Total num frames: 1643413504. Throughput: 0: 45159.8. Samples: 1644478600. Policy #0 lag: (min: 3.0, avg: 37.2, max: 70.0) [2024-03-21 08:35:00,522][03784] Avg episode reward: [(0, '1.628')] [2024-03-21 08:35:03,568][04017] Updated weights for policy 0, policy_version 50157 (0.0014) [2024-03-21 08:35:05,521][03784] Fps is (10 sec: 45875.2, 60 sec: 45875.1, 300 sec: 45653.1). Total num frames: 1643610112. Throughput: 0: 44651.2. Samples: 1644735200. Policy #0 lag: (min: 3.0, avg: 37.2, max: 70.0) [2024-03-21 08:35:05,522][03784] Avg episode reward: [(0, '1.164')] [2024-03-21 08:35:08,713][04017] Updated weights for policy 0, policy_version 50167 (0.0014) [2024-03-21 08:35:10,521][03784] Fps is (10 sec: 49152.9, 60 sec: 47513.7, 300 sec: 45986.3). Total num frames: 1643905024. Throughput: 0: 44960.0. Samples: 1645010700. Policy #0 lag: (min: 3.0, avg: 37.2, max: 70.0) [2024-03-21 08:35:10,521][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 08:35:15,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1644003328. Throughput: 0: 45193.3. Samples: 1645154500. 
Policy #0 lag: (min: 3.0, avg: 37.2, max: 70.0) [2024-03-21 08:35:15,522][03784] Avg episode reward: [(0, '0.788')] [2024-03-21 08:35:20,521][03784] Fps is (10 sec: 22937.4, 60 sec: 43144.6, 300 sec: 45319.8). Total num frames: 1644134400. Throughput: 0: 44920.0. Samples: 1645438700. Policy #0 lag: (min: 3.0, avg: 37.2, max: 70.0) [2024-03-21 08:35:20,522][03784] Avg episode reward: [(0, '1.317')] [2024-03-21 08:35:21,808][04017] Updated weights for policy 0, policy_version 50177 (0.0014) [2024-03-21 08:35:25,521][03784] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 1644331008. Throughput: 0: 44789.0. Samples: 1645710100. Policy #0 lag: (min: 0.0, avg: 25.8, max: 66.0) [2024-03-21 08:35:25,522][03784] Avg episode reward: [(0, '1.427')] [2024-03-21 08:35:29,510][04017] Updated weights for policy 0, policy_version 50187 (0.0012) [2024-03-21 08:35:30,521][03784] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1644593152. Throughput: 0: 44348.9. Samples: 1645830800. Policy #0 lag: (min: 0.0, avg: 25.8, max: 66.0) [2024-03-21 08:35:30,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 08:35:35,522][03784] Fps is (10 sec: 49150.1, 60 sec: 47513.4, 300 sec: 45430.8). Total num frames: 1644822528. Throughput: 0: 44210.7. Samples: 1646077400. Policy #0 lag: (min: 0.0, avg: 25.8, max: 66.0) [2024-03-21 08:35:35,523][03784] Avg episode reward: [(0, '0.738')] [2024-03-21 08:35:35,913][04017] Updated weights for policy 0, policy_version 50197 (0.0011) [2024-03-21 08:35:40,521][03784] Fps is (10 sec: 49152.1, 60 sec: 48605.9, 300 sec: 45653.0). Total num frames: 1645084672. Throughput: 0: 44562.2. Samples: 1646332000. 
Policy #0 lag: (min: 0.0, avg: 25.8, max: 66.0) [2024-03-21 08:35:40,522][03784] Avg episode reward: [(0, '1.542')] [2024-03-21 08:35:43,569][04017] Updated weights for policy 0, policy_version 50207 (0.0010) [2024-03-21 08:35:45,521][03784] Fps is (10 sec: 45876.6, 60 sec: 48605.9, 300 sec: 45542.0). Total num frames: 1645281280. Throughput: 0: 44302.3. Samples: 1646472200. Policy #0 lag: (min: 0.0, avg: 25.8, max: 66.0) [2024-03-21 08:35:45,522][03784] Avg episode reward: [(0, '1.604')] [2024-03-21 08:35:50,521][03784] Fps is (10 sec: 32768.2, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 1645412352. Throughput: 0: 45086.6. Samples: 1646764100. Policy #0 lag: (min: 0.0, avg: 25.8, max: 66.0) [2024-03-21 08:35:50,522][03784] Avg episode reward: [(0, '1.337')] [2024-03-21 08:35:53,580][04017] Updated weights for policy 0, policy_version 50217 (0.0011) [2024-03-21 08:35:55,521][03784] Fps is (10 sec: 36044.9, 60 sec: 41506.1, 300 sec: 45319.8). Total num frames: 1645641728. Throughput: 0: 45108.8. Samples: 1647040600. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 08:35:55,522][03784] Avg episode reward: [(0, '0.718')] [2024-03-21 08:35:58,287][04017] Updated weights for policy 0, policy_version 50227 (0.0021) [2024-03-21 08:36:00,112][03995] Signal inference workers to stop experience collection... (33150 times) [2024-03-21 08:36:00,171][04017] InferenceWorker_p0-w0: stopping experience collection (33150 times) [2024-03-21 08:36:00,174][03995] Signal inference workers to resume experience collection... (33150 times) [2024-03-21 08:36:00,224][04017] InferenceWorker_p0-w0: resuming experience collection (33150 times) [2024-03-21 08:36:00,521][03784] Fps is (10 sec: 55705.9, 60 sec: 42598.5, 300 sec: 45542.0). Total num frames: 1645969408. Throughput: 0: 44926.7. Samples: 1647176200. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 08:36:00,521][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 08:36:00,815][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050233_1646034944.pth... [2024-03-21 08:36:00,941][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000049898_1635057664.pth [2024-03-21 08:36:04,822][04017] Updated weights for policy 0, policy_version 50237 (0.0011) [2024-03-21 08:36:05,521][03784] Fps is (10 sec: 55705.1, 60 sec: 43144.4, 300 sec: 45319.8). Total num frames: 1646198784. Throughput: 0: 45135.5. Samples: 1647469800. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 08:36:05,522][03784] Avg episode reward: [(0, '1.017')] [2024-03-21 08:36:10,385][04017] Updated weights for policy 0, policy_version 50247 (0.0011) [2024-03-21 08:36:10,521][03784] Fps is (10 sec: 52427.9, 60 sec: 43144.4, 300 sec: 45653.0). Total num frames: 1646493696. Throughput: 0: 44988.8. Samples: 1647734600. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 08:36:10,522][03784] Avg episode reward: [(0, '1.481')] [2024-03-21 08:36:15,521][03784] Fps is (10 sec: 62260.2, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 1646821376. Throughput: 0: 45444.5. Samples: 1647875800. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 08:36:15,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 08:36:15,525][04017] Updated weights for policy 0, policy_version 50257 (0.0019) [2024-03-21 08:36:20,521][03784] Fps is (10 sec: 55706.2, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 1647050752. Throughput: 0: 46269.3. Samples: 1648159500. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 08:36:20,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 08:36:23,679][04017] Updated weights for policy 0, policy_version 50267 (0.0012) [2024-03-21 08:36:25,521][03784] Fps is (10 sec: 32767.5, 60 sec: 46967.4, 300 sec: 45875.2). 
Total num frames: 1647149056. Throughput: 0: 47053.3. Samples: 1648449400. Policy #0 lag: (min: 0.0, avg: 29.4, max: 64.0) [2024-03-21 08:36:25,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 08:36:30,521][03784] Fps is (10 sec: 36044.6, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 1647411200. Throughput: 0: 47317.8. Samples: 1648601500. Policy #0 lag: (min: 0.0, avg: 33.0, max: 74.0) [2024-03-21 08:36:30,522][03784] Avg episode reward: [(0, '0.716')] [2024-03-21 08:36:31,397][04017] Updated weights for policy 0, policy_version 50277 (0.0016) [2024-03-21 08:36:35,521][03784] Fps is (10 sec: 42599.2, 60 sec: 45875.5, 300 sec: 45542.0). Total num frames: 1647575040. Throughput: 0: 47289.0. Samples: 1648892100. Policy #0 lag: (min: 0.0, avg: 33.0, max: 74.0) [2024-03-21 08:36:35,521][03784] Avg episode reward: [(0, '1.183')] [2024-03-21 08:36:39,083][04017] Updated weights for policy 0, policy_version 50287 (0.0011) [2024-03-21 08:36:40,521][03784] Fps is (10 sec: 45875.1, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1647869952. Throughput: 0: 46722.2. Samples: 1649143100. Policy #0 lag: (min: 0.0, avg: 33.0, max: 74.0) [2024-03-21 08:36:40,522][03784] Avg episode reward: [(0, '1.061')] [2024-03-21 08:36:45,521][03784] Fps is (10 sec: 32767.7, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1647902720. Throughput: 0: 46875.5. Samples: 1649285600. Policy #0 lag: (min: 0.0, avg: 33.0, max: 74.0) [2024-03-21 08:36:45,522][03784] Avg episode reward: [(0, '0.547')] [2024-03-21 08:36:47,984][04017] Updated weights for policy 0, policy_version 50297 (0.0015) [2024-03-21 08:36:50,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.6, 300 sec: 45764.1). Total num frames: 1648263168. Throughput: 0: 45302.3. Samples: 1649508400. Policy #0 lag: (min: 0.0, avg: 33.0, max: 74.0) [2024-03-21 08:36:50,522][03784] Avg episode reward: [(0, '1.304')] [2024-03-21 08:36:53,746][03995] Signal inference workers to stop experience collection... 
(33200 times) [2024-03-21 08:36:53,827][03995] Signal inference workers to resume experience collection... (33200 times) [2024-03-21 08:36:53,834][04017] InferenceWorker_p0-w0: stopping experience collection (33200 times) [2024-03-21 08:36:53,881][04017] InferenceWorker_p0-w0: resuming experience collection (33200 times) [2024-03-21 08:36:54,153][04017] Updated weights for policy 0, policy_version 50307 (0.0012) [2024-03-21 08:36:55,521][03784] Fps is (10 sec: 58981.9, 60 sec: 47513.6, 300 sec: 45653.1). Total num frames: 1648492544. Throughput: 0: 45100.0. Samples: 1649764100. Policy #0 lag: (min: 0.0, avg: 33.0, max: 74.0) [2024-03-21 08:36:55,522][03784] Avg episode reward: [(0, '0.562')] [2024-03-21 08:37:00,521][03784] Fps is (10 sec: 49151.2, 60 sec: 46421.1, 300 sec: 45875.2). Total num frames: 1648754688. Throughput: 0: 44986.4. Samples: 1649900200. Policy #0 lag: (min: 0.0, avg: 60.3, max: 121.0) [2024-03-21 08:37:00,522][03784] Avg episode reward: [(0, '1.395')] [2024-03-21 08:37:00,676][04017] Updated weights for policy 0, policy_version 50317 (0.0016) [2024-03-21 08:37:05,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 1648984064. Throughput: 0: 44526.6. Samples: 1650163200. Policy #0 lag: (min: 0.0, avg: 60.3, max: 121.0) [2024-03-21 08:37:05,522][03784] Avg episode reward: [(0, '1.368')] [2024-03-21 08:37:10,521][03784] Fps is (10 sec: 32768.5, 60 sec: 43144.6, 300 sec: 46208.4). Total num frames: 1649082368. Throughput: 0: 44793.4. Samples: 1650465100. Policy #0 lag: (min: 0.0, avg: 60.3, max: 121.0) [2024-03-21 08:37:10,522][03784] Avg episode reward: [(0, '0.968')] [2024-03-21 08:37:11,902][04017] Updated weights for policy 0, policy_version 50327 (0.0015) [2024-03-21 08:37:15,521][03784] Fps is (10 sec: 19660.8, 60 sec: 39321.5, 300 sec: 45653.0). Total num frames: 1649180672. Throughput: 0: 44786.7. Samples: 1650616900. 
Policy #0 lag: (min: 0.0, avg: 60.3, max: 121.0) [2024-03-21 08:37:15,522][03784] Avg episode reward: [(0, '0.878')] [2024-03-21 08:37:20,422][04017] Updated weights for policy 0, policy_version 50337 (0.0011) [2024-03-21 08:37:20,521][03784] Fps is (10 sec: 36044.5, 60 sec: 39867.6, 300 sec: 45653.0). Total num frames: 1649442816. Throughput: 0: 44848.7. Samples: 1650910300. Policy #0 lag: (min: 0.0, avg: 60.3, max: 121.0) [2024-03-21 08:37:20,522][03784] Avg episode reward: [(0, '1.181')] [2024-03-21 08:37:25,526][03784] Fps is (10 sec: 45854.2, 60 sec: 41503.0, 300 sec: 45430.2). Total num frames: 1649639424. Throughput: 0: 45597.6. Samples: 1651195200. Policy #0 lag: (min: 0.0, avg: 60.3, max: 121.0) [2024-03-21 08:37:25,526][03784] Avg episode reward: [(0, '1.331')] [2024-03-21 08:37:26,870][04017] Updated weights for policy 0, policy_version 50347 (0.0015) [2024-03-21 08:37:30,521][03784] Fps is (10 sec: 55706.2, 60 sec: 43144.5, 300 sec: 45764.1). Total num frames: 1649999872. Throughput: 0: 45215.5. Samples: 1651320300. Policy #0 lag: (min: 1.0, avg: 36.0, max: 88.0) [2024-03-21 08:37:30,522][03784] Avg episode reward: [(0, '1.081')] [2024-03-21 08:37:31,202][04017] Updated weights for policy 0, policy_version 50357 (0.0013) [2024-03-21 08:37:34,409][04017] Updated weights for policy 0, policy_version 50367 (0.0016) [2024-03-21 08:37:35,521][03784] Fps is (10 sec: 81957.1, 60 sec: 48059.6, 300 sec: 46319.5). Total num frames: 1650458624. Throughput: 0: 45802.2. Samples: 1651569500. Policy #0 lag: (min: 1.0, avg: 36.0, max: 88.0) [2024-03-21 08:37:35,522][03784] Avg episode reward: [(0, '1.706')] [2024-03-21 08:37:40,521][03784] Fps is (10 sec: 62259.4, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1650622464. Throughput: 0: 45829.0. Samples: 1651826400. 
Policy #0 lag: (min: 1.0, avg: 36.0, max: 88.0) [2024-03-21 08:37:40,522][03784] Avg episode reward: [(0, '1.403')] [2024-03-21 08:37:43,702][03995] Signal inference workers to stop experience collection... (33250 times) [2024-03-21 08:37:43,767][03995] Signal inference workers to resume experience collection... (33250 times) [2024-03-21 08:37:43,772][04017] InferenceWorker_p0-w0: stopping experience collection (33250 times) [2024-03-21 08:37:43,774][04017] Updated weights for policy 0, policy_version 50377 (0.0011) [2024-03-21 08:37:43,809][04017] InferenceWorker_p0-w0: resuming experience collection (33250 times) [2024-03-21 08:37:45,521][03784] Fps is (10 sec: 36045.2, 60 sec: 48605.9, 300 sec: 45430.9). Total num frames: 1650819072. Throughput: 0: 45231.3. Samples: 1651935600. Policy #0 lag: (min: 1.0, avg: 36.0, max: 88.0) [2024-03-21 08:37:45,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 08:37:48,980][04017] Updated weights for policy 0, policy_version 50387 (0.0016) [2024-03-21 08:37:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1651081216. Throughput: 0: 45142.2. Samples: 1652194600. Policy #0 lag: (min: 1.0, avg: 36.0, max: 88.0) [2024-03-21 08:37:50,522][03784] Avg episode reward: [(0, '0.548')] [2024-03-21 08:37:55,521][03784] Fps is (10 sec: 45874.6, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 1651277824. Throughput: 0: 45191.0. Samples: 1652498700. Policy #0 lag: (min: 1.0, avg: 36.0, max: 88.0) [2024-03-21 08:37:55,522][03784] Avg episode reward: [(0, '1.667')] [2024-03-21 08:37:57,079][04017] Updated weights for policy 0, policy_version 50397 (0.0016) [2024-03-21 08:38:00,521][03784] Fps is (10 sec: 42598.6, 60 sec: 45875.3, 300 sec: 45875.2). Total num frames: 1651507200. Throughput: 0: 44911.1. Samples: 1652637900. 
Policy #0 lag: (min: 1.0, avg: 36.0, max: 88.0) [2024-03-21 08:38:00,522][03784] Avg episode reward: [(0, '0.972')] [2024-03-21 08:38:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050400_1651507200.pth... [2024-03-21 08:38:00,638][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050069_1640660992.pth [2024-03-21 08:38:05,521][03784] Fps is (10 sec: 36045.3, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 1651638272. Throughput: 0: 44577.9. Samples: 1652916300. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 08:38:05,522][03784] Avg episode reward: [(0, '1.266')] [2024-03-21 08:38:08,138][04017] Updated weights for policy 0, policy_version 50407 (0.0016) [2024-03-21 08:38:10,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1651736576. Throughput: 0: 44426.8. Samples: 1653194200. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 08:38:10,522][03784] Avg episode reward: [(0, '1.381')] [2024-03-21 08:38:15,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46421.4, 300 sec: 44986.6). Total num frames: 1651965952. Throughput: 0: 44702.3. Samples: 1653331900. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 08:38:15,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 08:38:17,239][04017] Updated weights for policy 0, policy_version 50417 (0.0016) [2024-03-21 08:38:20,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 45208.8). Total num frames: 1652129792. Throughput: 0: 45604.5. Samples: 1653621700. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 08:38:20,522][03784] Avg episode reward: [(0, '1.434')] [2024-03-21 08:38:25,378][04017] Updated weights for policy 0, policy_version 50427 (0.0015) [2024-03-21 08:38:25,521][03784] Fps is (10 sec: 42597.8, 60 sec: 45878.7, 300 sec: 44986.6). Total num frames: 1652391936. Throughput: 0: 45848.8. Samples: 1653889600. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 08:38:25,522][03784] Avg episode reward: [(0, '0.927')] [2024-03-21 08:38:30,084][04017] Updated weights for policy 0, policy_version 50437 (0.0013) [2024-03-21 08:38:30,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 1652719616. Throughput: 0: 45933.2. Samples: 1654002600. Policy #0 lag: (min: 0.0, avg: 37.7, max: 79.0) [2024-03-21 08:38:30,522][03784] Avg episode reward: [(0, '1.502')] [2024-03-21 08:38:35,208][04017] Updated weights for policy 0, policy_version 50447 (0.0016) [2024-03-21 08:38:35,521][03784] Fps is (10 sec: 68813.7, 60 sec: 43690.8, 300 sec: 46319.5). Total num frames: 1653080064. Throughput: 0: 46449.0. Samples: 1654284800. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 08:38:35,522][03784] Avg episode reward: [(0, '1.502')] [2024-03-21 08:38:39,427][04017] Updated weights for policy 0, policy_version 50457 (0.0012) [2024-03-21 08:38:39,469][03995] Signal inference workers to stop experience collection... (33300 times) [2024-03-21 08:38:39,570][04017] InferenceWorker_p0-w0: stopping experience collection (33300 times) [2024-03-21 08:38:39,702][03995] Signal inference workers to resume experience collection... (33300 times) [2024-03-21 08:38:39,703][04017] InferenceWorker_p0-w0: resuming experience collection (33300 times) [2024-03-21 08:38:40,521][03784] Fps is (10 sec: 75367.5, 60 sec: 47513.6, 300 sec: 46652.8). Total num frames: 1653473280. Throughput: 0: 45451.3. Samples: 1654544000. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 08:38:40,522][03784] Avg episode reward: [(0, '0.582')] [2024-03-21 08:38:45,521][03784] Fps is (10 sec: 55704.7, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1653637120. Throughput: 0: 45359.9. Samples: 1654679100. 
Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 08:38:45,523][03784] Avg episode reward: [(0, '1.285')] [2024-03-21 08:38:46,203][04017] Updated weights for policy 0, policy_version 50467 (0.0018) [2024-03-21 08:38:50,521][03784] Fps is (10 sec: 29491.4, 60 sec: 44783.1, 300 sec: 45430.9). Total num frames: 1653768192. Throughput: 0: 45942.3. Samples: 1654983700. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 08:38:50,521][03784] Avg episode reward: [(0, '1.555')] [2024-03-21 08:38:55,521][03784] Fps is (10 sec: 32768.3, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 1653964800. Throughput: 0: 46288.9. Samples: 1655277200. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 08:38:55,522][03784] Avg episode reward: [(0, '1.754')] [2024-03-21 08:38:56,198][04017] Updated weights for policy 0, policy_version 50477 (0.0021) [2024-03-21 08:39:00,521][03784] Fps is (10 sec: 52428.0, 60 sec: 46421.3, 300 sec: 45541.9). Total num frames: 1654292480. Throughput: 0: 46064.3. Samples: 1655404800. Policy #0 lag: (min: 0.0, avg: 44.1, max: 95.0) [2024-03-21 08:39:00,522][03784] Avg episode reward: [(0, '1.449')] [2024-03-21 08:39:01,468][04017] Updated weights for policy 0, policy_version 50487 (0.0024) [2024-03-21 08:39:05,521][03784] Fps is (10 sec: 49152.2, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1654456320. Throughput: 0: 45813.3. Samples: 1655683300. Policy #0 lag: (min: 0.0, avg: 37.3, max: 85.0) [2024-03-21 08:39:05,522][03784] Avg episode reward: [(0, '1.740')] [2024-03-21 08:39:10,521][03784] Fps is (10 sec: 36045.0, 60 sec: 48605.9, 300 sec: 45208.7). Total num frames: 1654652928. Throughput: 0: 46140.1. Samples: 1655965900. 
Policy #0 lag: (min: 0.0, avg: 37.3, max: 85.0) [2024-03-21 08:39:10,522][03784] Avg episode reward: [(0, '0.633')] [2024-03-21 08:39:11,748][04017] Updated weights for policy 0, policy_version 50497 (0.0015) [2024-03-21 08:39:15,521][03784] Fps is (10 sec: 39321.7, 60 sec: 48059.7, 300 sec: 45097.7). Total num frames: 1654849536. Throughput: 0: 46946.8. Samples: 1656115200. Policy #0 lag: (min: 0.0, avg: 37.3, max: 85.0) [2024-03-21 08:39:15,522][03784] Avg episode reward: [(0, '0.918')] [2024-03-21 08:39:17,736][04017] Updated weights for policy 0, policy_version 50507 (0.0018) [2024-03-21 08:39:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 49698.1, 300 sec: 45430.9). Total num frames: 1655111680. Throughput: 0: 47015.5. Samples: 1656400500. Policy #0 lag: (min: 0.0, avg: 37.3, max: 85.0) [2024-03-21 08:39:20,522][03784] Avg episode reward: [(0, '0.810')] [2024-03-21 08:39:24,474][04017] Updated weights for policy 0, policy_version 50517 (0.0011) [2024-03-21 08:39:25,521][03784] Fps is (10 sec: 58981.5, 60 sec: 50790.4, 300 sec: 46208.4). Total num frames: 1655439360. Throughput: 0: 47486.5. Samples: 1656680900. Policy #0 lag: (min: 0.0, avg: 37.3, max: 85.0) [2024-03-21 08:39:25,522][03784] Avg episode reward: [(0, '1.291')] [2024-03-21 08:39:30,521][03784] Fps is (10 sec: 45875.5, 60 sec: 47513.7, 300 sec: 46097.4). Total num frames: 1655570432. Throughput: 0: 47993.5. Samples: 1656838800. Policy #0 lag: (min: 0.0, avg: 37.3, max: 85.0) [2024-03-21 08:39:30,522][03784] Avg episode reward: [(0, '1.291')] [2024-03-21 08:39:31,709][04017] Updated weights for policy 0, policy_version 50527 (0.0016) [2024-03-21 08:39:35,521][03784] Fps is (10 sec: 32768.3, 60 sec: 44782.9, 300 sec: 46097.4). Total num frames: 1655767040. Throughput: 0: 47513.2. Samples: 1657121800. 
Policy #0 lag: (min: 0.0, avg: 37.3, max: 85.0) [2024-03-21 08:39:35,522][03784] Avg episode reward: [(0, '0.959')] [2024-03-21 08:39:40,133][04017] Updated weights for policy 0, policy_version 50537 (0.0011) [2024-03-21 08:39:40,521][03784] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 46208.4). Total num frames: 1655996416. Throughput: 0: 47402.2. Samples: 1657410300. Policy #0 lag: (min: 0.0, avg: 42.4, max: 98.0) [2024-03-21 08:39:40,522][03784] Avg episode reward: [(0, '0.545')] [2024-03-21 08:39:43,447][03995] Signal inference workers to stop experience collection... (33350 times) [2024-03-21 08:39:43,517][03995] Signal inference workers to resume experience collection... (33350 times) [2024-03-21 08:39:43,523][04017] InferenceWorker_p0-w0: stopping experience collection (33350 times) [2024-03-21 08:39:43,590][04017] InferenceWorker_p0-w0: resuming experience collection (33350 times) [2024-03-21 08:39:45,016][04017] Updated weights for policy 0, policy_version 50547 (0.0016) [2024-03-21 08:39:45,521][03784] Fps is (10 sec: 58982.9, 60 sec: 45329.2, 300 sec: 46097.4). Total num frames: 1656356864. Throughput: 0: 47749.0. Samples: 1657553500. Policy #0 lag: (min: 0.0, avg: 42.4, max: 98.0) [2024-03-21 08:39:45,522][03784] Avg episode reward: [(0, '0.545')] [2024-03-21 08:39:50,521][03784] Fps is (10 sec: 58982.6, 60 sec: 46967.4, 300 sec: 45542.0). Total num frames: 1656586240. Throughput: 0: 47757.8. Samples: 1657832400. Policy #0 lag: (min: 0.0, avg: 42.4, max: 98.0) [2024-03-21 08:39:50,522][03784] Avg episode reward: [(0, '1.652')] [2024-03-21 08:39:51,243][04017] Updated weights for policy 0, policy_version 50557 (0.0012) [2024-03-21 08:39:55,521][03784] Fps is (10 sec: 45874.8, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 1656815616. Throughput: 0: 47855.5. Samples: 1658119400. 
Policy #0 lag: (min: 0.0, avg: 42.4, max: 98.0) [2024-03-21 08:39:55,522][03784] Avg episode reward: [(0, '1.422')] [2024-03-21 08:39:58,005][04017] Updated weights for policy 0, policy_version 50567 (0.0015) [2024-03-21 08:40:00,521][03784] Fps is (10 sec: 39321.5, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1656979456. Throughput: 0: 47564.4. Samples: 1658255600. Policy #0 lag: (min: 0.0, avg: 42.4, max: 98.0) [2024-03-21 08:40:00,522][03784] Avg episode reward: [(0, '1.468')] [2024-03-21 08:40:00,535][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050567_1656979456.pth... [2024-03-21 08:40:00,710][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050233_1646034944.pth [2024-03-21 08:40:05,356][04017] Updated weights for policy 0, policy_version 50577 (0.0011) [2024-03-21 08:40:05,521][03784] Fps is (10 sec: 49151.6, 60 sec: 47513.5, 300 sec: 45430.9). Total num frames: 1657307136. Throughput: 0: 46944.4. Samples: 1658513000. Policy #0 lag: (min: 0.0, avg: 42.4, max: 98.0) [2024-03-21 08:40:05,531][03784] Avg episode reward: [(0, '1.413')] [2024-03-21 08:40:09,302][04017] Updated weights for policy 0, policy_version 50587 (0.0016) [2024-03-21 08:40:10,521][03784] Fps is (10 sec: 68813.3, 60 sec: 50244.3, 300 sec: 46319.5). Total num frames: 1657667584. Throughput: 0: 46013.5. Samples: 1658751500. Policy #0 lag: (min: 1.0, avg: 38.1, max: 75.0) [2024-03-21 08:40:10,522][03784] Avg episode reward: [(0, '0.705')] [2024-03-21 08:40:15,521][03784] Fps is (10 sec: 52429.1, 60 sec: 49698.1, 300 sec: 46430.6). Total num frames: 1657831424. Throughput: 0: 45659.9. Samples: 1658893500. Policy #0 lag: (min: 1.0, avg: 38.1, max: 75.0) [2024-03-21 08:40:15,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 08:40:16,829][04017] Updated weights for policy 0, policy_version 50597 (0.0017) [2024-03-21 08:40:20,521][03784] Fps is (10 sec: 32767.6, 60 sec: 48059.7, 300 sec: 46319.5). 
Total num frames: 1657995264. Throughput: 0: 44904.4. Samples: 1659142500. Policy #0 lag: (min: 1.0, avg: 38.1, max: 75.0) [2024-03-21 08:40:20,522][03784] Avg episode reward: [(0, '0.857')] [2024-03-21 08:40:25,521][03784] Fps is (10 sec: 32768.4, 60 sec: 45329.2, 300 sec: 45986.3). Total num frames: 1658159104. Throughput: 0: 44860.1. Samples: 1659429000. Policy #0 lag: (min: 1.0, avg: 38.1, max: 75.0) [2024-03-21 08:40:25,522][03784] Avg episode reward: [(0, '0.849')] [2024-03-21 08:40:29,984][04017] Updated weights for policy 0, policy_version 50607 (0.0010) [2024-03-21 08:40:30,521][03784] Fps is (10 sec: 32767.6, 60 sec: 45875.0, 300 sec: 45764.1). Total num frames: 1658322944. Throughput: 0: 45155.3. Samples: 1659585500. Policy #0 lag: (min: 1.0, avg: 38.1, max: 75.0) [2024-03-21 08:40:30,522][03784] Avg episode reward: [(0, '1.316')] [2024-03-21 08:40:35,390][03995] Signal inference workers to stop experience collection... (33400 times) [2024-03-21 08:40:35,390][03995] Signal inference workers to resume experience collection... (33400 times) [2024-03-21 08:40:35,462][04017] InferenceWorker_p0-w0: stopping experience collection (33400 times) [2024-03-21 08:40:35,463][04017] InferenceWorker_p0-w0: resuming experience collection (33400 times) [2024-03-21 08:40:35,521][03784] Fps is (10 sec: 42597.5, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1658585088. Throughput: 0: 44777.7. Samples: 1659847400. Policy #0 lag: (min: 1.0, avg: 38.1, max: 75.0) [2024-03-21 08:40:35,522][03784] Avg episode reward: [(0, '1.202')] [2024-03-21 08:40:35,753][04017] Updated weights for policy 0, policy_version 50617 (0.0012) [2024-03-21 08:40:40,521][03784] Fps is (10 sec: 32768.4, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1658650624. Throughput: 0: 44731.1. Samples: 1660132300. 
Policy #0 lag: (min: 1.0, avg: 38.1, max: 75.0) [2024-03-21 08:40:40,522][03784] Avg episode reward: [(0, '1.245')] [2024-03-21 08:40:45,521][03784] Fps is (10 sec: 16384.3, 60 sec: 39867.7, 300 sec: 45208.7). Total num frames: 1658748928. Throughput: 0: 44724.5. Samples: 1660268200. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 08:40:45,522][03784] Avg episode reward: [(0, '0.921')] [2024-03-21 08:40:47,881][04017] Updated weights for policy 0, policy_version 50627 (0.0013) [2024-03-21 08:40:50,521][03784] Fps is (10 sec: 32768.3, 60 sec: 39867.8, 300 sec: 45208.7). Total num frames: 1658978304. Throughput: 0: 44651.2. Samples: 1660522300. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 08:40:50,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 08:40:54,183][04017] Updated weights for policy 0, policy_version 50637 (0.0017) [2024-03-21 08:40:55,521][03784] Fps is (10 sec: 65536.3, 60 sec: 43144.6, 300 sec: 45542.0). Total num frames: 1659404288. Throughput: 0: 45262.2. Samples: 1660788300. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 08:40:55,522][03784] Avg episode reward: [(0, '1.533')] [2024-03-21 08:40:57,331][04017] Updated weights for policy 0, policy_version 50647 (0.0014) [2024-03-21 08:41:00,521][03784] Fps is (10 sec: 85194.3, 60 sec: 47513.4, 300 sec: 46208.4). Total num frames: 1659830272. Throughput: 0: 44519.8. Samples: 1660896900. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 08:41:00,523][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 08:41:02,507][04017] Updated weights for policy 0, policy_version 50657 (0.0014) [2024-03-21 08:41:05,521][03784] Fps is (10 sec: 62258.5, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 1660026880. Throughput: 0: 44713.4. Samples: 1661154600. 
Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 08:41:05,522][03784] Avg episode reward: [(0, '1.364')] [2024-03-21 08:41:08,945][04017] Updated weights for policy 0, policy_version 50667 (0.0011) [2024-03-21 08:41:10,521][03784] Fps is (10 sec: 42599.3, 60 sec: 43144.4, 300 sec: 45541.9). Total num frames: 1660256256. Throughput: 0: 44595.4. Samples: 1661435800. Policy #0 lag: (min: 0.0, avg: 29.7, max: 67.0) [2024-03-21 08:41:10,522][03784] Avg episode reward: [(0, '1.292')] [2024-03-21 08:41:15,521][03784] Fps is (10 sec: 45874.8, 60 sec: 44236.7, 300 sec: 45541.9). Total num frames: 1660485632. Throughput: 0: 44202.3. Samples: 1661574600. Policy #0 lag: (min: 0.0, avg: 52.0, max: 116.0) [2024-03-21 08:41:15,522][03784] Avg episode reward: [(0, '0.635')] [2024-03-21 08:41:17,125][04017] Updated weights for policy 0, policy_version 50677 (0.0016) [2024-03-21 08:41:20,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1660715008. Throughput: 0: 44460.0. Samples: 1661848100. Policy #0 lag: (min: 0.0, avg: 52.0, max: 116.0) [2024-03-21 08:41:20,522][03784] Avg episode reward: [(0, '0.949')] [2024-03-21 08:41:20,859][03995] Signal inference workers to stop experience collection... (33450 times) [2024-03-21 08:41:20,931][03995] Signal inference workers to resume experience collection... (33450 times) [2024-03-21 08:41:20,939][04017] InferenceWorker_p0-w0: stopping experience collection (33450 times) [2024-03-21 08:41:20,977][04017] InferenceWorker_p0-w0: resuming experience collection (33450 times) [2024-03-21 08:41:25,521][03784] Fps is (10 sec: 36045.2, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1660846080. Throughput: 0: 43895.6. Samples: 1662107600. 
Policy #0 lag: (min: 0.0, avg: 52.0, max: 116.0) [2024-03-21 08:41:25,522][03784] Avg episode reward: [(0, '1.750')] [2024-03-21 08:41:26,164][04017] Updated weights for policy 0, policy_version 50687 (0.0028) [2024-03-21 08:41:30,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 1661075456. Throughput: 0: 43397.6. Samples: 1662221100. Policy #0 lag: (min: 0.0, avg: 52.0, max: 116.0) [2024-03-21 08:41:30,522][03784] Avg episode reward: [(0, '1.362')] [2024-03-21 08:41:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1661206528. Throughput: 0: 42895.5. Samples: 1662452600. Policy #0 lag: (min: 0.0, avg: 52.0, max: 116.0) [2024-03-21 08:41:35,522][03784] Avg episode reward: [(0, '0.957')] [2024-03-21 08:41:37,272][04017] Updated weights for policy 0, policy_version 50697 (0.0017) [2024-03-21 08:41:40,521][03784] Fps is (10 sec: 29491.4, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 1661370368. Throughput: 0: 42326.6. Samples: 1662693000. Policy #0 lag: (min: 0.0, avg: 52.0, max: 116.0) [2024-03-21 08:41:40,522][03784] Avg episode reward: [(0, '0.717')] [2024-03-21 08:41:45,521][03784] Fps is (10 sec: 26214.3, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1661468672. Throughput: 0: 43324.6. Samples: 1662846500. Policy #0 lag: (min: 0.0, avg: 52.0, max: 116.0) [2024-03-21 08:41:45,522][03784] Avg episode reward: [(0, '1.516')] [2024-03-21 08:41:46,552][04017] Updated weights for policy 0, policy_version 50707 (0.0022) [2024-03-21 08:41:50,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46967.3, 300 sec: 45097.6). Total num frames: 1661796352. Throughput: 0: 44155.5. Samples: 1663141600. 
Policy #0 lag: (min: 0.0, avg: 26.9, max: 68.0) [2024-03-21 08:41:50,522][03784] Avg episode reward: [(0, '0.899')] [2024-03-21 08:41:51,411][04017] Updated weights for policy 0, policy_version 50717 (0.0010) [2024-03-21 08:41:55,521][03784] Fps is (10 sec: 62259.5, 60 sec: 44782.9, 300 sec: 45208.8). Total num frames: 1662091264. Throughput: 0: 43897.8. Samples: 1663411200. Policy #0 lag: (min: 0.0, avg: 26.9, max: 68.0) [2024-03-21 08:41:55,522][03784] Avg episode reward: [(0, '1.049')] [2024-03-21 08:41:57,687][04017] Updated weights for policy 0, policy_version 50727 (0.0017) [2024-03-21 08:42:00,521][03784] Fps is (10 sec: 55706.1, 60 sec: 42052.4, 300 sec: 45319.8). Total num frames: 1662353408. Throughput: 0: 43842.3. Samples: 1663547500. Policy #0 lag: (min: 0.0, avg: 26.9, max: 68.0) [2024-03-21 08:42:00,522][03784] Avg episode reward: [(0, '1.467')] [2024-03-21 08:42:00,859][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050732_1662386176.pth... [2024-03-21 08:42:00,966][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050400_1651507200.pth [2024-03-21 08:42:05,521][03784] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 45430.9). Total num frames: 1662484480. Throughput: 0: 44433.4. Samples: 1663847600. Policy #0 lag: (min: 0.0, avg: 26.9, max: 68.0) [2024-03-21 08:42:05,522][03784] Avg episode reward: [(0, '1.467')] [2024-03-21 08:42:06,784][04017] Updated weights for policy 0, policy_version 50737 (0.0011) [2024-03-21 08:42:10,521][03784] Fps is (10 sec: 29491.2, 60 sec: 39867.8, 300 sec: 45653.0). Total num frames: 1662648320. Throughput: 0: 45351.1. Samples: 1664148400. Policy #0 lag: (min: 0.0, avg: 26.9, max: 68.0) [2024-03-21 08:42:10,522][03784] Avg episode reward: [(0, '1.180')] [2024-03-21 08:42:14,750][04017] Updated weights for policy 0, policy_version 50747 (0.0018) [2024-03-21 08:42:15,521][03784] Fps is (10 sec: 45875.3, 60 sec: 40960.1, 300 sec: 45764.1). 
Total num frames: 1662943232. Throughput: 0: 45864.5. Samples: 1664285000. Policy #0 lag: (min: 0.0, avg: 26.9, max: 68.0) [2024-03-21 08:42:15,522][03784] Avg episode reward: [(0, '1.034')] [2024-03-21 08:42:20,497][04017] Updated weights for policy 0, policy_version 50757 (0.0012) [2024-03-21 08:42:20,521][03784] Fps is (10 sec: 55705.3, 60 sec: 41506.1, 300 sec: 45987.0). Total num frames: 1663205376. Throughput: 0: 46722.2. Samples: 1664555100. Policy #0 lag: (min: 0.0, avg: 50.5, max: 108.0) [2024-03-21 08:42:20,522][03784] Avg episode reward: [(0, '1.283')] [2024-03-21 08:42:24,068][03995] Signal inference workers to stop experience collection... (33500 times) [2024-03-21 08:42:24,069][03995] Signal inference workers to resume experience collection... (33500 times) [2024-03-21 08:42:24,144][04017] InferenceWorker_p0-w0: stopping experience collection (33500 times) [2024-03-21 08:42:24,144][04017] InferenceWorker_p0-w0: resuming experience collection (33500 times) [2024-03-21 08:42:24,825][04017] Updated weights for policy 0, policy_version 50767 (0.0014) [2024-03-21 08:42:25,521][03784] Fps is (10 sec: 62259.7, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1663565824. Throughput: 0: 47309.0. Samples: 1664821900. Policy #0 lag: (min: 0.0, avg: 50.5, max: 108.0) [2024-03-21 08:42:25,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 08:42:29,472][04017] Updated weights for policy 0, policy_version 50777 (0.0011) [2024-03-21 08:42:30,521][03784] Fps is (10 sec: 72090.8, 60 sec: 47513.7, 300 sec: 45653.1). Total num frames: 1663926272. Throughput: 0: 46800.2. Samples: 1664952500. Policy #0 lag: (min: 0.0, avg: 50.5, max: 108.0) [2024-03-21 08:42:30,522][03784] Avg episode reward: [(0, '1.323')] [2024-03-21 08:42:35,457][04017] Updated weights for policy 0, policy_version 50787 (0.0011) [2024-03-21 08:42:35,521][03784] Fps is (10 sec: 62259.1, 60 sec: 49698.2, 300 sec: 45986.3). Total num frames: 1664188416. Throughput: 0: 46786.8. 
Samples: 1665247000. Policy #0 lag: (min: 0.0, avg: 50.5, max: 108.0) [2024-03-21 08:42:35,522][03784] Avg episode reward: [(0, '1.601')] [2024-03-21 08:42:40,521][03784] Fps is (10 sec: 42597.9, 60 sec: 49698.1, 300 sec: 45875.2). Total num frames: 1664352256. Throughput: 0: 47128.9. Samples: 1665532000. Policy #0 lag: (min: 0.0, avg: 50.5, max: 108.0) [2024-03-21 08:42:40,522][03784] Avg episode reward: [(0, '1.498')] [2024-03-21 08:42:43,251][04017] Updated weights for policy 0, policy_version 50797 (0.0010) [2024-03-21 08:42:45,521][03784] Fps is (10 sec: 42598.1, 60 sec: 52428.8, 300 sec: 45875.2). Total num frames: 1664614400. Throughput: 0: 47108.9. Samples: 1665667400. Policy #0 lag: (min: 0.0, avg: 50.5, max: 108.0) [2024-03-21 08:42:45,522][03784] Avg episode reward: [(0, '1.498')] [2024-03-21 08:42:50,521][03784] Fps is (10 sec: 45874.5, 60 sec: 50244.2, 300 sec: 45875.2). Total num frames: 1664811008. Throughput: 0: 46359.8. Samples: 1665933800. Policy #0 lag: (min: 0.0, avg: 50.5, max: 108.0) [2024-03-21 08:42:50,522][03784] Avg episode reward: [(0, '0.940')] [2024-03-21 08:42:50,741][04017] Updated weights for policy 0, policy_version 50807 (0.0011) [2024-03-21 08:42:55,521][03784] Fps is (10 sec: 45875.6, 60 sec: 49698.2, 300 sec: 45986.3). Total num frames: 1665073152. Throughput: 0: 45975.6. Samples: 1666217300. Policy #0 lag: (min: 0.0, avg: 39.3, max: 83.0) [2024-03-21 08:42:55,522][03784] Avg episode reward: [(0, '1.538')] [2024-03-21 08:43:00,521][03784] Fps is (10 sec: 36045.1, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 1665171456. Throughput: 0: 46459.9. Samples: 1666375700. Policy #0 lag: (min: 0.0, avg: 39.3, max: 83.0) [2024-03-21 08:43:00,522][03784] Avg episode reward: [(0, '0.977')] [2024-03-21 08:43:00,610][04017] Updated weights for policy 0, policy_version 50818 (0.0011) [2024-03-21 08:43:05,521][03784] Fps is (10 sec: 22937.7, 60 sec: 46967.6, 300 sec: 45986.3). Total num frames: 1665302528. Throughput: 0: 46369.1. 
Samples: 1666641700. Policy #0 lag: (min: 0.0, avg: 39.3, max: 83.0) [2024-03-21 08:43:05,522][03784] Avg episode reward: [(0, '1.063')] [2024-03-21 08:43:10,521][03784] Fps is (10 sec: 26214.7, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 1665433600. Throughput: 0: 47173.3. Samples: 1666944700. Policy #0 lag: (min: 0.0, avg: 39.3, max: 83.0) [2024-03-21 08:43:10,522][03784] Avg episode reward: [(0, '1.460')] [2024-03-21 08:43:14,965][04017] Updated weights for policy 0, policy_version 50828 (0.0019) [2024-03-21 08:43:15,521][03784] Fps is (10 sec: 22937.4, 60 sec: 43144.5, 300 sec: 45430.9). Total num frames: 1665531904. Throughput: 0: 47819.9. Samples: 1667104400. Policy #0 lag: (min: 0.0, avg: 39.3, max: 83.0) [2024-03-21 08:43:15,522][03784] Avg episode reward: [(0, '1.460')] [2024-03-21 08:43:20,521][03784] Fps is (10 sec: 36045.2, 60 sec: 43144.7, 300 sec: 45430.9). Total num frames: 1665794048. Throughput: 0: 47429.0. Samples: 1667381300. Policy #0 lag: (min: 0.0, avg: 39.3, max: 83.0) [2024-03-21 08:43:20,521][03784] Avg episode reward: [(0, '1.167')] [2024-03-21 08:43:21,011][04017] Updated weights for policy 0, policy_version 50838 (0.0021) [2024-03-21 08:43:22,454][03995] Signal inference workers to stop experience collection... (33550 times) [2024-03-21 08:43:22,528][03995] Signal inference workers to resume experience collection... (33550 times) [2024-03-21 08:43:22,533][04017] InferenceWorker_p0-w0: stopping experience collection (33550 times) [2024-03-21 08:43:22,590][04017] InferenceWorker_p0-w0: resuming experience collection (33550 times) [2024-03-21 08:43:24,660][04017] Updated weights for policy 0, policy_version 50848 (0.0021) [2024-03-21 08:43:25,521][03784] Fps is (10 sec: 75366.1, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 1666285568. Throughput: 0: 46753.3. Samples: 1667635900. 
Policy #0 lag: (min: 3.0, avg: 37.7, max: 100.0) [2024-03-21 08:43:25,522][03784] Avg episode reward: [(0, '0.815')] [2024-03-21 08:43:28,429][04017] Updated weights for policy 0, policy_version 50858 (0.0017) [2024-03-21 08:43:30,521][03784] Fps is (10 sec: 78642.6, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1666580480. Throughput: 0: 46369.0. Samples: 1667754000. Policy #0 lag: (min: 3.0, avg: 37.7, max: 100.0) [2024-03-21 08:43:30,522][03784] Avg episode reward: [(0, '0.815')] [2024-03-21 08:43:33,528][04017] Updated weights for policy 0, policy_version 50868 (0.0018) [2024-03-21 08:43:35,521][03784] Fps is (10 sec: 65536.5, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1666940928. Throughput: 0: 46606.9. Samples: 1668031100. Policy #0 lag: (min: 3.0, avg: 37.7, max: 100.0) [2024-03-21 08:43:35,522][03784] Avg episode reward: [(0, '0.815')] [2024-03-21 08:43:39,219][04017] Updated weights for policy 0, policy_version 50878 (0.0016) [2024-03-21 08:43:40,521][03784] Fps is (10 sec: 68812.2, 60 sec: 48605.9, 300 sec: 46208.4). Total num frames: 1667268608. Throughput: 0: 46099.9. Samples: 1668291800. Policy #0 lag: (min: 3.0, avg: 37.7, max: 100.0) [2024-03-21 08:43:40,522][03784] Avg episode reward: [(0, '1.119')] [2024-03-21 08:43:44,766][04017] Updated weights for policy 0, policy_version 50888 (0.0011) [2024-03-21 08:43:45,521][03784] Fps is (10 sec: 58981.8, 60 sec: 48605.8, 300 sec: 46652.7). Total num frames: 1667530752. Throughput: 0: 45591.1. Samples: 1668427300. Policy #0 lag: (min: 3.0, avg: 37.7, max: 100.0) [2024-03-21 08:43:45,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 08:43:50,521][03784] Fps is (10 sec: 29491.4, 60 sec: 45875.4, 300 sec: 46097.4). Total num frames: 1667563520. Throughput: 0: 46086.6. Samples: 1668715600. 
Policy #0 lag: (min: 3.0, avg: 37.7, max: 100.0) [2024-03-21 08:43:50,522][03784] Avg episode reward: [(0, '1.385')] [2024-03-21 08:43:55,521][03784] Fps is (10 sec: 19661.0, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1667727360. Throughput: 0: 45351.1. Samples: 1668985500. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 08:43:55,522][03784] Avg episode reward: [(0, '1.474')] [2024-03-21 08:43:57,600][04017] Updated weights for policy 0, policy_version 50898 (0.0010) [2024-03-21 08:44:00,521][03784] Fps is (10 sec: 32767.7, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 1667891200. Throughput: 0: 44764.4. Samples: 1669118800. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 08:44:00,522][03784] Avg episode reward: [(0, '1.382')] [2024-03-21 08:44:00,580][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050901_1667923968.pth... [2024-03-21 08:44:00,677][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050567_1656979456.pth [2024-03-21 08:44:05,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1668120576. Throughput: 0: 45379.8. Samples: 1669423400. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 08:44:05,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 08:44:06,689][04017] Updated weights for policy 0, policy_version 50908 (0.0010) [2024-03-21 08:44:10,521][03784] Fps is (10 sec: 32768.1, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1668218880. Throughput: 0: 46140.0. Samples: 1669712200. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 08:44:10,522][03784] Avg episode reward: [(0, '1.281')] [2024-03-21 08:44:13,271][03995] Signal inference workers to stop experience collection... (33600 times) [2024-03-21 08:44:13,271][03995] Signal inference workers to resume experience collection... 
(33600 times) [2024-03-21 08:44:13,353][04017] InferenceWorker_p0-w0: stopping experience collection (33600 times) [2024-03-21 08:44:13,354][04017] InferenceWorker_p0-w0: resuming experience collection (33600 times) [2024-03-21 08:44:15,521][03784] Fps is (10 sec: 32768.3, 60 sec: 48605.9, 300 sec: 45208.7). Total num frames: 1668448256. Throughput: 0: 46397.8. Samples: 1669841900. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 08:44:15,522][03784] Avg episode reward: [(0, '1.240')] [2024-03-21 08:44:18,047][04017] Updated weights for policy 0, policy_version 50918 (0.0012) [2024-03-21 08:44:20,521][03784] Fps is (10 sec: 36044.9, 60 sec: 46421.2, 300 sec: 44542.3). Total num frames: 1668579328. Throughput: 0: 46131.1. Samples: 1670107000. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 08:44:20,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 08:44:23,020][04017] Updated weights for policy 0, policy_version 50928 (0.0011) [2024-03-21 08:44:25,521][03784] Fps is (10 sec: 58982.2, 60 sec: 45875.3, 300 sec: 45653.0). Total num frames: 1669038080. Throughput: 0: 45642.3. Samples: 1670345700. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 08:44:25,522][03784] Avg episode reward: [(0, '1.647')] [2024-03-21 08:44:28,000][04017] Updated weights for policy 0, policy_version 50938 (0.0019) [2024-03-21 08:44:30,521][03784] Fps is (10 sec: 78642.7, 60 sec: 46421.2, 300 sec: 46097.3). Total num frames: 1669365760. Throughput: 0: 45904.4. Samples: 1670493000. Policy #0 lag: (min: 1.0, avg: 44.1, max: 97.0) [2024-03-21 08:44:30,522][03784] Avg episode reward: [(0, '1.572')] [2024-03-21 08:44:31,581][04017] Updated weights for policy 0, policy_version 50948 (0.0012) [2024-03-21 08:44:35,521][03784] Fps is (10 sec: 72089.2, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 1669758976. Throughput: 0: 45217.7. Samples: 1670750400. 
Policy #0 lag: (min: 1.0, avg: 44.1, max: 97.0) [2024-03-21 08:44:35,522][03784] Avg episode reward: [(0, '0.967')] [2024-03-21 08:44:38,244][04017] Updated weights for policy 0, policy_version 50958 (0.0030) [2024-03-21 08:44:40,521][03784] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 1669890048. Throughput: 0: 45391.0. Samples: 1671028100. Policy #0 lag: (min: 1.0, avg: 44.1, max: 97.0) [2024-03-21 08:44:40,522][03784] Avg episode reward: [(0, '1.464')] [2024-03-21 08:44:45,521][03784] Fps is (10 sec: 22937.8, 60 sec: 40960.1, 300 sec: 45430.9). Total num frames: 1669988352. Throughput: 0: 45600.1. Samples: 1671170800. Policy #0 lag: (min: 1.0, avg: 44.1, max: 97.0) [2024-03-21 08:44:45,522][03784] Avg episode reward: [(0, '1.111')] [2024-03-21 08:44:47,423][04017] Updated weights for policy 0, policy_version 50968 (0.0014) [2024-03-21 08:44:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1670250496. Throughput: 0: 45235.5. Samples: 1671459000. Policy #0 lag: (min: 1.0, avg: 44.1, max: 97.0) [2024-03-21 08:44:50,522][03784] Avg episode reward: [(0, '1.598')] [2024-03-21 08:44:53,248][04017] Updated weights for policy 0, policy_version 50978 (0.0017) [2024-03-21 08:44:55,521][03784] Fps is (10 sec: 52428.4, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 1670512640. Throughput: 0: 44128.9. Samples: 1671698000. Policy #0 lag: (min: 1.0, avg: 44.1, max: 97.0) [2024-03-21 08:44:55,522][03784] Avg episode reward: [(0, '1.082')] [2024-03-21 08:45:00,521][03784] Fps is (10 sec: 29491.1, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1670545408. Throughput: 0: 43948.7. Samples: 1671819600. Policy #0 lag: (min: 1.0, avg: 44.1, max: 97.0) [2024-03-21 08:45:00,522][03784] Avg episode reward: [(0, '1.042')] [2024-03-21 08:45:05,521][03784] Fps is (10 sec: 16384.1, 60 sec: 42598.4, 300 sec: 44097.9). Total num frames: 1670676480. Throughput: 0: 43840.0. Samples: 1672079800. 
Policy #0 lag: (min: 0.0, avg: 26.1, max: 88.0) [2024-03-21 08:45:05,522][03784] Avg episode reward: [(0, '0.547')] [2024-03-21 08:45:06,453][03995] Signal inference workers to stop experience collection... (33650 times) [2024-03-21 08:45:06,454][03995] Signal inference workers to resume experience collection... (33650 times) [2024-03-21 08:45:06,504][04017] InferenceWorker_p0-w0: stopping experience collection (33650 times) [2024-03-21 08:45:06,504][04017] InferenceWorker_p0-w0: resuming experience collection (33650 times) [2024-03-21 08:45:06,833][04017] Updated weights for policy 0, policy_version 50988 (0.0016) [2024-03-21 08:45:10,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46967.3, 300 sec: 44764.4). Total num frames: 1671036928. Throughput: 0: 43948.7. Samples: 1672323400. Policy #0 lag: (min: 0.0, avg: 26.1, max: 88.0) [2024-03-21 08:45:10,523][03784] Avg episode reward: [(0, '1.445')] [2024-03-21 08:45:11,185][04017] Updated weights for policy 0, policy_version 50998 (0.0011) [2024-03-21 08:45:15,521][03784] Fps is (10 sec: 62258.1, 60 sec: 47513.4, 300 sec: 45097.6). Total num frames: 1671299072. Throughput: 0: 43473.3. Samples: 1672449300. Policy #0 lag: (min: 0.0, avg: 26.1, max: 88.0) [2024-03-21 08:45:15,522][03784] Avg episode reward: [(0, '1.054')] [2024-03-21 08:45:17,865][04017] Updated weights for policy 0, policy_version 51009 (0.0010) [2024-03-21 08:45:20,521][03784] Fps is (10 sec: 65536.9, 60 sec: 51882.6, 300 sec: 45875.2). Total num frames: 1671692288. Throughput: 0: 43611.1. Samples: 1672712900. Policy #0 lag: (min: 0.0, avg: 26.1, max: 88.0) [2024-03-21 08:45:20,522][03784] Avg episode reward: [(0, '0.785')] [2024-03-21 08:45:22,476][04017] Updated weights for policy 0, policy_version 51019 (0.0010) [2024-03-21 08:45:25,521][03784] Fps is (10 sec: 55706.3, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 1671856128. Throughput: 0: 43786.7. Samples: 1672998500. 
Policy #0 lag: (min: 0.0, avg: 26.1, max: 88.0) [2024-03-21 08:45:25,522][03784] Avg episode reward: [(0, '1.367')] [2024-03-21 08:45:30,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1671987200. Throughput: 0: 43622.1. Samples: 1673133800. Policy #0 lag: (min: 0.0, avg: 26.1, max: 88.0) [2024-03-21 08:45:30,523][03784] Avg episode reward: [(0, '1.065')] [2024-03-21 08:45:34,169][04017] Updated weights for policy 0, policy_version 51029 (0.0017) [2024-03-21 08:45:35,521][03784] Fps is (10 sec: 32768.1, 60 sec: 40413.9, 300 sec: 45875.2). Total num frames: 1672183808. Throughput: 0: 43720.1. Samples: 1673426400. Policy #0 lag: (min: 0.0, avg: 33.6, max: 65.0) [2024-03-21 08:45:35,522][03784] Avg episode reward: [(0, '1.065')] [2024-03-21 08:45:40,521][03784] Fps is (10 sec: 32768.1, 60 sec: 40413.9, 300 sec: 45986.3). Total num frames: 1672314880. Throughput: 0: 44635.6. Samples: 1673706600. Policy #0 lag: (min: 0.0, avg: 33.6, max: 65.0) [2024-03-21 08:45:40,522][03784] Avg episode reward: [(0, '0.812')] [2024-03-21 08:45:43,100][04017] Updated weights for policy 0, policy_version 51039 (0.0010) [2024-03-21 08:45:45,521][03784] Fps is (10 sec: 29491.1, 60 sec: 41506.1, 300 sec: 45764.1). Total num frames: 1672478720. Throughput: 0: 44824.5. Samples: 1673836700. Policy #0 lag: (min: 0.0, avg: 33.6, max: 65.0) [2024-03-21 08:45:45,522][03784] Avg episode reward: [(0, '1.190')] [2024-03-21 08:45:50,521][03784] Fps is (10 sec: 26214.5, 60 sec: 38775.5, 300 sec: 44653.3). Total num frames: 1672577024. Throughput: 0: 45100.0. Samples: 1674109300. Policy #0 lag: (min: 0.0, avg: 33.6, max: 65.0) [2024-03-21 08:45:50,522][03784] Avg episode reward: [(0, '0.642')] [2024-03-21 08:45:54,096][04017] Updated weights for policy 0, policy_version 51049 (0.0012) [2024-03-21 08:45:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 39867.7, 300 sec: 44320.1). Total num frames: 1672904704. Throughput: 0: 45413.4. Samples: 1674367000. 
Policy #0 lag: (min: 0.0, avg: 33.6, max: 65.0) [2024-03-21 08:45:55,522][03784] Avg episode reward: [(0, '1.345')] [2024-03-21 08:45:56,998][03995] Signal inference workers to stop experience collection... (33700 times) [2024-03-21 08:45:57,053][04017] InferenceWorker_p0-w0: stopping experience collection (33700 times) [2024-03-21 08:45:57,315][03995] Signal inference workers to resume experience collection... (33700 times) [2024-03-21 08:45:57,315][04017] InferenceWorker_p0-w0: resuming experience collection (33700 times) [2024-03-21 08:45:57,638][04017] Updated weights for policy 0, policy_version 51059 (0.0021) [2024-03-21 08:46:00,521][03784] Fps is (10 sec: 68812.8, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 1673265152. Throughput: 0: 44882.4. Samples: 1674469000. Policy #0 lag: (min: 0.0, avg: 33.6, max: 65.0) [2024-03-21 08:46:00,531][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 08:46:00,543][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051064_1673265152.pth... [2024-03-21 08:46:00,672][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050732_1662386176.pth [2024-03-21 08:46:04,000][04017] Updated weights for policy 0, policy_version 51069 (0.0012) [2024-03-21 08:46:05,521][03784] Fps is (10 sec: 65536.9, 60 sec: 48059.8, 300 sec: 45097.7). Total num frames: 1673560064. Throughput: 0: 45402.3. Samples: 1674756000. Policy #0 lag: (min: 3.0, avg: 40.3, max: 85.0) [2024-03-21 08:46:05,522][03784] Avg episode reward: [(0, '1.309')] [2024-03-21 08:46:08,886][04017] Updated weights for policy 0, policy_version 51079 (0.0012) [2024-03-21 08:46:10,521][03784] Fps is (10 sec: 65535.4, 60 sec: 48059.8, 300 sec: 45542.0). Total num frames: 1673920512. Throughput: 0: 44879.9. Samples: 1675018100. 
Policy #0 lag: (min: 3.0, avg: 40.3, max: 85.0) [2024-03-21 08:46:10,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 08:46:14,756][04017] Updated weights for policy 0, policy_version 51089 (0.0018) [2024-03-21 08:46:15,521][03784] Fps is (10 sec: 55705.5, 60 sec: 46967.6, 300 sec: 45430.9). Total num frames: 1674117120. Throughput: 0: 45289.0. Samples: 1675171800. Policy #0 lag: (min: 3.0, avg: 40.3, max: 85.0) [2024-03-21 08:46:15,522][03784] Avg episode reward: [(0, '1.175')] [2024-03-21 08:46:20,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1674379264. Throughput: 0: 44533.2. Samples: 1675430400. Policy #0 lag: (min: 3.0, avg: 40.3, max: 85.0) [2024-03-21 08:46:20,522][03784] Avg episode reward: [(0, '1.188')] [2024-03-21 08:46:20,844][04017] Updated weights for policy 0, policy_version 51099 (0.0013) [2024-03-21 08:46:25,521][03784] Fps is (10 sec: 52428.7, 60 sec: 46421.4, 300 sec: 45986.3). Total num frames: 1674641408. Throughput: 0: 43564.5. Samples: 1675667000. Policy #0 lag: (min: 3.0, avg: 40.3, max: 85.0) [2024-03-21 08:46:25,522][03784] Avg episode reward: [(0, '1.188')] [2024-03-21 08:46:28,571][04017] Updated weights for policy 0, policy_version 51109 (0.0024) [2024-03-21 08:46:30,521][03784] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46208.4). Total num frames: 1674838016. Throughput: 0: 43500.0. Samples: 1675794200. Policy #0 lag: (min: 3.0, avg: 40.3, max: 85.0) [2024-03-21 08:46:30,522][03784] Avg episode reward: [(0, '1.448')] [2024-03-21 08:46:35,521][03784] Fps is (10 sec: 26214.1, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 1674903552. Throughput: 0: 43506.6. Samples: 1676067100. 
Policy #0 lag: (min: 3.0, avg: 40.3, max: 85.0) [2024-03-21 08:46:35,522][03784] Avg episode reward: [(0, '0.775')] [2024-03-21 08:46:39,345][04017] Updated weights for policy 0, policy_version 51119 (0.0012) [2024-03-21 08:46:40,521][03784] Fps is (10 sec: 26214.6, 60 sec: 46421.4, 300 sec: 46208.4). Total num frames: 1675100160. Throughput: 0: 44106.8. Samples: 1676351800. Policy #0 lag: (min: 0.0, avg: 38.5, max: 109.0) [2024-03-21 08:46:40,522][03784] Avg episode reward: [(0, '1.344')] [2024-03-21 08:46:45,521][03784] Fps is (10 sec: 26214.9, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1675165696. Throughput: 0: 45000.1. Samples: 1676494000. Policy #0 lag: (min: 0.0, avg: 38.5, max: 109.0) [2024-03-21 08:46:45,521][03784] Avg episode reward: [(0, '1.119')] [2024-03-21 08:46:50,521][03784] Fps is (10 sec: 26214.3, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 1675362304. Throughput: 0: 45142.2. Samples: 1676787400. Policy #0 lag: (min: 0.0, avg: 38.5, max: 109.0) [2024-03-21 08:46:50,522][03784] Avg episode reward: [(0, '1.372')] [2024-03-21 08:46:51,806][04017] Updated weights for policy 0, policy_version 51129 (0.0010) [2024-03-21 08:46:52,887][03995] Signal inference workers to stop experience collection... (33750 times) [2024-03-21 08:46:52,888][03995] Signal inference workers to resume experience collection... (33750 times) [2024-03-21 08:46:52,949][04017] InferenceWorker_p0-w0: stopping experience collection (33750 times) [2024-03-21 08:46:52,950][04017] InferenceWorker_p0-w0: resuming experience collection (33750 times) [2024-03-21 08:46:55,521][03784] Fps is (10 sec: 42598.1, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 1675591680. Throughput: 0: 45449.0. Samples: 1677063300. 
Policy #0 lag: (min: 0.0, avg: 38.5, max: 109.0) [2024-03-21 08:46:55,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 08:46:56,659][04017] Updated weights for policy 0, policy_version 51139 (0.0012) [2024-03-21 08:47:00,521][03784] Fps is (10 sec: 55705.6, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1675919360. Throughput: 0: 44997.7. Samples: 1677196700. Policy #0 lag: (min: 0.0, avg: 38.5, max: 109.0) [2024-03-21 08:47:00,522][03784] Avg episode reward: [(0, '0.835')] [2024-03-21 08:47:03,359][04017] Updated weights for policy 0, policy_version 51149 (0.0015) [2024-03-21 08:47:05,521][03784] Fps is (10 sec: 62257.6, 60 sec: 44236.6, 300 sec: 45986.2). Total num frames: 1676214272. Throughput: 0: 45415.4. Samples: 1677474100. Policy #0 lag: (min: 0.0, avg: 38.5, max: 109.0) [2024-03-21 08:47:05,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 08:47:08,201][04017] Updated weights for policy 0, policy_version 51159 (0.0016) [2024-03-21 08:47:10,521][03784] Fps is (10 sec: 49151.6, 60 sec: 41506.1, 300 sec: 45653.0). Total num frames: 1676410880. Throughput: 0: 46679.9. Samples: 1677767600. Policy #0 lag: (min: 0.0, avg: 38.5, max: 109.0) [2024-03-21 08:47:10,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 08:47:15,521][03784] Fps is (10 sec: 36045.7, 60 sec: 40960.0, 300 sec: 45319.8). Total num frames: 1676574720. Throughput: 0: 46966.7. Samples: 1677907700. Policy #0 lag: (min: 0.0, avg: 35.1, max: 73.0) [2024-03-21 08:47:15,522][03784] Avg episode reward: [(0, '0.936')] [2024-03-21 08:47:17,432][04017] Updated weights for policy 0, policy_version 51169 (0.0012) [2024-03-21 08:47:20,521][03784] Fps is (10 sec: 52428.6, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 1676935168. Throughput: 0: 46786.6. Samples: 1678172500. 
Policy #0 lag: (min: 0.0, avg: 35.1, max: 73.0) [2024-03-21 08:47:20,522][03784] Avg episode reward: [(0, '1.079')] [2024-03-21 08:47:21,448][04017] Updated weights for policy 0, policy_version 51179 (0.0012) [2024-03-21 08:47:25,124][04017] Updated weights for policy 0, policy_version 51189 (0.0010) [2024-03-21 08:47:25,521][03784] Fps is (10 sec: 78642.9, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 1677361152. Throughput: 0: 46066.6. Samples: 1678424800. Policy #0 lag: (min: 0.0, avg: 35.1, max: 73.0) [2024-03-21 08:47:25,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-21 08:47:30,521][03784] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 44653.3). Total num frames: 1677361152. Throughput: 0: 46575.3. Samples: 1678589900. Policy #0 lag: (min: 0.0, avg: 35.1, max: 73.0) [2024-03-21 08:47:30,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-21 08:47:35,521][03784] Fps is (10 sec: 13107.2, 60 sec: 43144.6, 300 sec: 44542.3). Total num frames: 1677492224. Throughput: 0: 46586.6. Samples: 1678883800. Policy #0 lag: (min: 0.0, avg: 35.1, max: 73.0) [2024-03-21 08:47:35,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 08:47:39,340][04017] Updated weights for policy 0, policy_version 51199 (0.0010) [2024-03-21 08:47:39,584][03995] Signal inference workers to stop experience collection... (33800 times) [2024-03-21 08:47:39,652][03995] Signal inference workers to resume experience collection... (33800 times) [2024-03-21 08:47:39,706][04017] InferenceWorker_p0-w0: stopping experience collection (33800 times) [2024-03-21 08:47:39,707][04017] InferenceWorker_p0-w0: resuming experience collection (33800 times) [2024-03-21 08:47:40,521][03784] Fps is (10 sec: 39321.9, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 1677754368. Throughput: 0: 46202.1. Samples: 1679142400. 
Policy #0 lag: (min: 0.0, avg: 35.1, max: 73.0) [2024-03-21 08:47:40,522][03784] Avg episode reward: [(0, '1.374')] [2024-03-21 08:47:44,629][04017] Updated weights for policy 0, policy_version 51209 (0.0012) [2024-03-21 08:47:45,521][03784] Fps is (10 sec: 58983.2, 60 sec: 48605.9, 300 sec: 44986.6). Total num frames: 1678082048. Throughput: 0: 45775.7. Samples: 1679256600. Policy #0 lag: (min: 0.0, avg: 48.4, max: 107.0) [2024-03-21 08:47:45,521][03784] Avg episode reward: [(0, '1.431')] [2024-03-21 08:47:49,219][04017] Updated weights for policy 0, policy_version 51219 (0.0016) [2024-03-21 08:47:50,521][03784] Fps is (10 sec: 65536.6, 60 sec: 50790.4, 300 sec: 45208.7). Total num frames: 1678409728. Throughput: 0: 44895.8. Samples: 1679494400. Policy #0 lag: (min: 0.0, avg: 48.4, max: 107.0) [2024-03-21 08:47:50,522][03784] Avg episode reward: [(0, '0.397')] [2024-03-21 08:47:55,521][03784] Fps is (10 sec: 55704.9, 60 sec: 50790.4, 300 sec: 45653.1). Total num frames: 1678639104. Throughput: 0: 44835.6. Samples: 1679785200. Policy #0 lag: (min: 0.0, avg: 48.4, max: 107.0) [2024-03-21 08:47:55,522][03784] Avg episode reward: [(0, '0.685')] [2024-03-21 08:47:56,883][04017] Updated weights for policy 0, policy_version 51229 (0.0016) [2024-03-21 08:48:00,521][03784] Fps is (10 sec: 36045.1, 60 sec: 47513.7, 300 sec: 45653.0). Total num frames: 1678770176. Throughput: 0: 44951.2. Samples: 1679930500. Policy #0 lag: (min: 0.0, avg: 48.4, max: 107.0) [2024-03-21 08:48:00,521][03784] Avg episode reward: [(0, '0.685')] [2024-03-21 08:48:00,857][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051234_1678835712.pth... [2024-03-21 08:48:00,995][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000050901_1667923968.pth [2024-03-21 08:48:04,985][04017] Updated weights for policy 0, policy_version 51239 (0.0019) [2024-03-21 08:48:05,521][03784] Fps is (10 sec: 39322.0, 60 sec: 46967.7, 300 sec: 46097.4). 
Total num frames: 1679032320. Throughput: 0: 45658.0. Samples: 1680227100. Policy #0 lag: (min: 0.0, avg: 48.4, max: 107.0) [2024-03-21 08:48:05,522][03784] Avg episode reward: [(0, '0.685')] [2024-03-21 08:48:10,521][03784] Fps is (10 sec: 42598.2, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 1679196160. Throughput: 0: 46211.2. Samples: 1680504300. Policy #0 lag: (min: 0.0, avg: 48.4, max: 107.0) [2024-03-21 08:48:10,522][03784] Avg episode reward: [(0, '1.263')] [2024-03-21 08:48:13,870][04017] Updated weights for policy 0, policy_version 51249 (0.0012) [2024-03-21 08:48:15,521][03784] Fps is (10 sec: 36044.3, 60 sec: 46967.4, 300 sec: 46097.3). Total num frames: 1679392768. Throughput: 0: 45609.0. Samples: 1680642300. Policy #0 lag: (min: 0.0, avg: 48.4, max: 107.0) [2024-03-21 08:48:15,522][03784] Avg episode reward: [(0, '0.766')] [2024-03-21 08:48:20,521][03784] Fps is (10 sec: 26214.1, 60 sec: 42052.3, 300 sec: 44653.3). Total num frames: 1679458304. Throughput: 0: 45886.6. Samples: 1680948700. Policy #0 lag: (min: 0.0, avg: 42.6, max: 100.0) [2024-03-21 08:48:20,522][03784] Avg episode reward: [(0, '1.561')] [2024-03-21 08:48:25,521][03784] Fps is (10 sec: 16384.1, 60 sec: 36591.0, 300 sec: 43986.9). Total num frames: 1679556608. Throughput: 0: 46124.5. Samples: 1681218000. Policy #0 lag: (min: 0.0, avg: 42.6, max: 100.0) [2024-03-21 08:48:25,522][03784] Avg episode reward: [(0, '1.004')] [2024-03-21 08:48:27,281][04017] Updated weights for policy 0, policy_version 51259 (0.0012) [2024-03-21 08:48:30,521][03784] Fps is (10 sec: 49152.9, 60 sec: 43144.7, 300 sec: 44098.0). Total num frames: 1679949824. Throughput: 0: 46404.4. Samples: 1681344800. Policy #0 lag: (min: 0.0, avg: 42.6, max: 100.0) [2024-03-21 08:48:30,521][03784] Avg episode reward: [(0, '1.437')] [2024-03-21 08:48:30,801][04017] Updated weights for policy 0, policy_version 51269 (0.0025) [2024-03-21 08:48:31,824][03995] Signal inference workers to stop experience collection... 
(33850 times) [2024-03-21 08:48:31,890][04017] InferenceWorker_p0-w0: stopping experience collection (33850 times) [2024-03-21 08:48:32,092][03995] Signal inference workers to resume experience collection... (33850 times) [2024-03-21 08:48:32,092][04017] InferenceWorker_p0-w0: resuming experience collection (33850 times) [2024-03-21 08:48:33,993][04017] Updated weights for policy 0, policy_version 51279 (0.0011) [2024-03-21 08:48:35,521][03784] Fps is (10 sec: 85196.6, 60 sec: 48605.9, 300 sec: 44542.3). Total num frames: 1680408576. Throughput: 0: 46326.7. Samples: 1681579100. Policy #0 lag: (min: 0.0, avg: 42.6, max: 100.0) [2024-03-21 08:48:35,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 08:48:40,521][03784] Fps is (10 sec: 62257.9, 60 sec: 46967.5, 300 sec: 44209.0). Total num frames: 1680572416. Throughput: 0: 45917.7. Samples: 1681851500. Policy #0 lag: (min: 0.0, avg: 42.6, max: 100.0) [2024-03-21 08:48:40,522][03784] Avg episode reward: [(0, '1.297')] [2024-03-21 08:48:41,241][04017] Updated weights for policy 0, policy_version 51289 (0.0016) [2024-03-21 08:48:45,521][03784] Fps is (10 sec: 52428.2, 60 sec: 47513.4, 300 sec: 45319.8). Total num frames: 1680932864. Throughput: 0: 45537.6. Samples: 1681979700. Policy #0 lag: (min: 0.0, avg: 42.6, max: 100.0) [2024-03-21 08:48:45,522][03784] Avg episode reward: [(0, '1.110')] [2024-03-21 08:48:46,073][04017] Updated weights for policy 0, policy_version 51299 (0.0020) [2024-03-21 08:48:50,521][03784] Fps is (10 sec: 55706.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1681129472. Throughput: 0: 45277.7. Samples: 1682264600. Policy #0 lag: (min: 0.0, avg: 42.6, max: 100.0) [2024-03-21 08:48:50,522][03784] Avg episode reward: [(0, '1.492')] [2024-03-21 08:48:52,012][04017] Updated weights for policy 0, policy_version 51309 (0.0024) [2024-03-21 08:48:55,521][03784] Fps is (10 sec: 52429.6, 60 sec: 46967.5, 300 sec: 45986.3). Total num frames: 1681457152. Throughput: 0: 44855.6. 
Samples: 1682522800. Policy #0 lag: (min: 2.0, avg: 49.8, max: 96.0) [2024-03-21 08:48:55,522][03784] Avg episode reward: [(0, '0.909')] [2024-03-21 08:48:57,463][04017] Updated weights for policy 0, policy_version 51319 (0.0011) [2024-03-21 08:49:00,521][03784] Fps is (10 sec: 52428.5, 60 sec: 48059.6, 300 sec: 45875.2). Total num frames: 1681653760. Throughput: 0: 44673.3. Samples: 1682652600. Policy #0 lag: (min: 2.0, avg: 49.8, max: 96.0) [2024-03-21 08:49:00,522][03784] Avg episode reward: [(0, '0.630')] [2024-03-21 08:49:05,521][03784] Fps is (10 sec: 22937.7, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 1681686528. Throughput: 0: 44731.2. Samples: 1682961600. Policy #0 lag: (min: 2.0, avg: 49.8, max: 96.0) [2024-03-21 08:49:05,522][03784] Avg episode reward: [(0, '1.016')] [2024-03-21 08:49:10,521][03784] Fps is (10 sec: 26214.7, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 1681915904. Throughput: 0: 45062.2. Samples: 1683245800. Policy #0 lag: (min: 2.0, avg: 49.8, max: 96.0) [2024-03-21 08:49:10,522][03784] Avg episode reward: [(0, '1.420')] [2024-03-21 08:49:11,762][04017] Updated weights for policy 0, policy_version 51329 (0.0011) [2024-03-21 08:49:15,521][03784] Fps is (10 sec: 26214.0, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 1681948672. Throughput: 0: 45408.7. Samples: 1683388200. Policy #0 lag: (min: 2.0, avg: 49.8, max: 96.0) [2024-03-21 08:49:15,522][03784] Avg episode reward: [(0, '0.619')] [2024-03-21 08:49:19,951][04017] Updated weights for policy 0, policy_version 51339 (0.0027) [2024-03-21 08:49:20,521][03784] Fps is (10 sec: 39321.0, 60 sec: 47513.6, 300 sec: 44986.6). Total num frames: 1682309120. Throughput: 0: 46093.2. Samples: 1683653300. Policy #0 lag: (min: 2.0, avg: 49.8, max: 96.0) [2024-03-21 08:49:20,522][03784] Avg episode reward: [(0, '1.227')] [2024-03-21 08:49:21,060][03995] Signal inference workers to stop experience collection... 
(33900 times) [2024-03-21 08:49:21,060][03995] Signal inference workers to resume experience collection... (33900 times) [2024-03-21 08:49:21,131][04017] InferenceWorker_p0-w0: stopping experience collection (33900 times) [2024-03-21 08:49:21,131][04017] InferenceWorker_p0-w0: resuming experience collection (33900 times) [2024-03-21 08:49:25,521][03784] Fps is (10 sec: 55706.1, 60 sec: 49152.0, 300 sec: 44542.3). Total num frames: 1682505728. Throughput: 0: 46291.2. Samples: 1683934600. Policy #0 lag: (min: 2.0, avg: 49.8, max: 96.0) [2024-03-21 08:49:25,522][03784] Avg episode reward: [(0, '0.666')] [2024-03-21 08:49:28,023][04017] Updated weights for policy 0, policy_version 51349 (0.0014) [2024-03-21 08:49:30,521][03784] Fps is (10 sec: 36045.5, 60 sec: 45329.0, 300 sec: 43764.7). Total num frames: 1682669568. Throughput: 0: 46435.8. Samples: 1684069300. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 08:49:30,521][03784] Avg episode reward: [(0, '1.550')] [2024-03-21 08:49:34,192][04017] Updated weights for policy 0, policy_version 51359 (0.0017) [2024-03-21 08:49:35,521][03784] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1683030016. Throughput: 0: 46386.8. Samples: 1684352000. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 08:49:35,521][03784] Avg episode reward: [(0, '1.544')] [2024-03-21 08:49:37,347][04017] Updated weights for policy 0, policy_version 51369 (0.0018) [2024-03-21 08:49:40,521][03784] Fps is (10 sec: 72088.5, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1683390464. Throughput: 0: 46124.3. Samples: 1684598400. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 08:49:40,522][03784] Avg episode reward: [(0, '1.458')] [2024-03-21 08:49:43,551][04017] Updated weights for policy 0, policy_version 51379 (0.0011) [2024-03-21 08:49:45,521][03784] Fps is (10 sec: 55704.7, 60 sec: 44236.9, 300 sec: 45208.7). Total num frames: 1683587072. Throughput: 0: 46048.9. 
Samples: 1684724800. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 08:49:45,522][03784] Avg episode reward: [(0, '1.650')] [2024-03-21 08:49:50,515][04017] Updated weights for policy 0, policy_version 51389 (0.0011) [2024-03-21 08:49:50,521][03784] Fps is (10 sec: 52429.6, 60 sec: 46421.4, 300 sec: 45430.9). Total num frames: 1683914752. Throughput: 0: 45088.9. Samples: 1684990600. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 08:49:50,522][03784] Avg episode reward: [(0, '1.100')] [2024-03-21 08:49:55,521][03784] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 45764.1). Total num frames: 1684045824. Throughput: 0: 45124.5. Samples: 1685276400. Policy #0 lag: (min: 0.0, avg: 44.8, max: 96.0) [2024-03-21 08:49:55,522][03784] Avg episode reward: [(0, '1.070')] [2024-03-21 08:50:00,380][04017] Updated weights for policy 0, policy_version 51399 (0.0019) [2024-03-21 08:50:00,521][03784] Fps is (10 sec: 32767.5, 60 sec: 43144.6, 300 sec: 45986.3). Total num frames: 1684242432. Throughput: 0: 44740.0. Samples: 1685401500. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-21 08:50:00,522][03784] Avg episode reward: [(0, '0.601')] [2024-03-21 08:50:00,706][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051400_1684275200.pth... [2024-03-21 08:50:00,831][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051064_1673265152.pth [2024-03-21 08:50:05,521][03784] Fps is (10 sec: 32767.6, 60 sec: 44782.8, 300 sec: 45208.7). Total num frames: 1684373504. Throughput: 0: 44037.8. Samples: 1685635000. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-21 08:50:05,522][03784] Avg episode reward: [(0, '0.722')] [2024-03-21 08:50:08,815][04017] Updated weights for policy 0, policy_version 51409 (0.0012) [2024-03-21 08:50:10,230][03995] Signal inference workers to stop experience collection... 
(33950 times) [2024-03-21 08:50:10,231][03995] Signal inference workers to resume experience collection... (33950 times) [2024-03-21 08:50:10,310][04017] InferenceWorker_p0-w0: stopping experience collection (33950 times) [2024-03-21 08:50:10,311][04017] InferenceWorker_p0-w0: resuming experience collection (33950 times) [2024-03-21 08:50:10,521][03784] Fps is (10 sec: 45875.6, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 1684701184. Throughput: 0: 43000.0. Samples: 1685869600. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-21 08:50:10,522][03784] Avg episode reward: [(0, '1.119')] [2024-03-21 08:50:14,857][04017] Updated weights for policy 0, policy_version 51419 (0.0015) [2024-03-21 08:50:15,521][03784] Fps is (10 sec: 58983.4, 60 sec: 50244.4, 300 sec: 44986.6). Total num frames: 1684963328. Throughput: 0: 43568.9. Samples: 1686029900. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-21 08:50:15,522][03784] Avg episode reward: [(0, '0.698')] [2024-03-21 08:50:20,521][03784] Fps is (10 sec: 49152.2, 60 sec: 48059.9, 300 sec: 45208.7). Total num frames: 1685192704. Throughput: 0: 43455.5. Samples: 1686307500. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-21 08:50:20,522][03784] Avg episode reward: [(0, '0.698')] [2024-03-21 08:50:22,448][04017] Updated weights for policy 0, policy_version 51429 (0.0012) [2024-03-21 08:50:25,521][03784] Fps is (10 sec: 36044.3, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1685323776. Throughput: 0: 44646.7. Samples: 1686607500. Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-21 08:50:25,522][03784] Avg episode reward: [(0, '0.698')] [2024-03-21 08:50:30,521][03784] Fps is (10 sec: 29491.3, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1685487616. Throughput: 0: 45224.6. Samples: 1686759900. 
Policy #0 lag: (min: 0.0, avg: 33.7, max: 73.0) [2024-03-21 08:50:30,521][03784] Avg episode reward: [(0, '0.698')] [2024-03-21 08:50:30,867][04017] Updated weights for policy 0, policy_version 51439 (0.0011) [2024-03-21 08:50:35,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1685716992. Throughput: 0: 45453.3. Samples: 1687036000. Policy #0 lag: (min: 0.0, avg: 33.2, max: 66.0) [2024-03-21 08:50:35,522][03784] Avg episode reward: [(0, '0.699')] [2024-03-21 08:50:38,048][04017] Updated weights for policy 0, policy_version 51449 (0.0015) [2024-03-21 08:50:40,521][03784] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 45542.0). Total num frames: 1685913600. Throughput: 0: 45086.6. Samples: 1687305300. Policy #0 lag: (min: 0.0, avg: 33.2, max: 66.0) [2024-03-21 08:50:40,522][03784] Avg episode reward: [(0, '0.780')] [2024-03-21 08:50:45,521][03784] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 45875.2). Total num frames: 1686110208. Throughput: 0: 45366.7. Samples: 1687443000. Policy #0 lag: (min: 0.0, avg: 33.2, max: 66.0) [2024-03-21 08:50:45,522][03784] Avg episode reward: [(0, '0.593')] [2024-03-21 08:50:46,881][04017] Updated weights for policy 0, policy_version 51459 (0.0011) [2024-03-21 08:50:50,521][03784] Fps is (10 sec: 52428.0, 60 sec: 42052.1, 300 sec: 45875.2). Total num frames: 1686437888. Throughput: 0: 46204.3. Samples: 1687714200. Policy #0 lag: (min: 0.0, avg: 33.2, max: 66.0) [2024-03-21 08:50:50,522][03784] Avg episode reward: [(0, '1.022')] [2024-03-21 08:50:53,306][04017] Updated weights for policy 0, policy_version 51469 (0.0012) [2024-03-21 08:50:55,521][03784] Fps is (10 sec: 49152.1, 60 sec: 42598.4, 300 sec: 45208.7). Total num frames: 1686601728. Throughput: 0: 46911.1. Samples: 1687980600. 
Policy #0 lag: (min: 0.0, avg: 33.2, max: 66.0) [2024-03-21 08:50:55,522][03784] Avg episode reward: [(0, '0.492')] [2024-03-21 08:51:00,521][03784] Fps is (10 sec: 36045.2, 60 sec: 42598.4, 300 sec: 44875.5). Total num frames: 1686798336. Throughput: 0: 46277.6. Samples: 1688112400. Policy #0 lag: (min: 0.0, avg: 33.2, max: 66.0) [2024-03-21 08:51:00,522][03784] Avg episode reward: [(0, '0.600')] [2024-03-21 08:51:01,035][04017] Updated weights for policy 0, policy_version 51479 (0.0024) [2024-03-21 08:51:03,168][03995] Signal inference workers to stop experience collection... (34000 times) [2024-03-21 08:51:03,237][03995] Signal inference workers to resume experience collection... (34000 times) [2024-03-21 08:51:03,252][04017] InferenceWorker_p0-w0: stopping experience collection (34000 times) [2024-03-21 08:51:03,301][04017] InferenceWorker_p0-w0: resuming experience collection (34000 times) [2024-03-21 08:51:05,521][03784] Fps is (10 sec: 52428.8, 60 sec: 45875.3, 300 sec: 44764.4). Total num frames: 1687126016. Throughput: 0: 45553.3. Samples: 1688357400. Policy #0 lag: (min: 0.0, avg: 33.2, max: 66.0) [2024-03-21 08:51:05,522][03784] Avg episode reward: [(0, '1.033')] [2024-03-21 08:51:06,760][04017] Updated weights for policy 0, policy_version 51489 (0.0011) [2024-03-21 08:51:10,521][03784] Fps is (10 sec: 58982.4, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 1687388160. Throughput: 0: 44944.4. Samples: 1688630000. Policy #0 lag: (min: 0.0, avg: 44.6, max: 84.0) [2024-03-21 08:51:10,522][03784] Avg episode reward: [(0, '1.536')] [2024-03-21 08:51:13,412][04017] Updated weights for policy 0, policy_version 51499 (0.0010) [2024-03-21 08:51:15,521][03784] Fps is (10 sec: 52429.1, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 1687650304. Throughput: 0: 44855.5. Samples: 1688778400. 
Policy #0 lag: (min: 0.0, avg: 44.6, max: 84.0) [2024-03-21 08:51:15,522][03784] Avg episode reward: [(0, '1.499')] [2024-03-21 08:51:19,502][04017] Updated weights for policy 0, policy_version 51509 (0.0019) [2024-03-21 08:51:20,521][03784] Fps is (10 sec: 55706.3, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1687945216. Throughput: 0: 44306.7. Samples: 1689029800. Policy #0 lag: (min: 0.0, avg: 44.6, max: 84.0) [2024-03-21 08:51:20,522][03784] Avg episode reward: [(0, '1.597')] [2024-03-21 08:51:25,123][04017] Updated weights for policy 0, policy_version 51519 (0.0023) [2024-03-21 08:51:25,521][03784] Fps is (10 sec: 55705.0, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1688207360. Throughput: 0: 43875.5. Samples: 1689279700. Policy #0 lag: (min: 0.0, avg: 44.6, max: 84.0) [2024-03-21 08:51:25,522][03784] Avg episode reward: [(0, '1.401')] [2024-03-21 08:51:30,521][03784] Fps is (10 sec: 39321.6, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1688338432. Throughput: 0: 43986.7. Samples: 1689422400. Policy #0 lag: (min: 0.0, avg: 44.6, max: 84.0) [2024-03-21 08:51:30,522][03784] Avg episode reward: [(0, '0.929')] [2024-03-21 08:51:35,521][03784] Fps is (10 sec: 26214.6, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1688469504. Throughput: 0: 43840.2. Samples: 1689687000. Policy #0 lag: (min: 0.0, avg: 44.6, max: 84.0) [2024-03-21 08:51:35,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 08:51:36,674][04017] Updated weights for policy 0, policy_version 51529 (0.0011) [2024-03-21 08:51:40,521][03784] Fps is (10 sec: 32767.8, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1688666112. Throughput: 0: 44315.6. Samples: 1689974800. 
Policy #0 lag: (min: 0.0, avg: 48.7, max: 109.0) [2024-03-21 08:51:40,522][03784] Avg episode reward: [(0, '0.569')] [2024-03-21 08:51:44,998][04017] Updated weights for policy 0, policy_version 51539 (0.0011) [2024-03-21 08:51:45,521][03784] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 1688862720. Throughput: 0: 44586.6. Samples: 1690118800. Policy #0 lag: (min: 0.0, avg: 48.7, max: 109.0) [2024-03-21 08:51:45,522][03784] Avg episode reward: [(0, '1.338')] [2024-03-21 08:51:50,521][03784] Fps is (10 sec: 36044.4, 60 sec: 43144.6, 300 sec: 45541.9). Total num frames: 1689026560. Throughput: 0: 45159.9. Samples: 1690389600. Policy #0 lag: (min: 0.0, avg: 48.7, max: 109.0) [2024-03-21 08:51:50,522][03784] Avg episode reward: [(0, '0.485')] [2024-03-21 08:51:51,831][04017] Updated weights for policy 0, policy_version 51549 (0.0015) [2024-03-21 08:51:55,521][03784] Fps is (10 sec: 36045.2, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 1689223168. Throughput: 0: 45440.1. Samples: 1690674800. Policy #0 lag: (min: 0.0, avg: 48.7, max: 109.0) [2024-03-21 08:51:55,522][03784] Avg episode reward: [(0, '0.661')] [2024-03-21 08:52:00,521][03784] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 44764.5). Total num frames: 1689419776. Throughput: 0: 45286.6. Samples: 1690816300. Policy #0 lag: (min: 0.0, avg: 48.7, max: 109.0) [2024-03-21 08:52:00,522][03784] Avg episode reward: [(0, '1.293')] [2024-03-21 08:52:00,533][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051557_1689419776.pth... [2024-03-21 08:52:00,668][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051234_1678835712.pth [2024-03-21 08:52:02,325][04017] Updated weights for policy 0, policy_version 51559 (0.0012) [2024-03-21 08:52:05,521][03784] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 44875.5). Total num frames: 1689649152. Throughput: 0: 46048.8. Samples: 1691102000. 
Policy #0 lag: (min: 0.0, avg: 48.7, max: 109.0) [2024-03-21 08:52:05,522][03784] Avg episode reward: [(0, '0.750')] [2024-03-21 08:52:08,006][04017] Updated weights for policy 0, policy_version 51569 (0.0011) [2024-03-21 08:52:09,820][03995] Signal inference workers to stop experience collection... (34050 times) [2024-03-21 08:52:09,821][03995] Signal inference workers to resume experience collection... (34050 times) [2024-03-21 08:52:09,896][04017] InferenceWorker_p0-w0: stopping experience collection (34050 times) [2024-03-21 08:52:09,896][04017] InferenceWorker_p0-w0: resuming experience collection (34050 times) [2024-03-21 08:52:10,521][03784] Fps is (10 sec: 52428.8, 60 sec: 42598.5, 300 sec: 45319.8). Total num frames: 1689944064. Throughput: 0: 46526.7. Samples: 1691373400. Policy #0 lag: (min: 0.0, avg: 48.7, max: 109.0) [2024-03-21 08:52:10,522][03784] Avg episode reward: [(0, '1.527')] [2024-03-21 08:52:12,991][04017] Updated weights for policy 0, policy_version 51579 (0.0012) [2024-03-21 08:52:15,521][03784] Fps is (10 sec: 68813.7, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 1690337280. Throughput: 0: 45960.1. Samples: 1691490600. Policy #0 lag: (min: 0.0, avg: 35.9, max: 78.0) [2024-03-21 08:52:15,521][03784] Avg episode reward: [(0, '0.853')] [2024-03-21 08:52:16,496][04017] Updated weights for policy 0, policy_version 51589 (0.0017) [2024-03-21 08:52:20,521][03784] Fps is (10 sec: 65535.4, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 1690599424. Throughput: 0: 45737.7. Samples: 1691745200. Policy #0 lag: (min: 0.0, avg: 35.9, max: 78.0) [2024-03-21 08:52:20,522][03784] Avg episode reward: [(0, '1.427')] [2024-03-21 08:52:23,990][04017] Updated weights for policy 0, policy_version 51599 (0.0011) [2024-03-21 08:52:25,521][03784] Fps is (10 sec: 58981.9, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1690927104. Throughput: 0: 45364.5. Samples: 1692016200. 
Policy #0 lag: (min: 0.0, avg: 35.9, max: 78.0) [2024-03-21 08:52:25,522][03784] Avg episode reward: [(0, '1.083')] [2024-03-21 08:52:30,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45329.0, 300 sec: 45986.3). Total num frames: 1691058176. Throughput: 0: 45137.9. Samples: 1692150000. Policy #0 lag: (min: 0.0, avg: 35.9, max: 78.0) [2024-03-21 08:52:30,522][03784] Avg episode reward: [(0, '1.482')] [2024-03-21 08:52:31,318][04017] Updated weights for policy 0, policy_version 51609 (0.0011) [2024-03-21 08:52:35,521][03784] Fps is (10 sec: 36044.8, 60 sec: 46967.5, 300 sec: 45875.2). Total num frames: 1691287552. Throughput: 0: 45182.4. Samples: 1692422800. Policy #0 lag: (min: 0.0, avg: 35.9, max: 78.0) [2024-03-21 08:52:35,522][03784] Avg episode reward: [(0, '1.102')] [2024-03-21 08:52:40,521][03784] Fps is (10 sec: 36044.6, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1691418624. Throughput: 0: 45091.1. Samples: 1692703900. Policy #0 lag: (min: 0.0, avg: 35.9, max: 78.0) [2024-03-21 08:52:40,522][03784] Avg episode reward: [(0, '1.435')] [2024-03-21 08:52:41,068][04017] Updated weights for policy 0, policy_version 51619 (0.0015) [2024-03-21 08:52:44,291][04017] Updated weights for policy 0, policy_version 51629 (0.0027) [2024-03-21 08:52:45,521][03784] Fps is (10 sec: 55705.1, 60 sec: 49698.2, 300 sec: 45542.0). Total num frames: 1691844608. Throughput: 0: 44684.4. Samples: 1692827100. Policy #0 lag: (min: 5.0, avg: 50.3, max: 119.0) [2024-03-21 08:52:45,522][03784] Avg episode reward: [(0, '1.358')] [2024-03-21 08:52:50,521][03784] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 45097.6). Total num frames: 1691942912. Throughput: 0: 44991.1. Samples: 1693126600. Policy #0 lag: (min: 5.0, avg: 50.3, max: 119.0) [2024-03-21 08:52:50,522][03784] Avg episode reward: [(0, '1.358')] [2024-03-21 08:52:55,521][03784] Fps is (10 sec: 13107.3, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1691975680. Throughput: 0: 45737.8. Samples: 1693431600. 
Policy #0 lag: (min: 5.0, avg: 50.3, max: 119.0) [2024-03-21 08:52:55,522][03784] Avg episode reward: [(0, '1.076')] [2024-03-21 08:52:58,139][04017] Updated weights for policy 0, policy_version 51639 (0.0015) [2024-03-21 08:53:00,521][03784] Fps is (10 sec: 19660.7, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1692139520. Throughput: 0: 46059.8. Samples: 1693563300. Policy #0 lag: (min: 5.0, avg: 50.3, max: 119.0) [2024-03-21 08:53:00,522][03784] Avg episode reward: [(0, '1.472')] [2024-03-21 08:53:03,163][03995] Signal inference workers to stop experience collection... (34100 times) [2024-03-21 08:53:03,163][03995] Signal inference workers to resume experience collection... (34100 times) [2024-03-21 08:53:03,213][04017] InferenceWorker_p0-w0: stopping experience collection (34100 times) [2024-03-21 08:53:03,213][04017] InferenceWorker_p0-w0: resuming experience collection (34100 times) [2024-03-21 08:53:05,259][04017] Updated weights for policy 0, policy_version 51649 (0.0013) [2024-03-21 08:53:05,521][03784] Fps is (10 sec: 49152.4, 60 sec: 46967.6, 300 sec: 44986.6). Total num frames: 1692467200. Throughput: 0: 46900.2. Samples: 1693855700. Policy #0 lag: (min: 5.0, avg: 50.3, max: 119.0) [2024-03-21 08:53:05,522][03784] Avg episode reward: [(0, '1.270')] [2024-03-21 08:53:08,980][04017] Updated weights for policy 0, policy_version 51659 (0.0015) [2024-03-21 08:53:10,521][03784] Fps is (10 sec: 72090.4, 60 sec: 48605.8, 300 sec: 45653.1). Total num frames: 1692860416. Throughput: 0: 46235.5. Samples: 1694096800. Policy #0 lag: (min: 5.0, avg: 50.3, max: 119.0) [2024-03-21 08:53:10,522][03784] Avg episode reward: [(0, '1.027')] [2024-03-21 08:53:15,521][03784] Fps is (10 sec: 58981.8, 60 sec: 45329.0, 300 sec: 46097.4). Total num frames: 1693057024. Throughput: 0: 46415.5. Samples: 1694238700. 
Policy #0 lag: (min: 5.0, avg: 50.3, max: 119.0) [2024-03-21 08:53:15,522][03784] Avg episode reward: [(0, '1.382')] [2024-03-21 08:53:19,482][04017] Updated weights for policy 0, policy_version 51669 (0.0019) [2024-03-21 08:53:20,521][03784] Fps is (10 sec: 26214.2, 60 sec: 42052.3, 300 sec: 45986.3). Total num frames: 1693122560. Throughput: 0: 47404.3. Samples: 1694556000. Policy #0 lag: (min: 0.0, avg: 40.2, max: 117.0) [2024-03-21 08:53:20,522][03784] Avg episode reward: [(0, '0.979')] [2024-03-21 08:53:24,347][04017] Updated weights for policy 0, policy_version 51679 (0.0015) [2024-03-21 08:53:25,521][03784] Fps is (10 sec: 36044.4, 60 sec: 41506.0, 300 sec: 45653.0). Total num frames: 1693417472. Throughput: 0: 47128.8. Samples: 1694824700. Policy #0 lag: (min: 0.0, avg: 40.2, max: 117.0) [2024-03-21 08:53:25,522][03784] Avg episode reward: [(0, '0.496')] [2024-03-21 08:53:30,521][03784] Fps is (10 sec: 49152.4, 60 sec: 42598.4, 300 sec: 44764.4). Total num frames: 1693614080. Throughput: 0: 47844.5. Samples: 1694980100. Policy #0 lag: (min: 0.0, avg: 40.2, max: 117.0) [2024-03-21 08:53:30,522][03784] Avg episode reward: [(0, '1.131')] [2024-03-21 08:53:31,601][04017] Updated weights for policy 0, policy_version 51689 (0.0012) [2024-03-21 08:53:35,049][04017] Updated weights for policy 0, policy_version 51699 (0.0016) [2024-03-21 08:53:35,521][03784] Fps is (10 sec: 68813.3, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 1694105600. Throughput: 0: 46373.3. Samples: 1695213400. Policy #0 lag: (min: 0.0, avg: 40.2, max: 117.0) [2024-03-21 08:53:35,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 08:53:40,521][03784] Fps is (10 sec: 72089.0, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 1694334976. Throughput: 0: 46019.9. Samples: 1695502500. 
Policy #0 lag: (min: 0.0, avg: 40.2, max: 117.0) [2024-03-21 08:53:40,522][03784] Avg episode reward: [(0, '1.163')] [2024-03-21 08:53:41,071][04017] Updated weights for policy 0, policy_version 51709 (0.0020) [2024-03-21 08:53:43,844][03995] Signal inference workers to stop experience collection... (34150 times) [2024-03-21 08:53:43,914][03995] Signal inference workers to resume experience collection... (34150 times) [2024-03-21 08:53:43,918][04017] InferenceWorker_p0-w0: stopping experience collection (34150 times) [2024-03-21 08:53:43,986][04017] InferenceWorker_p0-w0: resuming experience collection (34150 times) [2024-03-21 08:53:45,521][03784] Fps is (10 sec: 49152.1, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1694597120. Throughput: 0: 46424.5. Samples: 1695652400. Policy #0 lag: (min: 0.0, avg: 40.2, max: 117.0) [2024-03-21 08:53:45,522][03784] Avg episode reward: [(0, '0.864')] [2024-03-21 08:53:48,384][04017] Updated weights for policy 0, policy_version 51719 (0.0017) [2024-03-21 08:53:50,521][03784] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1694826496. Throughput: 0: 46008.7. Samples: 1695926100. Policy #0 lag: (min: 0.0, avg: 40.2, max: 117.0) [2024-03-21 08:53:50,522][03784] Avg episode reward: [(0, '1.675')] [2024-03-21 08:53:55,521][03784] Fps is (10 sec: 36044.8, 60 sec: 49698.1, 300 sec: 45097.7). Total num frames: 1694957568. Throughput: 0: 46993.3. Samples: 1696211500. Policy #0 lag: (min: 0.0, avg: 36.9, max: 78.0) [2024-03-21 08:53:55,522][03784] Avg episode reward: [(0, '1.675')] [2024-03-21 08:53:59,554][04017] Updated weights for policy 0, policy_version 51729 (0.0012) [2024-03-21 08:54:00,521][03784] Fps is (10 sec: 32768.0, 60 sec: 50244.3, 300 sec: 45653.0). Total num frames: 1695154176. Throughput: 0: 46915.5. Samples: 1696349900. 
Policy #0 lag: (min: 0.0, avg: 36.9, max: 78.0) [2024-03-21 08:54:00,522][03784] Avg episode reward: [(0, '1.105')] [2024-03-21 08:54:00,740][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051733_1695186944.pth... [2024-03-21 08:54:00,867][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051400_1684275200.pth [2024-03-21 08:54:02,989][04017] Updated weights for policy 0, policy_version 51739 (0.0020) [2024-03-21 08:54:05,521][03784] Fps is (10 sec: 42598.0, 60 sec: 48605.7, 300 sec: 45653.0). Total num frames: 1695383552. Throughput: 0: 45679.9. Samples: 1696611600. Policy #0 lag: (min: 0.0, avg: 36.9, max: 78.0) [2024-03-21 08:54:05,522][03784] Avg episode reward: [(0, '1.748')] [2024-03-21 08:54:10,521][03784] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1695612928. Throughput: 0: 46186.7. Samples: 1696903100. Policy #0 lag: (min: 0.0, avg: 36.9, max: 78.0) [2024-03-21 08:54:10,522][03784] Avg episode reward: [(0, '1.748')] [2024-03-21 08:54:14,161][04017] Updated weights for policy 0, policy_version 51749 (0.0010) [2024-03-21 08:54:15,521][03784] Fps is (10 sec: 39321.9, 60 sec: 45329.0, 300 sec: 45653.1). Total num frames: 1695776768. Throughput: 0: 45853.3. Samples: 1697043500. Policy #0 lag: (min: 0.0, avg: 36.9, max: 78.0) [2024-03-21 08:54:15,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 08:54:20,521][03784] Fps is (10 sec: 32768.0, 60 sec: 46967.5, 300 sec: 45542.0). Total num frames: 1695940608. Throughput: 0: 47071.1. Samples: 1697331600. Policy #0 lag: (min: 0.0, avg: 36.9, max: 78.0) [2024-03-21 08:54:20,522][03784] Avg episode reward: [(0, '1.370')] [2024-03-21 08:54:22,996][04017] Updated weights for policy 0, policy_version 51759 (0.0025) [2024-03-21 08:54:25,521][03784] Fps is (10 sec: 42598.5, 60 sec: 46421.4, 300 sec: 45875.2). Total num frames: 1696202752. Throughput: 0: 46171.2. Samples: 1697580200. 
Policy #0 lag: (min: 0.0, avg: 36.9, max: 78.0) [2024-03-21 08:54:25,522][03784] Avg episode reward: [(0, '1.553')] [2024-03-21 08:54:27,814][04017] Updated weights for policy 0, policy_version 51769 (0.0019) [2024-03-21 08:54:30,521][03784] Fps is (10 sec: 62259.1, 60 sec: 49151.9, 300 sec: 45875.2). Total num frames: 1696563200. Throughput: 0: 45404.4. Samples: 1697695600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 08:54:30,522][03784] Avg episode reward: [(0, '0.898')] [2024-03-21 08:54:32,684][04017] Updated weights for policy 0, policy_version 51779 (0.0040) [2024-03-21 08:54:35,521][03784] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 45097.7). Total num frames: 1696694272. Throughput: 0: 45762.4. Samples: 1697985400. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 08:54:35,522][03784] Avg episode reward: [(0, '0.611')] [2024-03-21 08:54:39,114][03995] Signal inference workers to stop experience collection... (34200 times) [2024-03-21 08:54:39,204][04017] InferenceWorker_p0-w0: stopping experience collection (34200 times) [2024-03-21 08:54:39,249][03995] Signal inference workers to resume experience collection... (34200 times) [2024-03-21 08:54:39,253][04017] InferenceWorker_p0-w0: resuming experience collection (34200 times) [2024-03-21 08:54:39,617][04017] Updated weights for policy 0, policy_version 51789 (0.0016) [2024-03-21 08:54:40,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1697054720. Throughput: 0: 45602.1. Samples: 1698263600. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 08:54:40,522][03784] Avg episode reward: [(0, '0.611')] [2024-03-21 08:54:45,521][03784] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 45097.6). Total num frames: 1697218560. Throughput: 0: 45600.1. Samples: 1698401900. 
Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 08:54:45,522][03784] Avg episode reward: [(0, '0.720')] [2024-03-21 08:54:49,816][04017] Updated weights for policy 0, policy_version 51799 (0.0013) [2024-03-21 08:54:50,521][03784] Fps is (10 sec: 32768.5, 60 sec: 42598.5, 300 sec: 45208.7). Total num frames: 1697382400. Throughput: 0: 45815.7. Samples: 1698673300. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 08:54:50,522][03784] Avg episode reward: [(0, '1.072')] [2024-03-21 08:54:55,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1697611776. Throughput: 0: 44513.3. Samples: 1698906200. Policy #0 lag: (min: 0.0, avg: 38.7, max: 83.0) [2024-03-21 08:54:55,522][03784] Avg episode reward: [(0, '1.030')] [2024-03-21 08:54:56,718][04017] Updated weights for policy 0, policy_version 51809 (0.0020) [2024-03-21 08:55:00,521][03784] Fps is (10 sec: 49151.9, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1697873920. Throughput: 0: 43984.5. Samples: 1699022800. Policy #0 lag: (min: 4.0, avg: 48.4, max: 108.0) [2024-03-21 08:55:00,522][03784] Avg episode reward: [(0, '0.880')] [2024-03-21 08:55:01,531][04017] Updated weights for policy 0, policy_version 51819 (0.0016) [2024-03-21 08:55:05,521][03784] Fps is (10 sec: 52429.4, 60 sec: 45875.4, 300 sec: 45542.0). Total num frames: 1698136064. Throughput: 0: 43631.2. Samples: 1699295000. Policy #0 lag: (min: 4.0, avg: 48.4, max: 108.0) [2024-03-21 08:55:05,522][03784] Avg episode reward: [(0, '1.320')] [2024-03-21 08:55:10,044][04017] Updated weights for policy 0, policy_version 51829 (0.0012) [2024-03-21 08:55:10,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1698332672. Throughput: 0: 44148.9. Samples: 1699566900. 
Policy #0 lag: (min: 4.0, avg: 48.4, max: 108.0) [2024-03-21 08:55:10,531][03784] Avg episode reward: [(0, '0.540')] [2024-03-21 08:55:15,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 1698496512. Throughput: 0: 44402.3. Samples: 1699693700. Policy #0 lag: (min: 4.0, avg: 48.4, max: 108.0) [2024-03-21 08:55:15,522][03784] Avg episode reward: [(0, '0.995')] [2024-03-21 08:55:18,528][04017] Updated weights for policy 0, policy_version 51839 (0.0012) [2024-03-21 08:55:20,521][03784] Fps is (10 sec: 36044.8, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1698693120. Throughput: 0: 44493.3. Samples: 1699987600. Policy #0 lag: (min: 4.0, avg: 48.4, max: 108.0) [2024-03-21 08:55:20,522][03784] Avg episode reward: [(0, '1.616')] [2024-03-21 08:55:25,521][03784] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 45430.9). Total num frames: 1698889728. Throughput: 0: 44269.1. Samples: 1700255700. Policy #0 lag: (min: 4.0, avg: 48.4, max: 108.0) [2024-03-21 08:55:25,522][03784] Avg episode reward: [(0, '1.215')] [2024-03-21 08:55:26,769][04017] Updated weights for policy 0, policy_version 51849 (0.0024) [2024-03-21 08:55:30,521][03784] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1699217408. Throughput: 0: 44217.7. Samples: 1700391700. Policy #0 lag: (min: 3.0, avg: 39.7, max: 90.0) [2024-03-21 08:55:30,522][03784] Avg episode reward: [(0, '1.248')] [2024-03-21 08:55:34,676][03995] Signal inference workers to stop experience collection... (34250 times) [2024-03-21 08:55:34,740][03995] Signal inference workers to resume experience collection... 
(34250 times) [2024-03-21 08:55:34,769][04017] InferenceWorker_p0-w0: stopping experience collection (34250 times) [2024-03-21 08:55:34,809][04017] InferenceWorker_p0-w0: resuming experience collection (34250 times) [2024-03-21 08:55:35,099][04017] Updated weights for policy 0, policy_version 51859 (0.0017) [2024-03-21 08:55:35,521][03784] Fps is (10 sec: 45874.8, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1699348480. Throughput: 0: 44668.9. Samples: 1700683400. Policy #0 lag: (min: 3.0, avg: 39.7, max: 90.0) [2024-03-21 08:55:35,522][03784] Avg episode reward: [(0, '1.478')] [2024-03-21 08:55:40,521][03784] Fps is (10 sec: 36044.9, 60 sec: 42052.4, 300 sec: 45653.0). Total num frames: 1699577856. Throughput: 0: 45360.1. Samples: 1700947400. Policy #0 lag: (min: 3.0, avg: 39.7, max: 90.0) [2024-03-21 08:55:40,522][03784] Avg episode reward: [(0, '0.861')] [2024-03-21 08:55:42,227][04017] Updated weights for policy 0, policy_version 51869 (0.0015) [2024-03-21 08:55:45,521][03784] Fps is (10 sec: 36045.0, 60 sec: 41506.1, 300 sec: 44986.6). Total num frames: 1699708928. Throughput: 0: 45811.2. Samples: 1701084300. Policy #0 lag: (min: 3.0, avg: 39.7, max: 90.0) [2024-03-21 08:55:45,522][03784] Avg episode reward: [(0, '0.861')] [2024-03-21 08:55:50,521][03784] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 45208.7). Total num frames: 1699938304. Throughput: 0: 45275.5. Samples: 1701332400. Policy #0 lag: (min: 3.0, avg: 39.7, max: 90.0) [2024-03-21 08:55:50,522][03784] Avg episode reward: [(0, '0.531')] [2024-03-21 08:55:50,809][04017] Updated weights for policy 0, policy_version 51879 (0.0011) [2024-03-21 08:55:55,075][04017] Updated weights for policy 0, policy_version 51889 (0.0014) [2024-03-21 08:55:55,521][03784] Fps is (10 sec: 62258.2, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 1700331520. Throughput: 0: 44815.5. Samples: 1701583600. 
Policy #0 lag: (min: 3.0, avg: 39.7, max: 90.0) [2024-03-21 08:55:55,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 08:55:58,949][04017] Updated weights for policy 0, policy_version 51899 (0.0011) [2024-03-21 08:56:00,521][03784] Fps is (10 sec: 78643.0, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 1700724736. Throughput: 0: 45033.3. Samples: 1701720200. Policy #0 lag: (min: 3.0, avg: 39.7, max: 90.0) [2024-03-21 08:56:00,522][03784] Avg episode reward: [(0, '1.249')] [2024-03-21 08:56:00,567][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051903_1700757504.pth... [2024-03-21 08:56:00,688][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051557_1689419776.pth [2024-03-21 08:56:05,151][04017] Updated weights for policy 0, policy_version 51909 (0.0015) [2024-03-21 08:56:05,521][03784] Fps is (10 sec: 62259.8, 60 sec: 46967.4, 300 sec: 45986.3). Total num frames: 1700954112. Throughput: 0: 44924.4. Samples: 1702009200. Policy #0 lag: (min: 1.0, avg: 40.1, max: 71.0) [2024-03-21 08:56:05,522][03784] Avg episode reward: [(0, '1.657')] [2024-03-21 08:56:10,521][03784] Fps is (10 sec: 39321.5, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1701117952. Throughput: 0: 44777.7. Samples: 1702270700. Policy #0 lag: (min: 1.0, avg: 40.1, max: 71.0) [2024-03-21 08:56:10,522][03784] Avg episode reward: [(0, '0.992')] [2024-03-21 08:56:15,165][04017] Updated weights for policy 0, policy_version 51919 (0.0011) [2024-03-21 08:56:15,521][03784] Fps is (10 sec: 32768.6, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1701281792. Throughput: 0: 44771.3. Samples: 1702406400. Policy #0 lag: (min: 1.0, avg: 40.1, max: 71.0) [2024-03-21 08:56:15,521][03784] Avg episode reward: [(0, '1.265')] [2024-03-21 08:56:20,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 1701478400. Throughput: 0: 44304.4. Samples: 1702677100. 
Policy #0 lag: (min: 1.0, avg: 40.1, max: 71.0) [2024-03-21 08:56:20,522][03784] Avg episode reward: [(0, '0.783')] [2024-03-21 08:56:24,321][04017] Updated weights for policy 0, policy_version 51929 (0.0016) [2024-03-21 08:56:25,521][03784] Fps is (10 sec: 42597.9, 60 sec: 46967.4, 300 sec: 45319.8). Total num frames: 1701707776. Throughput: 0: 44460.0. Samples: 1702948100. Policy #0 lag: (min: 1.0, avg: 40.1, max: 71.0) [2024-03-21 08:56:25,522][03784] Avg episode reward: [(0, '0.741')] [2024-03-21 08:56:27,047][03995] Signal inference workers to stop experience collection... (34300 times) [2024-03-21 08:56:27,086][04017] InferenceWorker_p0-w0: stopping experience collection (34300 times) [2024-03-21 08:56:27,323][03995] Signal inference workers to resume experience collection... (34300 times) [2024-03-21 08:56:27,324][04017] InferenceWorker_p0-w0: resuming experience collection (34300 times) [2024-03-21 08:56:29,880][04017] Updated weights for policy 0, policy_version 51939 (0.0019) [2024-03-21 08:56:30,521][03784] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 1701937152. Throughput: 0: 44457.8. Samples: 1703084900. Policy #0 lag: (min: 1.0, avg: 40.1, max: 71.0) [2024-03-21 08:56:30,522][03784] Avg episode reward: [(0, '1.126')] [2024-03-21 08:56:35,521][03784] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1702068224. Throughput: 0: 45646.6. Samples: 1703386500. Policy #0 lag: (min: 1.0, avg: 40.1, max: 71.0) [2024-03-21 08:56:35,522][03784] Avg episode reward: [(0, '1.126')] [2024-03-21 08:56:40,521][03784] Fps is (10 sec: 29490.5, 60 sec: 44236.6, 300 sec: 45319.8). Total num frames: 1702232064. Throughput: 0: 46264.4. Samples: 1703665500. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 67.0) [2024-03-21 08:56:40,522][03784] Avg episode reward: [(0, '1.342')] [2024-03-21 08:56:40,840][04017] Updated weights for policy 0, policy_version 51949 (0.0011) [2024-03-21 08:56:45,468][04017] Updated weights for policy 0, policy_version 51959 (0.0025) [2024-03-21 08:56:45,521][03784] Fps is (10 sec: 52428.2, 60 sec: 48059.6, 300 sec: 45986.3). Total num frames: 1702592512. Throughput: 0: 46006.5. Samples: 1703790500. Policy #0 lag: (min: 0.0, avg: 26.7, max: 67.0) [2024-03-21 08:56:45,522][03784] Avg episode reward: [(0, '1.390')] [2024-03-21 08:56:50,521][03784] Fps is (10 sec: 68814.1, 60 sec: 49698.1, 300 sec: 46430.6). Total num frames: 1702920192. Throughput: 0: 45431.1. Samples: 1704053600. Policy #0 lag: (min: 0.0, avg: 26.7, max: 67.0) [2024-03-21 08:56:50,522][03784] Avg episode reward: [(0, '1.739')] [2024-03-21 08:56:50,802][04017] Updated weights for policy 0, policy_version 51970 (0.0016) [2024-03-21 08:56:55,521][03784] Fps is (10 sec: 52429.2, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 1703116800. Throughput: 0: 45997.8. Samples: 1704340600. Policy #0 lag: (min: 0.0, avg: 26.7, max: 67.0) [2024-03-21 08:56:55,522][03784] Avg episode reward: [(0, '1.739')] [2024-03-21 08:57:00,259][04017] Updated weights for policy 0, policy_version 51980 (0.0015) [2024-03-21 08:57:00,521][03784] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 46208.4). Total num frames: 1703280640. Throughput: 0: 46175.4. Samples: 1704484300. Policy #0 lag: (min: 0.0, avg: 26.7, max: 67.0) [2024-03-21 08:57:00,522][03784] Avg episode reward: [(0, '1.689')] [2024-03-21 08:57:05,521][03784] Fps is (10 sec: 32768.3, 60 sec: 41506.2, 300 sec: 45764.1). Total num frames: 1703444480. Throughput: 0: 46771.2. Samples: 1704781800. 
Policy #0 lag: (min: 0.0, avg: 26.7, max: 67.0) [2024-03-21 08:57:05,522][03784] Avg episode reward: [(0, '1.231')] [2024-03-21 08:57:08,500][04017] Updated weights for policy 0, policy_version 51990 (0.0010) [2024-03-21 08:57:10,521][03784] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 45208.7). Total num frames: 1703673856. Throughput: 0: 46955.6. Samples: 1705061100. Policy #0 lag: (min: 4.0, avg: 30.0, max: 61.0) [2024-03-21 08:57:10,522][03784] Avg episode reward: [(0, '1.347')] [2024-03-21 08:57:15,521][03784] Fps is (10 sec: 45874.9, 60 sec: 43690.5, 300 sec: 45097.7). Total num frames: 1703903232. Throughput: 0: 47422.2. Samples: 1705218900. Policy #0 lag: (min: 4.0, avg: 30.0, max: 61.0) [2024-03-21 08:57:15,522][03784] Avg episode reward: [(0, '1.347')] [2024-03-21 08:57:15,733][04017] Updated weights for policy 0, policy_version 52000 (0.0012) [2024-03-21 08:57:20,521][03784] Fps is (10 sec: 49151.4, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1704165376. Throughput: 0: 46137.7. Samples: 1705462700. Policy #0 lag: (min: 4.0, avg: 30.0, max: 61.0) [2024-03-21 08:57:20,522][03784] Avg episode reward: [(0, '0.903')] [2024-03-21 08:57:21,669][04017] Updated weights for policy 0, policy_version 52010 (0.0019) [2024-03-21 08:57:21,740][03995] Signal inference workers to stop experience collection... (34350 times) [2024-03-21 08:57:21,810][04017] InferenceWorker_p0-w0: stopping experience collection (34350 times) [2024-03-21 08:57:22,052][03995] Signal inference workers to resume experience collection... (34350 times) [2024-03-21 08:57:22,053][04017] InferenceWorker_p0-w0: resuming experience collection (34350 times) [2024-03-21 08:57:25,521][03784] Fps is (10 sec: 58982.4, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1704493056. Throughput: 0: 45715.7. Samples: 1705722700. 
Policy #0 lag: (min: 4.0, avg: 30.0, max: 61.0) [2024-03-21 08:57:25,522][03784] Avg episode reward: [(0, '1.493')] [2024-03-21 08:57:28,277][04017] Updated weights for policy 0, policy_version 52020 (0.0017) [2024-03-21 08:57:30,521][03784] Fps is (10 sec: 58982.7, 60 sec: 46967.4, 300 sec: 45653.0). Total num frames: 1704755200. Throughput: 0: 46257.9. Samples: 1705872100. Policy #0 lag: (min: 4.0, avg: 30.0, max: 61.0) [2024-03-21 08:57:30,522][03784] Avg episode reward: [(0, '0.900')] [2024-03-21 08:57:32,228][04017] Updated weights for policy 0, policy_version 52030 (0.0011) [2024-03-21 08:57:35,521][03784] Fps is (10 sec: 55705.6, 60 sec: 49698.1, 300 sec: 46208.4). Total num frames: 1705050112. Throughput: 0: 46262.2. Samples: 1706135400. Policy #0 lag: (min: 4.0, avg: 30.0, max: 61.0) [2024-03-21 08:57:35,522][03784] Avg episode reward: [(0, '1.128')] [2024-03-21 08:57:39,651][04017] Updated weights for policy 0, policy_version 52040 (0.0014) [2024-03-21 08:57:40,521][03784] Fps is (10 sec: 55705.6, 60 sec: 51336.7, 300 sec: 45653.1). Total num frames: 1705312256. Throughput: 0: 46133.4. Samples: 1706416600. Policy #0 lag: (min: 4.0, avg: 30.0, max: 61.0) [2024-03-21 08:57:40,522][03784] Avg episode reward: [(0, '1.753')] [2024-03-21 08:57:45,521][03784] Fps is (10 sec: 36045.0, 60 sec: 46967.6, 300 sec: 45653.1). Total num frames: 1705410560. Throughput: 0: 46200.0. Samples: 1706563300. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 08:57:45,522][03784] Avg episode reward: [(0, '1.344')] [2024-03-21 08:57:50,521][03784] Fps is (10 sec: 19660.7, 60 sec: 43144.5, 300 sec: 45875.2). Total num frames: 1705508864. Throughput: 0: 46104.3. Samples: 1706856500. 
Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 08:57:50,522][03784] Avg episode reward: [(0, '0.589')] [2024-03-21 08:57:50,956][04017] Updated weights for policy 0, policy_version 52050 (0.0017) [2024-03-21 08:57:55,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.9, 300 sec: 46208.5). Total num frames: 1705771008. Throughput: 0: 45924.4. Samples: 1707127700. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 08:57:55,522][03784] Avg episode reward: [(0, '0.550')] [2024-03-21 08:57:58,269][04017] Updated weights for policy 0, policy_version 52060 (0.0015) [2024-03-21 08:58:00,521][03784] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1705967616. Throughput: 0: 45460.0. Samples: 1707264600. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 08:58:00,522][03784] Avg episode reward: [(0, '1.134')] [2024-03-21 08:58:00,924][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052063_1706000384.pth... [2024-03-21 08:58:00,976][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051733_1695186944.pth [2024-03-21 08:58:05,163][04017] Updated weights for policy 0, policy_version 52070 (0.0016) [2024-03-21 08:58:05,521][03784] Fps is (10 sec: 49151.9, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1706262528. Throughput: 0: 46195.7. Samples: 1707541500. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 08:58:05,522][03784] Avg episode reward: [(0, '0.618')] [2024-03-21 08:58:10,521][03784] Fps is (10 sec: 49152.1, 60 sec: 46421.2, 300 sec: 45430.9). Total num frames: 1706459136. Throughput: 0: 45931.1. Samples: 1707789600. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 08:58:10,522][03784] Avg episode reward: [(0, '1.600')] [2024-03-21 08:58:13,409][04017] Updated weights for policy 0, policy_version 52080 (0.0015) [2024-03-21 08:58:15,521][03784] Fps is (10 sec: 42598.4, 60 sec: 46421.4, 300 sec: 45986.3). 
Total num frames: 1706688512. Throughput: 0: 45704.5. Samples: 1707928800. Policy #0 lag: (min: 0.0, avg: 39.1, max: 83.0) [2024-03-21 08:58:15,522][03784] Avg episode reward: [(0, '1.477')] [2024-03-21 08:58:19,535][04017] Updated weights for policy 0, policy_version 52090 (0.0021) [2024-03-21 08:58:20,391][03995] Signal inference workers to stop experience collection... (34400 times) [2024-03-21 08:58:20,392][03995] Signal inference workers to resume experience collection... (34400 times) [2024-03-21 08:58:20,454][04017] InferenceWorker_p0-w0: stopping experience collection (34400 times) [2024-03-21 08:58:20,455][04017] InferenceWorker_p0-w0: resuming experience collection (34400 times) [2024-03-21 08:58:20,521][03784] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1706917888. Throughput: 0: 45968.9. Samples: 1708204000. Policy #0 lag: (min: 2.0, avg: 30.8, max: 59.0) [2024-03-21 08:58:20,522][03784] Avg episode reward: [(0, '0.887')] [2024-03-21 08:58:25,521][03784] Fps is (10 sec: 49152.1, 60 sec: 44783.0, 300 sec: 45986.3). Total num frames: 1707180032. Throughput: 0: 45673.4. Samples: 1708471900. Policy #0 lag: (min: 2.0, avg: 30.8, max: 59.0) [2024-03-21 08:58:25,522][03784] Avg episode reward: [(0, '1.378')] [2024-03-21 08:58:25,651][04017] Updated weights for policy 0, policy_version 52100 (0.0019) [2024-03-21 08:58:30,521][03784] Fps is (10 sec: 55705.6, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1707474944. Throughput: 0: 45091.1. Samples: 1708592400. Policy #0 lag: (min: 2.0, avg: 30.8, max: 59.0) [2024-03-21 08:58:30,522][03784] Avg episode reward: [(0, '1.476')] [2024-03-21 08:58:32,938][04017] Updated weights for policy 0, policy_version 52110 (0.0016) [2024-03-21 08:58:35,521][03784] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1707704320. Throughput: 0: 44440.1. Samples: 1708856300. 
Policy #0 lag: (min: 2.0, avg: 30.8, max: 59.0) [2024-03-21 08:58:35,522][03784] Avg episode reward: [(0, '1.353')] [2024-03-21 08:58:37,750][04017] Updated weights for policy 0, policy_version 52120 (0.0017) [2024-03-21 08:58:40,521][03784] Fps is (10 sec: 58982.6, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 1708064768. Throughput: 0: 44173.3. Samples: 1709115500. Policy #0 lag: (min: 2.0, avg: 30.8, max: 59.0) [2024-03-21 08:58:40,522][03784] Avg episode reward: [(0, '0.635')] [2024-03-21 08:58:45,521][03784] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 45208.8). Total num frames: 1708163072. Throughput: 0: 44404.5. Samples: 1709262800. Policy #0 lag: (min: 2.0, avg: 30.8, max: 59.0) [2024-03-21 08:58:45,530][03784] Avg episode reward: [(0, '1.312')] [2024-03-21 08:58:47,437][04017] Updated weights for policy 0, policy_version 52130 (0.0017) [2024-03-21 08:58:50,521][03784] Fps is (10 sec: 22937.5, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1708294144. Throughput: 0: 44517.7. Samples: 1709544800. Policy #0 lag: (min: 0.0, avg: 37.7, max: 99.0) [2024-03-21 08:58:50,522][03784] Avg episode reward: [(0, '1.150')] [2024-03-21 08:58:55,521][03784] Fps is (10 sec: 19660.8, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 1708359680. Throughput: 0: 45880.1. Samples: 1709854200. Policy #0 lag: (min: 0.0, avg: 37.7, max: 99.0) [2024-03-21 08:58:55,522][03784] Avg episode reward: [(0, '1.238')] [2024-03-21 08:58:58,273][04017] Updated weights for policy 0, policy_version 52140 (0.0015) [2024-03-21 08:59:00,521][03784] Fps is (10 sec: 39321.1, 60 sec: 45329.0, 300 sec: 45097.7). Total num frames: 1708687360. Throughput: 0: 45555.4. Samples: 1709978800. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 99.0) [2024-03-21 08:59:00,522][03784] Avg episode reward: [(0, '1.617')] [2024-03-21 08:59:03,418][04017] Updated weights for policy 0, policy_version 52150 (0.0012) [2024-03-21 08:59:05,521][03784] Fps is (10 sec: 62258.8, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1708982272. Throughput: 0: 44955.5. Samples: 1710227000. Policy #0 lag: (min: 0.0, avg: 37.7, max: 99.0) [2024-03-21 08:59:05,522][03784] Avg episode reward: [(0, '1.617')] [2024-03-21 08:59:09,494][04017] Updated weights for policy 0, policy_version 52160 (0.0012) [2024-03-21 08:59:10,096][03995] Signal inference workers to stop experience collection... (34450 times) [2024-03-21 08:59:10,152][04017] InferenceWorker_p0-w0: stopping experience collection (34450 times) [2024-03-21 08:59:10,224][03995] Signal inference workers to resume experience collection... (34450 times) [2024-03-21 08:59:10,225][04017] InferenceWorker_p0-w0: resuming experience collection (34450 times) [2024-03-21 08:59:10,521][03784] Fps is (10 sec: 55706.3, 60 sec: 46421.4, 300 sec: 45653.0). Total num frames: 1709244416. Throughput: 0: 45057.7. Samples: 1710499500. Policy #0 lag: (min: 0.0, avg: 37.7, max: 99.0) [2024-03-21 08:59:10,522][03784] Avg episode reward: [(0, '1.460')] [2024-03-21 08:59:15,521][03784] Fps is (10 sec: 42598.9, 60 sec: 45329.1, 300 sec: 45653.1). Total num frames: 1709408256. Throughput: 0: 45642.3. Samples: 1710646300. Policy #0 lag: (min: 0.0, avg: 37.7, max: 99.0) [2024-03-21 08:59:15,522][03784] Avg episode reward: [(0, '1.296')] [2024-03-21 08:59:16,491][04017] Updated weights for policy 0, policy_version 52170 (0.0018) [2024-03-21 08:59:20,521][03784] Fps is (10 sec: 49151.7, 60 sec: 46967.4, 300 sec: 45875.2). Total num frames: 1709735936. Throughput: 0: 45991.0. Samples: 1710925900. 
Policy #0 lag: (min: 0.0, avg: 37.7, max: 99.0) [2024-03-21 08:59:20,522][03784] Avg episode reward: [(0, '1.052')] [2024-03-21 08:59:22,204][04017] Updated weights for policy 0, policy_version 52180 (0.0012) [2024-03-21 08:59:25,521][03784] Fps is (10 sec: 58982.2, 60 sec: 46967.4, 300 sec: 45542.0). Total num frames: 1709998080. Throughput: 0: 46433.3. Samples: 1711205000. Policy #0 lag: (min: 0.0, avg: 34.0, max: 63.0) [2024-03-21 08:59:25,522][03784] Avg episode reward: [(0, '1.128')] [2024-03-21 08:59:29,825][04017] Updated weights for policy 0, policy_version 52190 (0.0017) [2024-03-21 08:59:30,521][03784] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1710161920. Throughput: 0: 46571.0. Samples: 1711358500. Policy #0 lag: (min: 0.0, avg: 34.0, max: 63.0) [2024-03-21 08:59:30,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 08:59:35,521][03784] Fps is (10 sec: 42597.7, 60 sec: 45328.9, 300 sec: 45319.8). Total num frames: 1710424064. Throughput: 0: 46408.8. Samples: 1711633200. Policy #0 lag: (min: 0.0, avg: 34.0, max: 63.0) [2024-03-21 08:59:35,522][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 08:59:38,629][04017] Updated weights for policy 0, policy_version 52200 (0.0021) [2024-03-21 08:59:40,521][03784] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 1710620672. Throughput: 0: 45108.9. Samples: 1711884100. Policy #0 lag: (min: 0.0, avg: 34.0, max: 63.0) [2024-03-21 08:59:40,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 08:59:45,521][03784] Fps is (10 sec: 36045.5, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1710784512. Throughput: 0: 45533.5. Samples: 1712027800. 
Policy #0 lag: (min: 0.0, avg: 34.0, max: 63.0) [2024-03-21 08:59:45,522][03784] Avg episode reward: [(0, '1.166')] [2024-03-21 08:59:47,115][04017] Updated weights for policy 0, policy_version 52210 (0.0016) [2024-03-21 08:59:50,521][03784] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1711046656. Throughput: 0: 46191.2. Samples: 1712305600. Policy #0 lag: (min: 0.0, avg: 34.0, max: 63.0) [2024-03-21 08:59:50,522][03784] Avg episode reward: [(0, '1.065')] [2024-03-21 08:59:51,627][04017] Updated weights for policy 0, policy_version 52220 (0.0031) [2024-03-21 08:59:54,882][03995] Signal inference workers to stop experience collection... (34500 times) [2024-03-21 08:59:54,883][03995] Signal inference workers to resume experience collection... (34500 times) [2024-03-21 08:59:54,958][04017] InferenceWorker_p0-w0: stopping experience collection (34500 times) [2024-03-21 08:59:54,958][04017] InferenceWorker_p0-w0: resuming experience collection (34500 times) [2024-03-21 08:59:55,264][04017] Updated weights for policy 0, policy_version 52230 (0.0013) [2024-03-21 08:59:55,521][03784] Fps is (10 sec: 68812.6, 60 sec: 51882.7, 300 sec: 46097.4). Total num frames: 1711472640. Throughput: 0: 45495.6. Samples: 1712546800. Policy #0 lag: (min: 1.0, avg: 64.6, max: 113.0) [2024-03-21 08:59:55,522][03784] Avg episode reward: [(0, '1.561')] [2024-03-21 09:00:00,521][03784] Fps is (10 sec: 58982.3, 60 sec: 49152.1, 300 sec: 45764.1). Total num frames: 1711636480. Throughput: 0: 45548.8. Samples: 1712696000. Policy #0 lag: (min: 1.0, avg: 64.6, max: 113.0) [2024-03-21 09:00:00,522][03784] Avg episode reward: [(0, '1.155')] [2024-03-21 09:00:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052235_1711636480.pth... 
[2024-03-21 09:00:00,683][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000051903_1700757504.pth [2024-03-21 09:00:03,859][04017] Updated weights for policy 0, policy_version 52240 (0.0011) [2024-03-21 09:00:05,521][03784] Fps is (10 sec: 32767.7, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1711800320. Throughput: 0: 45553.3. Samples: 1712975800. Policy #0 lag: (min: 1.0, avg: 64.6, max: 113.0) [2024-03-21 09:00:05,522][03784] Avg episode reward: [(0, '0.892')] [2024-03-21 09:00:10,521][03784] Fps is (10 sec: 36045.1, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 1711996928. Throughput: 0: 45606.7. Samples: 1713257300. Policy #0 lag: (min: 1.0, avg: 64.6, max: 113.0) [2024-03-21 09:00:10,522][03784] Avg episode reward: [(0, '1.450')] [2024-03-21 09:00:12,031][04017] Updated weights for policy 0, policy_version 52250 (0.0019) [2024-03-21 09:00:15,521][03784] Fps is (10 sec: 45875.2, 60 sec: 47513.5, 300 sec: 45986.3). Total num frames: 1712259072. Throughput: 0: 45453.4. Samples: 1713403900. Policy #0 lag: (min: 1.0, avg: 64.6, max: 113.0) [2024-03-21 09:00:15,522][03784] Avg episode reward: [(0, '1.225')] [2024-03-21 09:00:20,521][03784] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 45875.2). Total num frames: 1712422912. Throughput: 0: 46155.7. Samples: 1713710200. Policy #0 lag: (min: 1.0, avg: 64.6, max: 113.0) [2024-03-21 09:00:20,522][03784] Avg episode reward: [(0, '1.057')] [2024-03-21 09:00:21,566][04017] Updated weights for policy 0, policy_version 52260 (0.0012) [2024-03-21 09:00:25,521][03784] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1712717824. Throughput: 0: 47002.2. Samples: 1713999200. 
Policy #0 lag: (min: 1.0, avg: 64.6, max: 113.0) [2024-03-21 09:00:25,522][03784] Avg episode reward: [(0, '1.057')] [2024-03-21 09:00:28,961][04017] Updated weights for policy 0, policy_version 52270 (0.0011) [2024-03-21 09:00:30,521][03784] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 1712783360. Throughput: 0: 47140.0. Samples: 1714149100. Policy #0 lag: (min: 0.0, avg: 32.6, max: 78.0) [2024-03-21 09:00:30,522][03784] Avg episode reward: [(0, '1.322')] [2024-03-21 09:00:35,494][04017] Updated weights for policy 0, policy_version 52280 (0.0020) [2024-03-21 09:00:35,521][03784] Fps is (10 sec: 39320.7, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1713111040. Throughput: 0: 46855.3. Samples: 1714414100. Policy #0 lag: (min: 0.0, avg: 32.6, max: 78.0) [2024-03-21 09:00:35,522][03784] Avg episode reward: [(0, '0.764')] [2024-03-21 09:00:40,521][03784] Fps is (10 sec: 55705.2, 60 sec: 45329.0, 300 sec: 46208.4). Total num frames: 1713340416. Throughput: 0: 47817.7. Samples: 1714698600. Policy #0 lag: (min: 0.0, avg: 32.6, max: 78.0) [2024-03-21 09:00:40,522][03784] Avg episode reward: [(0, '1.230')] [2024-03-21 09:00:41,209][04017] Updated weights for policy 0, policy_version 52290 (0.0017) [2024-03-21 09:00:45,521][03784] Fps is (10 sec: 52430.0, 60 sec: 47513.6, 300 sec: 46430.6). Total num frames: 1713635328. Throughput: 0: 47180.0. Samples: 1714819100. Policy #0 lag: (min: 0.0, avg: 32.6, max: 78.0) [2024-03-21 09:00:45,522][03784] Avg episode reward: [(0, '0.509')] [2024-03-21 09:00:49,552][04017] Updated weights for policy 0, policy_version 52300 (0.0018) [2024-03-21 09:00:50,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 1713831936. Throughput: 0: 46897.9. Samples: 1715086200. 
Policy #0 lag: (min: 0.0, avg: 32.6, max: 78.0) [2024-03-21 09:00:50,522][03784] Avg episode reward: [(0, '1.407')] [2024-03-21 09:00:51,147][03995] Signal inference workers to stop experience collection... (34550 times) [2024-03-21 09:00:51,229][04017] InferenceWorker_p0-w0: stopping experience collection (34550 times) [2024-03-21 09:00:51,382][03995] Signal inference workers to resume experience collection... (34550 times) [2024-03-21 09:00:51,382][04017] InferenceWorker_p0-w0: resuming experience collection (34550 times) [2024-03-21 09:00:55,521][03784] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 45208.7). Total num frames: 1714061312. Throughput: 0: 46615.5. Samples: 1715355000. Policy #0 lag: (min: 0.0, avg: 32.6, max: 78.0) [2024-03-21 09:00:55,522][03784] Avg episode reward: [(0, '0.690')] [2024-03-21 09:00:56,824][04017] Updated weights for policy 0, policy_version 52310 (0.0015) [2024-03-21 09:01:00,521][03784] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 1714225152. Throughput: 0: 46366.7. Samples: 1715490400. Policy #0 lag: (min: 0.0, avg: 32.6, max: 78.0) [2024-03-21 09:01:00,522][03784] Avg episode reward: [(0, '0.528')] [2024-03-21 09:01:03,793][04017] Updated weights for policy 0, policy_version 52320 (0.0012) [2024-03-21 09:01:05,521][03784] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1714454528. Throughput: 0: 45322.1. Samples: 1715749700. Policy #0 lag: (min: 1.0, avg: 32.8, max: 67.0) [2024-03-21 09:01:05,522][03784] Avg episode reward: [(0, '1.196')] [2024-03-21 09:01:10,521][03784] Fps is (10 sec: 49152.7, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 1714716672. Throughput: 0: 43911.2. Samples: 1715975200. 
Policy #0 lag: (min: 1.0, avg: 32.8, max: 67.0) [2024-03-21 09:01:10,521][03784] Avg episode reward: [(0, '0.938')] [2024-03-21 09:01:10,584][04017] Updated weights for policy 0, policy_version 52330 (0.0013) [2024-03-21 09:01:15,521][03784] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 1714913280. Throughput: 0: 43335.5. Samples: 1716099200. Policy #0 lag: (min: 1.0, avg: 32.8, max: 67.0) [2024-03-21 09:01:15,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 09:01:17,079][04017] Updated weights for policy 0, policy_version 52340 (0.0013) [2024-03-21 09:01:20,521][03784] Fps is (10 sec: 49151.4, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 1715208192. Throughput: 0: 43349.1. Samples: 1716364800. Policy #0 lag: (min: 1.0, avg: 32.8, max: 67.0) [2024-03-21 09:01:20,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 09:01:22,428][04017] Updated weights for policy 0, policy_version 52350 (0.0017) [2024-03-21 09:01:25,521][03784] Fps is (10 sec: 58982.2, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1715503104. Throughput: 0: 43591.1. Samples: 1716660200. Policy #0 lag: (min: 1.0, avg: 32.8, max: 67.0) [2024-03-21 09:01:25,522][03784] Avg episode reward: [(0, '0.748')] [2024-03-21 09:01:30,521][03784] Fps is (10 sec: 45875.5, 60 sec: 48059.8, 300 sec: 46097.4). Total num frames: 1715666944. Throughput: 0: 43677.8. Samples: 1716784600. Policy #0 lag: (min: 1.0, avg: 32.8, max: 67.0) [2024-03-21 09:01:30,521][03784] Avg episode reward: [(0, '1.416')] [2024-03-21 09:01:32,485][04017] Updated weights for policy 0, policy_version 52360 (0.0011) [2024-03-21 09:01:35,521][03784] Fps is (10 sec: 29491.5, 60 sec: 44783.1, 300 sec: 45986.3). Total num frames: 1715798016. Throughput: 0: 44246.7. Samples: 1717077300. 
Policy #0 lag: (min: 1.0, avg: 32.8, max: 67.0) [2024-03-21 09:01:35,522][03784] Avg episode reward: [(0, '1.276')] [2024-03-21 09:01:40,521][03784] Fps is (10 sec: 29491.1, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1715961856. Throughput: 0: 44551.1. Samples: 1717359800. Policy #0 lag: (min: 0.0, avg: 36.1, max: 78.0) [2024-03-21 09:01:40,522][03784] Avg episode reward: [(0, '0.801')] [2024-03-21 09:01:43,056][04017] Updated weights for policy 0, policy_version 52370 (0.0011) [2024-03-21 09:01:45,521][03784] Fps is (10 sec: 36044.6, 60 sec: 42052.2, 300 sec: 44875.5). Total num frames: 1716158464. Throughput: 0: 44413.3. Samples: 1717489000. Policy #0 lag: (min: 0.0, avg: 36.1, max: 78.0) [2024-03-21 09:01:45,522][03784] Avg episode reward: [(0, '1.335')] [2024-03-21 09:01:50,521][03784] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 44875.5). Total num frames: 1716355072. Throughput: 0: 44464.5. Samples: 1717750600. Policy #0 lag: (min: 0.0, avg: 36.1, max: 78.0) [2024-03-21 09:01:50,522][03784] Avg episode reward: [(0, '0.877')] [2024-03-21 09:01:52,339][04017] Updated weights for policy 0, policy_version 52380 (0.0010) [2024-03-21 09:01:55,329][03995] Signal inference workers to stop experience collection... (34600 times) [2024-03-21 09:01:55,383][03995] Signal inference workers to resume experience collection... (34600 times) [2024-03-21 09:01:55,399][04017] InferenceWorker_p0-w0: stopping experience collection (34600 times) [2024-03-21 09:01:55,450][04017] InferenceWorker_p0-w0: resuming experience collection (34600 times) [2024-03-21 09:01:55,521][03784] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 45208.7). Total num frames: 1716617216. Throughput: 0: 45573.0. Samples: 1718026000. 
Policy #0 lag: (min: 0.0, avg: 36.1, max: 78.0) [2024-03-21 09:01:55,522][03784] Avg episode reward: [(0, '1.298')] [2024-03-21 09:01:56,662][04017] Updated weights for policy 0, policy_version 52390 (0.0012) [2024-03-21 09:02:00,521][03784] Fps is (10 sec: 52428.3, 60 sec: 44236.7, 300 sec: 45541.9). Total num frames: 1716879360. Throughput: 0: 45793.2. Samples: 1718159900. Policy #0 lag: (min: 0.0, avg: 36.1, max: 78.0) [2024-03-21 09:02:00,522][03784] Avg episode reward: [(0, '0.886')] [2024-03-21 09:02:00,743][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052396_1716912128.pth... [2024-03-21 09:02:00,869][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052063_1706000384.pth [2024-03-21 09:02:02,452][04017] Updated weights for policy 0, policy_version 52400 (0.0016) [2024-03-21 09:02:05,521][03784] Fps is (10 sec: 55706.8, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1717174272. Throughput: 0: 45711.2. Samples: 1718421800. Policy #0 lag: (min: 0.0, avg: 36.1, max: 78.0) [2024-03-21 09:02:05,522][03784] Avg episode reward: [(0, '1.330')] [2024-03-21 09:02:09,550][04017] Updated weights for policy 0, policy_version 52410 (0.0011) [2024-03-21 09:02:10,521][03784] Fps is (10 sec: 55706.2, 60 sec: 45329.0, 300 sec: 45875.2). Total num frames: 1717436416. Throughput: 0: 45408.9. Samples: 1718703600. Policy #0 lag: (min: 0.0, avg: 44.1, max: 114.0) [2024-03-21 09:02:10,522][03784] Avg episode reward: [(0, '1.565')] [2024-03-21 09:02:15,521][03784] Fps is (10 sec: 49151.5, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1717665792. Throughput: 0: 45611.0. Samples: 1718837100. Policy #0 lag: (min: 0.0, avg: 44.1, max: 114.0) [2024-03-21 09:02:15,522][03784] Avg episode reward: [(0, '1.565')] [2024-03-21 09:02:15,794][04017] Updated weights for policy 0, policy_version 52420 (0.0015) [2024-03-21 09:02:20,521][03784] Fps is (10 sec: 55705.2, 60 sec: 46421.3, 300 sec: 45764.1). 
Total num frames: 1717993472. Throughput: 0: 45188.8. Samples: 1719110800. Policy #0 lag: (min: 0.0, avg: 44.1, max: 114.0) [2024-03-21 09:02:20,522][03784] Avg episode reward: [(0, '0.597')] [2024-03-21 09:02:24,524][04017] Updated weights for policy 0, policy_version 52430 (0.0011) [2024-03-21 09:02:25,521][03784] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 45097.6). Total num frames: 1718059008. Throughput: 0: 45379.9. Samples: 1719401900. Policy #0 lag: (min: 0.0, avg: 44.1, max: 114.0) [2024-03-21 09:02:25,522][03784] Avg episode reward: [(0, '0.597')] [2024-03-21 09:02:30,529][03784] Fps is (10 sec: 29467.3, 60 sec: 43684.6, 300 sec: 44874.3). Total num frames: 1718288384. Throughput: 0: 45449.5. Samples: 1719534600. Policy #0 lag: (min: 0.0, avg: 44.1, max: 114.0) [2024-03-21 09:02:30,531][03784] Avg episode reward: [(0, '0.984')] [2024-03-21 09:02:31,188][04017] Updated weights for policy 0, policy_version 52440 (0.0015) [2024-03-21 09:02:34,601][04017] Updated weights for policy 0, policy_version 52450 (0.0021) [2024-03-21 09:02:35,521][03784] Fps is (10 sec: 65536.8, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 1718714368. Throughput: 0: 45475.6. Samples: 1719797000. Policy #0 lag: (min: 0.0, avg: 44.1, max: 114.0) [2024-03-21 09:02:35,522][03784] Avg episode reward: [(0, '1.328')] [2024-03-21 09:02:40,521][03784] Fps is (10 sec: 62310.5, 60 sec: 49152.0, 300 sec: 45764.1). Total num frames: 1718910976. Throughput: 0: 45184.6. Samples: 1720059300. Policy #0 lag: (min: 0.0, avg: 44.1, max: 114.0) [2024-03-21 09:02:40,522][03784] Avg episode reward: [(0, '1.305')] [2024-03-21 09:02:45,521][03784] Fps is (10 sec: 26214.4, 60 sec: 46967.5, 300 sec: 45653.1). Total num frames: 1718976512. Throughput: 0: 45753.5. Samples: 1720218800. 
Policy #0 lag: (min: 0.0, avg: 39.8, max: 120.0) [2024-03-21 09:02:45,522][03784] Avg episode reward: [(0, '1.168')] [2024-03-21 09:02:45,574][04017] Updated weights for policy 0, policy_version 52460 (0.0015) [2024-03-21 09:02:45,933][03995] Signal inference workers to stop experience collection... (34650 times) [2024-03-21 09:02:45,995][03995] Signal inference workers to resume experience collection... (34650 times) [2024-03-21 09:02:46,002][04017] InferenceWorker_p0-w0: stopping experience collection (34650 times) [2024-03-21 09:02:46,055][04017] InferenceWorker_p0-w0: resuming experience collection (34650 times) [2024-03-21 09:02:50,521][03784] Fps is (10 sec: 36044.7, 60 sec: 48605.9, 300 sec: 45764.1). Total num frames: 1719271424. Throughput: 0: 46433.3. Samples: 1720511300. Policy #0 lag: (min: 0.0, avg: 39.8, max: 120.0) [2024-03-21 09:02:50,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 09:02:52,775][04017] Updated weights for policy 0, policy_version 52470 (0.0010) [2024-03-21 09:02:55,521][03784] Fps is (10 sec: 49152.0, 60 sec: 47513.7, 300 sec: 45764.1). Total num frames: 1719468032. Throughput: 0: 46760.0. Samples: 1720807800. Policy #0 lag: (min: 0.0, avg: 39.8, max: 120.0) [2024-03-21 09:02:55,522][03784] Avg episode reward: [(0, '1.269')] [2024-03-21 09:03:00,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1719631872. Throughput: 0: 46969.0. Samples: 1720950700. Policy #0 lag: (min: 0.0, avg: 39.8, max: 120.0) [2024-03-21 09:03:00,522][03784] Avg episode reward: [(0, '1.259')] [2024-03-21 09:03:00,636][04017] Updated weights for policy 0, policy_version 52480 (0.0010) [2024-03-21 09:03:05,521][03784] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1719828480. Throughput: 0: 47284.6. Samples: 1721238600. 
Policy #0 lag: (min: 0.0, avg: 39.8, max: 120.0) [2024-03-21 09:03:05,522][03784] Avg episode reward: [(0, '1.343')] [2024-03-21 09:03:10,508][04017] Updated weights for policy 0, policy_version 52490 (0.0012) [2024-03-21 09:03:10,521][03784] Fps is (10 sec: 36044.4, 60 sec: 42598.4, 300 sec: 45097.6). Total num frames: 1719992320. Throughput: 0: 47095.6. Samples: 1721521200. Policy #0 lag: (min: 0.0, avg: 39.8, max: 120.0) [2024-03-21 09:03:10,522][03784] Avg episode reward: [(0, '1.142')] [2024-03-21 09:03:15,298][04017] Updated weights for policy 0, policy_version 52500 (0.0013) [2024-03-21 09:03:15,521][03784] Fps is (10 sec: 49151.9, 60 sec: 44236.9, 300 sec: 45430.9). Total num frames: 1720320000. Throughput: 0: 47355.3. Samples: 1721665200. Policy #0 lag: (min: 4.0, avg: 50.1, max: 103.0) [2024-03-21 09:03:15,522][03784] Avg episode reward: [(0, '1.133')] [2024-03-21 09:03:20,521][03784] Fps is (10 sec: 62259.7, 60 sec: 43690.7, 300 sec: 45542.0). Total num frames: 1720614912. Throughput: 0: 47397.8. Samples: 1721929900. Policy #0 lag: (min: 4.0, avg: 50.1, max: 103.0) [2024-03-21 09:03:20,522][03784] Avg episode reward: [(0, '1.470')] [2024-03-21 09:03:21,540][04017] Updated weights for policy 0, policy_version 52510 (0.0016) [2024-03-21 09:03:25,521][03784] Fps is (10 sec: 58983.2, 60 sec: 47513.8, 300 sec: 45542.0). Total num frames: 1720909824. Throughput: 0: 47473.4. Samples: 1722195600. Policy #0 lag: (min: 4.0, avg: 50.1, max: 103.0) [2024-03-21 09:03:25,521][03784] Avg episode reward: [(0, '1.186')] [2024-03-21 09:03:25,836][04017] Updated weights for policy 0, policy_version 52520 (0.0012) [2024-03-21 09:03:30,521][03784] Fps is (10 sec: 58981.4, 60 sec: 48612.4, 300 sec: 45764.1). Total num frames: 1721204736. Throughput: 0: 46862.0. Samples: 1722327600. 
Policy #0 lag: (min: 4.0, avg: 50.1, max: 103.0) [2024-03-21 09:03:30,522][03784] Avg episode reward: [(0, '1.451')] [2024-03-21 09:03:31,663][04017] Updated weights for policy 0, policy_version 52530 (0.0011) [2024-03-21 09:03:34,018][03995] Signal inference workers to stop experience collection... (34700 times) [2024-03-21 09:03:34,088][04017] InferenceWorker_p0-w0: stopping experience collection (34700 times) [2024-03-21 09:03:34,095][03995] Signal inference workers to resume experience collection... (34700 times) [2024-03-21 09:03:34,151][04017] InferenceWorker_p0-w0: resuming experience collection (34700 times) [2024-03-21 09:03:35,521][03784] Fps is (10 sec: 68812.1, 60 sec: 48059.8, 300 sec: 45875.2). Total num frames: 1721597952. Throughput: 0: 46537.8. Samples: 1722605500. Policy #0 lag: (min: 4.0, avg: 50.1, max: 103.0) [2024-03-21 09:03:35,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 09:03:36,870][04017] Updated weights for policy 0, policy_version 52540 (0.0012) [2024-03-21 09:03:40,521][03784] Fps is (10 sec: 42599.0, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1721630720. Throughput: 0: 46924.4. Samples: 1722919400. Policy #0 lag: (min: 4.0, avg: 50.1, max: 103.0) [2024-03-21 09:03:40,522][03784] Avg episode reward: [(0, '1.237')] [2024-03-21 09:03:45,521][03784] Fps is (10 sec: 29491.0, 60 sec: 48605.8, 300 sec: 46097.4). Total num frames: 1721892864. Throughput: 0: 46555.5. Samples: 1723045700. Policy #0 lag: (min: 4.0, avg: 50.1, max: 103.0) [2024-03-21 09:03:45,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 09:03:45,909][04017] Updated weights for policy 0, policy_version 52550 (0.0028) [2024-03-21 09:03:50,521][03784] Fps is (10 sec: 52429.2, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 1722155008. Throughput: 0: 46668.9. Samples: 1723338700. 
Policy #0 lag: (min: 0.0, avg: 48.3, max: 100.0) [2024-03-21 09:03:50,522][03784] Avg episode reward: [(0, '1.310')] [2024-03-21 09:03:55,521][03784] Fps is (10 sec: 29491.7, 60 sec: 45329.2, 300 sec: 45764.2). Total num frames: 1722187776. Throughput: 0: 46998.0. Samples: 1723636100. Policy #0 lag: (min: 0.0, avg: 48.3, max: 100.0) [2024-03-21 09:03:55,522][03784] Avg episode reward: [(0, '0.793')] [2024-03-21 09:03:57,771][04017] Updated weights for policy 0, policy_version 52560 (0.0022) [2024-03-21 09:04:00,521][03784] Fps is (10 sec: 26214.2, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1722417152. Throughput: 0: 46846.7. Samples: 1723773300. Policy #0 lag: (min: 0.0, avg: 48.3, max: 100.0) [2024-03-21 09:04:00,522][03784] Avg episode reward: [(0, '0.721')] [2024-03-21 09:04:00,534][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052564_1722417152.pth... [2024-03-21 09:04:00,665][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052235_1711636480.pth [2024-03-21 09:04:05,521][03784] Fps is (10 sec: 39321.0, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1722580992. Throughput: 0: 47155.5. Samples: 1724051900. Policy #0 lag: (min: 0.0, avg: 48.3, max: 100.0) [2024-03-21 09:04:05,522][03784] Avg episode reward: [(0, '1.653')] [2024-03-21 09:04:05,685][04017] Updated weights for policy 0, policy_version 52570 (0.0016) [2024-03-21 09:04:10,521][03784] Fps is (10 sec: 36044.7, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1722777600. Throughput: 0: 47082.0. Samples: 1724314300. Policy #0 lag: (min: 0.0, avg: 48.3, max: 100.0) [2024-03-21 09:04:10,522][03784] Avg episode reward: [(0, '0.826')] [2024-03-21 09:04:12,347][04017] Updated weights for policy 0, policy_version 52580 (0.0011) [2024-03-21 09:04:15,521][03784] Fps is (10 sec: 52429.2, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1723105280. Throughput: 0: 47122.5. Samples: 1724448100. 
Policy #0 lag: (min: 0.0, avg: 48.3, max: 100.0) [2024-03-21 09:04:15,522][03784] Avg episode reward: [(0, '0.996')] [2024-03-21 09:04:19,276][04017] Updated weights for policy 0, policy_version 52590 (0.0011) [2024-03-21 09:04:20,521][03784] Fps is (10 sec: 58982.1, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1723367424. Throughput: 0: 47004.3. Samples: 1724720700. Policy #0 lag: (min: 0.0, avg: 48.3, max: 100.0) [2024-03-21 09:04:20,522][03784] Avg episode reward: [(0, '1.342')] [2024-03-21 09:04:25,521][03784] Fps is (10 sec: 49152.0, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1723596800. Throughput: 0: 46015.6. Samples: 1724990100. Policy #0 lag: (min: 2.0, avg: 35.2, max: 69.0) [2024-03-21 09:04:25,522][03784] Avg episode reward: [(0, '1.097')] [2024-03-21 09:04:25,524][04017] Updated weights for policy 0, policy_version 52600 (0.0010) [2024-03-21 09:04:30,521][03784] Fps is (10 sec: 45875.4, 60 sec: 43690.8, 300 sec: 45430.9). Total num frames: 1723826176. Throughput: 0: 46177.8. Samples: 1725123700. Policy #0 lag: (min: 2.0, avg: 35.2, max: 69.0) [2024-03-21 09:04:30,522][03784] Avg episode reward: [(0, '1.412')] [2024-03-21 09:04:30,757][03995] Signal inference workers to stop experience collection... (34750 times) [2024-03-21 09:04:30,758][03995] Signal inference workers to resume experience collection... (34750 times) [2024-03-21 09:04:30,857][04017] InferenceWorker_p0-w0: stopping experience collection (34750 times) [2024-03-21 09:04:30,857][04017] InferenceWorker_p0-w0: resuming experience collection (34750 times) [2024-03-21 09:04:32,663][04017] Updated weights for policy 0, policy_version 52610 (0.0016) [2024-03-21 09:04:35,521][03784] Fps is (10 sec: 52428.3, 60 sec: 42052.2, 300 sec: 45764.1). Total num frames: 1724121088. Throughput: 0: 45537.7. Samples: 1725387900. 
Policy #0 lag: (min: 2.0, avg: 35.2, max: 69.0) [2024-03-21 09:04:35,522][03784] Avg episode reward: [(0, '1.420')] [2024-03-21 09:04:37,887][04017] Updated weights for policy 0, policy_version 52620 (0.0013) [2024-03-21 09:04:40,521][03784] Fps is (10 sec: 52428.3, 60 sec: 45329.0, 300 sec: 45986.2). Total num frames: 1724350464. Throughput: 0: 44482.0. Samples: 1725637800. Policy #0 lag: (min: 2.0, avg: 35.2, max: 69.0) [2024-03-21 09:04:40,522][03784] Avg episode reward: [(0, '1.233')] [2024-03-21 09:04:45,521][03784] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 45653.1). Total num frames: 1724514304. Throughput: 0: 44546.7. Samples: 1725777900. Policy #0 lag: (min: 2.0, avg: 35.2, max: 69.0) [2024-03-21 09:04:45,522][03784] Avg episode reward: [(0, '0.571')] [2024-03-21 09:04:47,516][04017] Updated weights for policy 0, policy_version 52630 (0.0016) [2024-03-21 09:04:50,521][03784] Fps is (10 sec: 36045.0, 60 sec: 42598.3, 300 sec: 44875.5). Total num frames: 1724710912. Throughput: 0: 44382.2. Samples: 1726049100. Policy #0 lag: (min: 2.0, avg: 35.2, max: 69.0) [2024-03-21 09:04:50,522][03784] Avg episode reward: [(0, '1.531')] [2024-03-21 09:04:54,234][04017] Updated weights for policy 0, policy_version 52640 (0.0011) [2024-03-21 09:04:55,521][03784] Fps is (10 sec: 49152.3, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 1725005824. Throughput: 0: 44202.4. Samples: 1726303400. Policy #0 lag: (min: 2.0, avg: 35.2, max: 69.0) [2024-03-21 09:04:55,521][03784] Avg episode reward: [(0, '1.058')] [2024-03-21 09:04:58,512][04017] Updated weights for policy 0, policy_version 52650 (0.0018) [2024-03-21 09:05:00,521][03784] Fps is (10 sec: 55705.8, 60 sec: 47513.6, 300 sec: 45653.0). Total num frames: 1725267968. Throughput: 0: 44259.9. Samples: 1726439800. 
Policy #0 lag: (min: 1.0, avg: 53.5, max: 119.0) [2024-03-21 09:05:00,522][03784] Avg episode reward: [(0, '1.494')] [2024-03-21 09:05:05,521][03784] Fps is (10 sec: 39321.3, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 1725399040. Throughput: 0: 44620.1. Samples: 1726728600. Policy #0 lag: (min: 1.0, avg: 53.5, max: 119.0) [2024-03-21 09:05:05,522][03784] Avg episode reward: [(0, '1.160')] [2024-03-21 09:05:07,601][04017] Updated weights for policy 0, policy_version 52660 (0.0011) [2024-03-21 09:05:10,521][03784] Fps is (10 sec: 42598.4, 60 sec: 48605.9, 300 sec: 45542.0). Total num frames: 1725693952. Throughput: 0: 44413.2. Samples: 1726988700. Policy #0 lag: (min: 1.0, avg: 53.5, max: 119.0) [2024-03-21 09:05:10,522][03784] Avg episode reward: [(0, '1.352')] [2024-03-21 09:05:15,521][03784] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1725825024. Throughput: 0: 44617.9. Samples: 1727131500. Policy #0 lag: (min: 1.0, avg: 53.5, max: 119.0) [2024-03-21 09:05:15,522][03784] Avg episode reward: [(0, '1.576')] [2024-03-21 09:05:16,658][04017] Updated weights for policy 0, policy_version 52670 (0.0014) [2024-03-21 09:05:20,521][03784] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1726087168. Throughput: 0: 45060.0. Samples: 1727415600. Policy #0 lag: (min: 1.0, avg: 53.5, max: 119.0) [2024-03-21 09:05:20,522][03784] Avg episode reward: [(0, '1.576')] [2024-03-21 09:05:23,224][04017] Updated weights for policy 0, policy_version 52680 (0.0012) [2024-03-21 09:05:25,521][03784] Fps is (10 sec: 55705.6, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 1726382080. Throughput: 0: 44871.3. Samples: 1727657000. Policy #0 lag: (min: 1.0, avg: 53.5, max: 119.0) [2024-03-21 09:05:25,522][03784] Avg episode reward: [(0, '1.426')] [2024-03-21 09:05:29,378][03995] Signal inference workers to stop experience collection... 
(34800 times) [2024-03-21 09:05:29,444][04017] InferenceWorker_p0-w0: stopping experience collection (34800 times) [2024-03-21 09:05:29,459][03995] Signal inference workers to resume experience collection... (34800 times) [2024-03-21 09:05:29,492][04017] InferenceWorker_p0-w0: resuming experience collection (34800 times) [2024-03-21 09:05:29,868][04017] Updated weights for policy 0, policy_version 52690 (0.0017) [2024-03-21 09:05:30,521][03784] Fps is (10 sec: 49152.3, 60 sec: 45875.2, 300 sec: 45653.1). Total num frames: 1726578688. Throughput: 0: 44671.1. Samples: 1727788100. Policy #0 lag: (min: 2.0, avg: 35.4, max: 70.0) [2024-03-21 09:05:30,522][03784] Avg episode reward: [(0, '1.466')] [2024-03-21 09:05:35,521][03784] Fps is (10 sec: 42597.9, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 1726808064. Throughput: 0: 44504.5. Samples: 1728051800. Policy #0 lag: (min: 2.0, avg: 35.4, max: 70.0) [2024-03-21 09:05:35,522][03784] Avg episode reward: [(0, '1.579')] [2024-03-21 09:05:37,153][04017] Updated weights for policy 0, policy_version 52700 (0.0018) [2024-03-21 09:05:40,521][03784] Fps is (10 sec: 29491.1, 60 sec: 42052.3, 300 sec: 44875.5). Total num frames: 1726873600. Throughput: 0: 45297.6. Samples: 1728341800. Policy #0 lag: (min: 2.0, avg: 35.4, max: 70.0) [2024-03-21 09:05:40,522][03784] Avg episode reward: [(0, '1.087')] [2024-03-21 09:05:45,521][03784] Fps is (10 sec: 19661.0, 60 sec: 41506.1, 300 sec: 44653.3). Total num frames: 1727004672. Throughput: 0: 45415.6. Samples: 1728483500. Policy #0 lag: (min: 2.0, avg: 35.4, max: 70.0) [2024-03-21 09:05:45,522][03784] Avg episode reward: [(0, '1.258')] [2024-03-21 09:05:50,521][03784] Fps is (10 sec: 29491.3, 60 sec: 40960.1, 300 sec: 44431.2). Total num frames: 1727168512. Throughput: 0: 45084.4. Samples: 1728757400. 
Policy #0 lag: (min: 2.0, avg: 35.4, max: 70.0) [2024-03-21 09:05:50,522][03784] Avg episode reward: [(0, '0.887')] [2024-03-21 09:05:51,236][04017] Updated weights for policy 0, policy_version 52710 (0.0011) [2024-03-21 09:05:54,558][04017] Updated weights for policy 0, policy_version 52720 (0.0014) [2024-03-21 09:05:55,521][03784] Fps is (10 sec: 58981.7, 60 sec: 43144.4, 300 sec: 45319.8). Total num frames: 1727594496. Throughput: 0: 44806.6. Samples: 1729005000. Policy #0 lag: (min: 2.0, avg: 35.4, max: 70.0) [2024-03-21 09:05:55,522][03784] Avg episode reward: [(0, '1.251')] [2024-03-21 09:05:59,542][04017] Updated weights for policy 0, policy_version 52730 (0.0010) [2024-03-21 09:06:00,521][03784] Fps is (10 sec: 72091.0, 60 sec: 43690.9, 300 sec: 45542.0). Total num frames: 1727889408. Throughput: 0: 44346.8. Samples: 1729127100. Policy #0 lag: (min: 2.0, avg: 35.4, max: 70.0) [2024-03-21 09:06:00,521][03784] Avg episode reward: [(0, '1.404')] [2024-03-21 09:06:00,581][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052732_1727922176.pth... [2024-03-21 09:06:00,724][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052396_1716912128.pth [2024-03-21 09:06:05,294][04017] Updated weights for policy 0, policy_version 52740 (0.0016) [2024-03-21 09:06:05,521][03784] Fps is (10 sec: 58982.8, 60 sec: 46421.3, 300 sec: 45653.0). Total num frames: 1728184320. Throughput: 0: 44048.9. Samples: 1729397800. Policy #0 lag: (min: 0.0, avg: 47.4, max: 94.0) [2024-03-21 09:06:05,522][03784] Avg episode reward: [(0, '1.107')] [2024-03-21 09:06:10,521][03784] Fps is (10 sec: 55704.6, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 1728446464. Throughput: 0: 44302.2. Samples: 1729650600. 
Policy #0 lag: (min: 0.0, avg: 47.4, max: 94.0) [2024-03-21 09:06:10,522][03784] Avg episode reward: [(0, '1.197')] [2024-03-21 09:06:11,039][04017] Updated weights for policy 0, policy_version 52750 (0.0013) [2024-03-21 09:06:15,521][03784] Fps is (10 sec: 52428.8, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 1728708608. Throughput: 0: 44337.8. Samples: 1729783300. Policy #0 lag: (min: 0.0, avg: 47.4, max: 94.0) [2024-03-21 09:06:15,522][03784] Avg episode reward: [(0, '1.389')] [2024-03-21 09:06:17,851][03995] Signal inference workers to stop experience collection... (34850 times) [2024-03-21 09:06:18,006][04017] InferenceWorker_p0-w0: stopping experience collection (34850 times) [2024-03-21 09:06:18,061][03995] Signal inference workers to resume experience collection... (34850 times) [2024-03-21 09:06:18,067][04017] InferenceWorker_p0-w0: resuming experience collection (34850 times) [2024-03-21 09:06:19,236][04017] Updated weights for policy 0, policy_version 52760 (0.0011) [2024-03-21 09:06:20,521][03784] Fps is (10 sec: 49151.7, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1728937984. Throughput: 0: 44824.5. Samples: 1730068900. Policy #0 lag: (min: 0.0, avg: 47.4, max: 94.0) [2024-03-21 09:06:20,522][03784] Avg episode reward: [(0, '0.840')] [2024-03-21 09:06:25,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 1729101824. Throughput: 0: 44882.3. Samples: 1730361500. Policy #0 lag: (min: 0.0, avg: 47.4, max: 94.0) [2024-03-21 09:06:25,522][03784] Avg episode reward: [(0, '1.148')] [2024-03-21 09:06:29,210][04017] Updated weights for policy 0, policy_version 52770 (0.0020) [2024-03-21 09:06:30,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1729200128. Throughput: 0: 45215.5. Samples: 1730518200. 
Policy #0 lag: (min: 0.0, avg: 47.4, max: 94.0) [2024-03-21 09:06:30,522][03784] Avg episode reward: [(0, '1.148')] [2024-03-21 09:06:35,521][03784] Fps is (10 sec: 26214.2, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 1729363968. Throughput: 0: 45040.0. Samples: 1730784200. Policy #0 lag: (min: 0.0, avg: 47.4, max: 94.0) [2024-03-21 09:06:35,522][03784] Avg episode reward: [(0, '1.410')] [2024-03-21 09:06:38,560][04017] Updated weights for policy 0, policy_version 52780 (0.0028) [2024-03-21 09:06:40,521][03784] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 1729593344. Throughput: 0: 45611.2. Samples: 1731057500. Policy #0 lag: (min: 0.0, avg: 27.4, max: 74.0) [2024-03-21 09:06:40,522][03784] Avg episode reward: [(0, '1.273')] [2024-03-21 09:06:44,420][04017] Updated weights for policy 0, policy_version 52790 (0.0011) [2024-03-21 09:06:45,521][03784] Fps is (10 sec: 49151.3, 60 sec: 47513.4, 300 sec: 45764.1). Total num frames: 1729855488. Throughput: 0: 46024.1. Samples: 1731198200. Policy #0 lag: (min: 0.0, avg: 27.4, max: 74.0) [2024-03-21 09:06:45,522][03784] Avg episode reward: [(0, '1.423')] [2024-03-21 09:06:50,521][03784] Fps is (10 sec: 45874.8, 60 sec: 48059.7, 300 sec: 45542.0). Total num frames: 1730052096. Throughput: 0: 46293.3. Samples: 1731481000. Policy #0 lag: (min: 0.0, avg: 27.4, max: 74.0) [2024-03-21 09:06:50,522][03784] Avg episode reward: [(0, '1.422')] [2024-03-21 09:06:52,256][04017] Updated weights for policy 0, policy_version 52800 (0.0011) [2024-03-21 09:06:55,521][03784] Fps is (10 sec: 55706.9, 60 sec: 46967.6, 300 sec: 45875.2). Total num frames: 1730412544. Throughput: 0: 46417.8. Samples: 1731739400. 
Policy #0 lag: (min: 0.0, avg: 27.4, max: 74.0) [2024-03-21 09:06:55,522][03784] Avg episode reward: [(0, '1.176')] [2024-03-21 09:06:56,403][04017] Updated weights for policy 0, policy_version 52810 (0.0011) [2024-03-21 09:07:00,521][03784] Fps is (10 sec: 68812.8, 60 sec: 47513.4, 300 sec: 45986.3). Total num frames: 1730740224. Throughput: 0: 46506.6. Samples: 1731876100. Policy #0 lag: (min: 0.0, avg: 27.4, max: 74.0) [2024-03-21 09:07:00,522][03784] Avg episode reward: [(0, '1.176')] [2024-03-21 09:07:01,183][04017] Updated weights for policy 0, policy_version 52820 (0.0013) [2024-03-21 09:07:05,521][03784] Fps is (10 sec: 65535.5, 60 sec: 48059.7, 300 sec: 46208.4). Total num frames: 1731067904. Throughput: 0: 46122.3. Samples: 1732144400. Policy #0 lag: (min: 0.0, avg: 27.4, max: 74.0) [2024-03-21 09:07:05,522][03784] Avg episode reward: [(0, '0.732')] [2024-03-21 09:07:07,214][04017] Updated weights for policy 0, policy_version 52830 (0.0012) [2024-03-21 09:07:10,087][03995] Signal inference workers to stop experience collection... (34900 times) [2024-03-21 09:07:10,088][03995] Signal inference workers to resume experience collection... (34900 times) [2024-03-21 09:07:10,162][04017] InferenceWorker_p0-w0: stopping experience collection (34900 times) [2024-03-21 09:07:10,162][04017] InferenceWorker_p0-w0: resuming experience collection (34900 times) [2024-03-21 09:07:10,521][03784] Fps is (10 sec: 55705.9, 60 sec: 47513.6, 300 sec: 46208.4). Total num frames: 1731297280. Throughput: 0: 45860.0. Samples: 1732425200. Policy #0 lag: (min: 2.0, avg: 48.4, max: 100.0) [2024-03-21 09:07:10,522][03784] Avg episode reward: [(0, '1.664')] [2024-03-21 09:07:15,521][03784] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 1731428352. Throughput: 0: 45704.5. Samples: 1732574900. 
Policy #0 lag: (min: 2.0, avg: 48.4, max: 100.0) [2024-03-21 09:07:15,522][03784] Avg episode reward: [(0, '0.677')] [2024-03-21 09:07:16,036][04017] Updated weights for policy 0, policy_version 52840 (0.0012) [2024-03-21 09:07:20,521][03784] Fps is (10 sec: 26214.5, 60 sec: 43690.7, 300 sec: 45764.1). Total num frames: 1731559424. Throughput: 0: 46240.1. Samples: 1732865000. Policy #0 lag: (min: 2.0, avg: 48.4, max: 100.0) [2024-03-21 09:07:20,522][03784] Avg episode reward: [(0, '1.222')] [2024-03-21 09:07:25,521][03784] Fps is (10 sec: 29491.2, 60 sec: 43690.7, 300 sec: 45543.2). Total num frames: 1731723264. Throughput: 0: 46020.0. Samples: 1733128400. Policy #0 lag: (min: 2.0, avg: 48.4, max: 100.0) [2024-03-21 09:07:25,522][03784] Avg episode reward: [(0, '1.477')] [2024-03-21 09:07:26,805][04017] Updated weights for policy 0, policy_version 52850 (0.0012) [2024-03-21 09:07:30,521][03784] Fps is (10 sec: 36044.4, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1731919872. Throughput: 0: 46211.2. Samples: 1733277700. Policy #0 lag: (min: 2.0, avg: 48.4, max: 100.0) [2024-03-21 09:07:30,522][03784] Avg episode reward: [(0, '1.290')] [2024-03-21 09:07:35,521][03784] Fps is (10 sec: 32768.1, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 1732050944. Throughput: 0: 46624.6. Samples: 1733579100. Policy #0 lag: (min: 2.0, avg: 48.4, max: 100.0) [2024-03-21 09:07:35,522][03784] Avg episode reward: [(0, '1.373')] [2024-03-21 09:07:36,748][04017] Updated weights for policy 0, policy_version 52860 (0.0013) [2024-03-21 09:07:40,521][03784] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1732345856. Throughput: 0: 46991.0. Samples: 1733854000. 
Policy #0 lag: (min: 2.0, avg: 48.4, max: 100.0) [2024-03-21 09:07:40,522][03784] Avg episode reward: [(0, '1.550')] [2024-03-21 09:07:41,163][04017] Updated weights for policy 0, policy_version 52870 (0.0011) [2024-03-21 09:07:45,521][03784] Fps is (10 sec: 65536.1, 60 sec: 47513.8, 300 sec: 45542.0). Total num frames: 1732706304. Throughput: 0: 46400.1. Samples: 1733964100. Policy #0 lag: (min: 0.0, avg: 41.7, max: 95.0) [2024-03-21 09:07:45,522][03784] Avg episode reward: [(0, '0.692')] [2024-03-21 09:07:47,010][04017] Updated weights for policy 0, policy_version 52880 (0.0021) [2024-03-21 09:07:50,347][04017] Updated weights for policy 0, policy_version 52890 (0.0019) [2024-03-21 09:07:50,521][03784] Fps is (10 sec: 75365.3, 60 sec: 50790.3, 300 sec: 46208.4). Total num frames: 1733099520. Throughput: 0: 46246.5. Samples: 1734225500. Policy #0 lag: (min: 0.0, avg: 41.7, max: 95.0) [2024-03-21 09:07:50,522][03784] Avg episode reward: [(0, '1.454')] [2024-03-21 09:07:55,521][03784] Fps is (10 sec: 58981.5, 60 sec: 48059.6, 300 sec: 46319.5). Total num frames: 1733296128. Throughput: 0: 45948.8. Samples: 1734492900. Policy #0 lag: (min: 0.0, avg: 41.7, max: 95.0) [2024-03-21 09:07:55,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 09:07:57,883][04017] Updated weights for policy 0, policy_version 52900 (0.0010) [2024-03-21 09:08:00,422][03995] Signal inference workers to stop experience collection... (34950 times) [2024-03-21 09:08:00,423][03995] Signal inference workers to resume experience collection... (34950 times) [2024-03-21 09:08:00,492][04017] InferenceWorker_p0-w0: stopping experience collection (34950 times) [2024-03-21 09:08:00,499][04017] InferenceWorker_p0-w0: resuming experience collection (34950 times) [2024-03-21 09:08:00,521][03784] Fps is (10 sec: 45875.0, 60 sec: 46967.4, 300 sec: 46541.6). Total num frames: 1733558272. Throughput: 0: 45750.9. Samples: 1734633700. 
Policy #0 lag: (min: 0.0, avg: 41.7, max: 95.0) [2024-03-21 09:08:00,522][03784] Avg episode reward: [(0, '1.425')] [2024-03-21 09:08:00,793][03995] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052905_1733591040.pth... [2024-03-21 09:08:00,947][03995] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052564_1722417152.pth [2024-03-21 09:08:05,521][03784] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 1733722112. Throughput: 0: 45620.0. Samples: 1734917900. Policy #0 lag: (min: 0.0, avg: 41.7, max: 95.0) [2024-03-21 09:08:05,523][03784] Avg episode reward: [(0, '1.396')] [2024-03-21 09:08:05,606][04017] Updated weights for policy 0, policy_version 52910 (0.0011) [2024-03-21 09:08:10,521][03784] Fps is (10 sec: 36045.4, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 1733918720. Throughput: 0: 45915.5. Samples: 1735194600. Policy #0 lag: (min: 0.0, avg: 41.7, max: 95.0) [2024-03-21 09:08:10,522][03784] Avg episode reward: [(0, '1.866')] [2024-03-21 09:08:15,521][03784] Fps is (10 sec: 29490.9, 60 sec: 43144.5, 300 sec: 45430.9). Total num frames: 1734017024. Throughput: 0: 45866.7. Samples: 1735341700. Policy #0 lag: (min: 0.0, avg: 41.7, max: 95.0) [2024-03-21 09:08:15,522][03784] Avg episode reward: [(0, '0.591')] [2024-03-21 10:06:55,762][11055] Saving configuration to /workspace/metta/train_dir/p2.objt_atn.4/config.json... 
[2024-03-21 10:06:55,778][11055] Rollout worker 0 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 1 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 2 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 3 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 4 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 5 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 6 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 7 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 8 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 9 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 10 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 11 uses device cpu [2024-03-21 10:06:55,778][11055] Rollout worker 12 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 13 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 14 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 15 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 16 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 17 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 18 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 19 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 20 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 21 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 22 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 23 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 24 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 25 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 26 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 27 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 28 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 29 uses device cpu 
[2024-03-21 10:06:55,779][11055] Rollout worker 30 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 31 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 32 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 33 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 34 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 35 uses device cpu [2024-03-21 10:06:55,779][11055] Rollout worker 36 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 37 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 38 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 39 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 40 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 41 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 42 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 43 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 44 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 45 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 46 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 47 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 48 uses device cpu [2024-03-21 10:06:55,780][11055] Rollout worker 49 uses device cpu [2024-03-21 10:07:00,041][11055] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:07:00,041][11055] InferenceWorker_p0-w0: min num requests: 16 [2024-03-21 10:07:00,103][11055] Starting all processes... [2024-03-21 10:07:00,103][11055] Starting process learner_proc0 [2024-03-21 10:07:00,201][11055] Starting all processes... 
[2024-03-21 10:07:00,204][11055] Starting process inference_proc0-0 [2024-03-21 10:07:00,204][11055] Starting process rollout_proc0 [2024-03-21 10:07:00,204][11055] Starting process rollout_proc1 [2024-03-21 10:07:00,204][11055] Starting process rollout_proc2 [2024-03-21 10:07:00,205][11055] Starting process rollout_proc3 [2024-03-21 10:07:00,205][11055] Starting process rollout_proc4 [2024-03-21 10:07:00,205][11055] Starting process rollout_proc5 [2024-03-21 10:07:00,205][11055] Starting process rollout_proc6 [2024-03-21 10:07:00,205][11055] Starting process rollout_proc7 [2024-03-21 10:07:00,206][11055] Starting process rollout_proc8 [2024-03-21 10:07:00,206][11055] Starting process rollout_proc9 [2024-03-21 10:07:00,206][11055] Starting process rollout_proc10 [2024-03-21 10:07:00,206][11055] Starting process rollout_proc11 [2024-03-21 10:07:00,206][11055] Starting process rollout_proc12 [2024-03-21 10:07:00,206][11055] Starting process rollout_proc13 [2024-03-21 10:07:00,206][11055] Starting process rollout_proc14 [2024-03-21 10:07:00,207][11055] Starting process rollout_proc15 [2024-03-21 10:07:00,208][11055] Starting process rollout_proc16 [2024-03-21 10:07:00,208][11055] Starting process rollout_proc17 [2024-03-21 10:07:00,209][11055] Starting process rollout_proc18 [2024-03-21 10:07:00,210][11055] Starting process rollout_proc19 [2024-03-21 10:07:00,212][11055] Starting process rollout_proc20 [2024-03-21 10:07:00,212][11055] Starting process rollout_proc21 [2024-03-21 10:07:00,215][11055] Starting process rollout_proc22 [2024-03-21 10:07:00,215][11055] Starting process rollout_proc23 [2024-03-21 10:07:00,217][11055] Starting process rollout_proc24 [2024-03-21 10:07:00,218][11055] Starting process rollout_proc25 [2024-03-21 10:07:00,219][11055] Starting process rollout_proc26 [2024-03-21 10:07:00,221][11055] Starting process rollout_proc27 [2024-03-21 10:07:00,224][11055] Starting process rollout_proc28 [2024-03-21 10:07:00,225][11055] Starting process 
rollout_proc29 [2024-03-21 10:07:00,296][11055] Starting process rollout_proc30 [2024-03-21 10:07:00,323][11055] Starting process rollout_proc31 [2024-03-21 10:07:00,339][11055] Starting process rollout_proc32 [2024-03-21 10:07:00,343][11055] Starting process rollout_proc33 [2024-03-21 10:07:00,343][11055] Starting process rollout_proc34 [2024-03-21 10:07:00,366][11055] Starting process rollout_proc35 [2024-03-21 10:07:00,367][11055] Starting process rollout_proc36 [2024-03-21 10:07:00,367][11055] Starting process rollout_proc37 [2024-03-21 10:07:00,368][11055] Starting process rollout_proc38 [2024-03-21 10:07:00,396][11055] Starting process rollout_proc41 [2024-03-21 10:07:00,396][11055] Starting process rollout_proc40 [2024-03-21 10:07:00,395][11055] Starting process rollout_proc39 [2024-03-21 10:07:00,429][11055] Starting process rollout_proc42 [2024-03-21 10:07:00,435][11055] Starting process rollout_proc43 [2024-03-21 10:07:00,436][11055] Starting process rollout_proc44 [2024-03-21 10:07:00,463][11055] Starting process rollout_proc45 [2024-03-21 10:07:00,465][11055] Starting process rollout_proc46 [2024-03-21 10:07:00,466][11055] Starting process rollout_proc47 [2024-03-21 10:07:00,466][11055] Starting process rollout_proc48 [2024-03-21 10:07:00,500][11055] Starting process rollout_proc49 [2024-03-21 10:07:03,210][11584] Worker 18 uses CPU cores [18] [2024-03-21 10:07:03,230][11842] Worker 28 uses CPU cores [28] [2024-03-21 10:07:03,302][11290] Worker 3 uses CPU cores [3] [2024-03-21 10:07:03,402][11288] Worker 1 uses CPU cores [1] [2024-03-21 10:07:03,518][11774] Worker 22 uses CPU cores [22] [2024-03-21 10:07:03,535][11974] Worker 37 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:03,614][11293] Worker 7 uses CPU cores [7] [2024-03-21 10:07:03,650][11942] Worker 32 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:03,712][11890] Worker 34 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:03,714][12575] Worker 49 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:03,730][11314] Worker 13 uses CPU cores [13] [2024-03-21 10:07:03,763][12072] Worker 40 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:03,806][11296] Worker 9 uses CPU cores [9] [2024-03-21 10:07:03,813][11332] Worker 14 uses CPU cores [14] [2024-03-21 10:07:03,814][11292] Worker 6 uses CPU cores [6] [2024-03-21 10:07:03,831][12480] Worker 46 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:03,880][11808] Worker 25 uses CPU cores [25] [2024-03-21 10:07:03,909][12574] Worker 47 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:03,982][11294] Worker 5 uses CPU cores [5] [2024-03-21 10:07:03,982][11291] Worker 4 uses CPU cores [4] [2024-03-21 10:07:03,995][12070] Worker 36 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,014][11266] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:07:04,014][11266] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-03-21 10:07:04,025][11266] Num visible devices: 1 [2024-03-21 10:07:04,062][11917] Worker 35 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 
10:07:04,078][11773] Worker 21 uses CPU cores [21] [2024-03-21 10:07:04,086][11289] Worker 2 uses CPU cores [2] [2024-03-21 10:07:04,095][11266] Starting seed is not provided [2024-03-21 10:07:04,095][11266] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:07:04,095][11266] Initializing actor-critic model on device cuda:0 [2024-03-21 10:07:04,095][11266] RunningMeanStd input shape: (20,) [2024-03-21 10:07:04,095][11266] RunningMeanStd input shape: (24, 11, 11) [2024-03-21 10:07:04,096][11266] RunningMeanStd input shape: (1, 11, 11) [2024-03-21 10:07:04,096][11266] RunningMeanStd input shape: (2,) [2024-03-21 10:07:04,096][11266] RunningMeanStd input shape: (1,) [2024-03-21 10:07:04,096][11266] RunningMeanStd input shape: (1,) [2024-03-21 10:07:04,096][11297] Worker 10 uses CPU cores [10] [2024-03-21 10:07:04,115][11843] Worker 29 uses CPU cores [29] [2024-03-21 10:07:04,145][11295] Worker 8 uses CPU cores [8] [2024-03-21 10:07:04,158][11809] Worker 26 uses CPU cores [26] [2024-03-21 10:07:04,180][12321] Worker 42 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,208][11455] Worker 16 uses CPU cores [16] [2024-03-21 10:07:04,229][11307] Worker 12 uses CPU cores [12] [2024-03-21 10:07:04,230][11845] Worker 31 uses CPU cores [31] [2024-03-21 10:07:04,233][11807] Worker 24 uses CPU cores [24] [2024-03-21 10:07:04,246][11975] Worker 38 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,281][11806] Worker 23 uses CPU cores [23] [2024-03-21 10:07:04,282][11844] Worker 30 uses CPU cores [30] [2024-03-21 10:07:04,328][12576] Worker 48 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,408][12197] Worker 39 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,409][12038] Worker 41 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,419][11616] Worker 15 uses CPU cores [15] [2024-03-21 10:07:04,442][11552] Worker 17 uses CPU cores [17] [2024-03-21 10:07:04,464][11266] Created Actor Critic model with architecture: [2024-03-21 10:07:04,464][11266] PredictingActorCritic( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (global_vars): RunningMeanStdInPlace() (griddly_obs): RunningMeanStdInPlace() (kinship): RunningMeanStdInPlace() (last_action): RunningMeanStdInPlace() (last_reward): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): GriddlyEncoder( (object_embedding): Sequential( (0): Linear(in_features=52, out_features=64, bias=True) (1): ELU(alpha=1.0) (2): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) (3): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) (4): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) ) (encoder_head): Sequential( (0): Linear(in_features=7767, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (3): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (4): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (5): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): GriddlyDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, 
out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) [2024-03-21 10:07:04,509][12416] Worker 44 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,525][11886] Worker 33 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,528][11841] Worker 27 uses CPU cores [27] [2024-03-21 10:07:04,574][12479] Worker 45 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,596][11741] Worker 19 uses CPU cores [19] [2024-03-21 10:07:04,614][11520] Worker 20 uses CPU cores [20] [2024-03-21 10:07:04,663][11298] Worker 11 uses CPU cores [11] [2024-03-21 10:07:04,665][11287] Worker 0 uses CPU cores [0] [2024-03-21 10:07:04,665][11266] Using optimizer [2024-03-21 10:07:04,679][12356] Worker 43 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:07:04,684][11286] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:07:04,684][11286] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-03-21 10:07:04,692][11286] Num visible devices: 1 [2024-03-21 10:07:04,886][11266] Loading state from checkpoint /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052905_1733591040.pth... [2024-03-21 10:07:04,908][11266] Loading model from checkpoint [2024-03-21 10:07:04,909][11266] Loaded experiment state at self.train_step=52905, self.env_steps=1733591040 [2024-03-21 10:07:04,910][11266] Initialized policy 0 weights for model version 52905 [2024-03-21 10:07:04,910][11266] LearnerWorker_p0 finished initialization! 
[2024-03-21 10:07:04,911][11266] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:07:04,960][11286] RunningMeanStd input shape: (20,) [2024-03-21 10:07:04,961][11286] RunningMeanStd input shape: (24, 11, 11) [2024-03-21 10:07:04,961][11286] RunningMeanStd input shape: (1, 11, 11) [2024-03-21 10:07:04,961][11286] RunningMeanStd input shape: (2,) [2024-03-21 10:07:04,961][11286] RunningMeanStd input shape: (1,) [2024-03-21 10:07:04,961][11286] RunningMeanStd input shape: (1,) [2024-03-21 10:07:05,228][11055] Inference worker 0-0 is ready! [2024-03-21 10:07:05,228][11055] All inference workers are ready! Signal rollout workers to start! [2024-03-21 10:07:08,152][11055] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 1733591040. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:13,153][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:13,610][11520] Decorrelating experience for 0 frames... [2024-03-21 10:07:13,688][11741] Decorrelating experience for 0 frames... [2024-03-21 10:07:13,785][11842] Decorrelating experience for 0 frames... [2024-03-21 10:07:14,618][11584] Decorrelating experience for 0 frames... [2024-03-21 10:07:16,035][11843] Decorrelating experience for 0 frames... [2024-03-21 10:07:16,420][11806] Decorrelating experience for 0 frames... [2024-03-21 10:07:17,954][11773] Decorrelating experience for 0 frames... [2024-03-21 10:07:18,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:19,107][11297] Decorrelating experience for 0 frames... [2024-03-21 10:07:19,740][11293] Decorrelating experience for 0 frames... [2024-03-21 10:07:20,029][11552] Decorrelating experience for 0 frames... 
[2024-03-21 10:07:20,038][11055] Heartbeat connected on Batcher_0 [2024-03-21 10:07:20,039][11055] Heartbeat connected on LearnerWorker_p0 [2024-03-21 10:07:20,080][11055] Heartbeat connected on InferenceWorker_p0-w0 [2024-03-21 10:07:20,112][11292] Decorrelating experience for 0 frames... [2024-03-21 10:07:20,122][11455] Decorrelating experience for 0 frames... [2024-03-21 10:07:20,259][11845] Decorrelating experience for 0 frames... [2024-03-21 10:07:20,299][11298] Decorrelating experience for 0 frames... [2024-03-21 10:07:20,534][11294] Decorrelating experience for 0 frames... [2024-03-21 10:07:20,955][11289] Decorrelating experience for 0 frames... [2024-03-21 10:07:20,959][11290] Decorrelating experience for 0 frames... [2024-03-21 10:07:21,209][11332] Decorrelating experience for 0 frames... [2024-03-21 10:07:21,867][11841] Decorrelating experience for 0 frames... [2024-03-21 10:07:22,429][11314] Decorrelating experience for 0 frames... [2024-03-21 10:07:22,562][11291] Decorrelating experience for 0 frames... [2024-03-21 10:07:22,772][11307] Decorrelating experience for 0 frames... [2024-03-21 10:07:22,925][11296] Decorrelating experience for 0 frames... [2024-03-21 10:07:23,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:23,225][11616] Decorrelating experience for 0 frames... [2024-03-21 10:07:23,449][11520] Decorrelating experience for 256 frames... [2024-03-21 10:07:23,946][11295] Decorrelating experience for 0 frames... [2024-03-21 10:07:24,039][11808] Decorrelating experience for 0 frames... [2024-03-21 10:07:24,329][11807] Decorrelating experience for 0 frames... [2024-03-21 10:07:24,716][11844] Decorrelating experience for 0 frames... [2024-03-21 10:07:25,147][11741] Decorrelating experience for 256 frames... [2024-03-21 10:07:25,795][11288] Decorrelating experience for 0 frames... 
[2024-03-21 10:07:26,725][11774] Decorrelating experience for 0 frames... [2024-03-21 10:07:26,778][11287] Decorrelating experience for 0 frames... [2024-03-21 10:07:27,203][11809] Decorrelating experience for 0 frames... [2024-03-21 10:07:28,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:28,327][11842] Decorrelating experience for 256 frames... [2024-03-21 10:07:28,981][11552] Decorrelating experience for 256 frames... [2024-03-21 10:07:29,525][12038] Decorrelating experience for 0 frames... [2024-03-21 10:07:29,757][11773] Decorrelating experience for 256 frames... [2024-03-21 10:07:29,815][12070] Decorrelating experience for 0 frames... [2024-03-21 10:07:30,117][11890] Decorrelating experience for 0 frames... [2024-03-21 10:07:31,067][12321] Decorrelating experience for 0 frames... [2024-03-21 10:07:31,094][12479] Decorrelating experience for 0 frames... [2024-03-21 10:07:32,281][12574] Decorrelating experience for 0 frames... [2024-03-21 10:07:32,583][11584] Decorrelating experience for 256 frames... [2024-03-21 10:07:32,615][12575] Decorrelating experience for 0 frames... [2024-03-21 10:07:32,884][11808] Decorrelating experience for 256 frames... [2024-03-21 10:07:33,051][12576] Decorrelating experience for 0 frames... [2024-03-21 10:07:33,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:33,362][12072] Decorrelating experience for 0 frames... [2024-03-21 10:07:33,779][11055] Heartbeat connected on RolloutWorker_w19 [2024-03-21 10:07:33,982][11845] Decorrelating experience for 256 frames... [2024-03-21 10:07:34,293][12480] Decorrelating experience for 0 frames... 
[2024-03-21 10:07:34,349][11055] Heartbeat connected on RolloutWorker_w20 [2024-03-21 10:07:35,292][11455] Decorrelating experience for 256 frames... [2024-03-21 10:07:35,803][11886] Decorrelating experience for 0 frames... [2024-03-21 10:07:36,388][11806] Decorrelating experience for 256 frames... [2024-03-21 10:07:36,474][11975] Decorrelating experience for 0 frames... [2024-03-21 10:07:36,662][12416] Decorrelating experience for 0 frames... [2024-03-21 10:07:37,076][11297] Decorrelating experience for 256 frames... [2024-03-21 10:07:37,213][11055] Heartbeat connected on RolloutWorker_w17 [2024-03-21 10:07:37,246][11293] Decorrelating experience for 256 frames... [2024-03-21 10:07:37,460][11298] Decorrelating experience for 256 frames... [2024-03-21 10:07:37,558][11942] Decorrelating experience for 0 frames... [2024-03-21 10:07:37,582][12356] Decorrelating experience for 0 frames... [2024-03-21 10:07:37,631][11917] Decorrelating experience for 0 frames... [2024-03-21 10:07:38,046][11974] Decorrelating experience for 0 frames... [2024-03-21 10:07:38,118][12197] Decorrelating experience for 0 frames... [2024-03-21 10:07:38,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 140.0. Samples: 4200. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:38,160][11055] Heartbeat connected on RolloutWorker_w21 [2024-03-21 10:07:38,223][11843] Decorrelating experience for 256 frames... [2024-03-21 10:07:38,231][11294] Decorrelating experience for 256 frames... [2024-03-21 10:07:39,019][11292] Decorrelating experience for 256 frames... [2024-03-21 10:07:39,095][11807] Decorrelating experience for 256 frames... [2024-03-21 10:07:39,579][11055] Heartbeat connected on RolloutWorker_w28 [2024-03-21 10:07:40,289][11296] Decorrelating experience for 256 frames... [2024-03-21 10:07:40,337][11290] Decorrelating experience for 256 frames... [2024-03-21 10:07:40,684][11289] Decorrelating experience for 256 frames... 
[2024-03-21 10:07:41,293][11055] Heartbeat connected on RolloutWorker_w25 [2024-03-21 10:07:41,342][11295] Decorrelating experience for 256 frames... [2024-03-21 10:07:42,418][11314] Decorrelating experience for 256 frames... [2024-03-21 10:07:42,455][11332] Decorrelating experience for 256 frames... [2024-03-21 10:07:42,486][11841] Decorrelating experience for 256 frames... [2024-03-21 10:07:42,662][11616] Decorrelating experience for 256 frames... [2024-03-21 10:07:42,869][11307] Decorrelating experience for 256 frames... [2024-03-21 10:07:43,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 1148.6. Samples: 40200. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:43,573][11055] Heartbeat connected on RolloutWorker_w16 [2024-03-21 10:07:43,666][11288] Decorrelating experience for 256 frames... [2024-03-21 10:07:43,986][11291] Decorrelating experience for 256 frames... [2024-03-21 10:07:44,250][11287] Decorrelating experience for 256 frames... [2024-03-21 10:07:45,571][11809] Decorrelating experience for 256 frames... [2024-03-21 10:07:46,235][11055] Heartbeat connected on RolloutWorker_w31 [2024-03-21 10:07:46,344][11774] Decorrelating experience for 256 frames... [2024-03-21 10:07:46,941][11055] Heartbeat connected on RolloutWorker_w10 [2024-03-21 10:07:47,127][11055] Heartbeat connected on RolloutWorker_w18 [2024-03-21 10:07:47,258][11844] Decorrelating experience for 256 frames... [2024-03-21 10:07:47,334][11055] Heartbeat connected on RolloutWorker_w11 [2024-03-21 10:07:47,541][11055] Heartbeat connected on RolloutWorker_w24 [2024-03-21 10:07:48,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 2527.5. Samples: 101100. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:49,502][11055] Heartbeat connected on RolloutWorker_w23 [2024-03-21 10:07:50,949][11055] Heartbeat connected on RolloutWorker_w9 [2024-03-21 10:07:51,888][11055] Heartbeat connected on RolloutWorker_w3 [2024-03-21 10:07:52,149][11055] Heartbeat connected on RolloutWorker_w2 [2024-03-21 10:07:52,230][11055] Heartbeat connected on RolloutWorker_w7 [2024-03-21 10:07:52,296][11890] Decorrelating experience for 256 frames... [2024-03-21 10:07:52,397][11055] Heartbeat connected on RolloutWorker_w8 [2024-03-21 10:07:52,442][11055] Heartbeat connected on RolloutWorker_w29 [2024-03-21 10:07:52,603][11055] Heartbeat connected on RolloutWorker_w13 [2024-03-21 10:07:52,998][11055] Heartbeat connected on RolloutWorker_w12 [2024-03-21 10:07:53,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 3526.7. Samples: 158700. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:53,719][11055] Heartbeat connected on RolloutWorker_w14 [2024-03-21 10:07:53,844][11055] Heartbeat connected on RolloutWorker_w6 [2024-03-21 10:07:54,321][11055] Heartbeat connected on RolloutWorker_w15 [2024-03-21 10:07:54,687][11055] Heartbeat connected on RolloutWorker_w5 [2024-03-21 10:07:55,701][12574] Decorrelating experience for 256 frames... [2024-03-21 10:07:55,769][11055] Heartbeat connected on RolloutWorker_w0 [2024-03-21 10:07:55,901][11055] Heartbeat connected on RolloutWorker_w1 [2024-03-21 10:07:55,954][12070] Decorrelating experience for 256 frames... [2024-03-21 10:07:56,127][12038] Decorrelating experience for 256 frames... [2024-03-21 10:07:56,397][11055] Heartbeat connected on RolloutWorker_w27 [2024-03-21 10:07:56,893][11055] Heartbeat connected on RolloutWorker_w4 [2024-03-21 10:07:57,649][11055] Heartbeat connected on RolloutWorker_w26 [2024-03-21 10:07:58,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. 
Throughput: 0: 7466.7. Samples: 336000. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:07:58,219][11055] Heartbeat connected on RolloutWorker_w22 [2024-03-21 10:07:58,807][12321] Decorrelating experience for 256 frames... [2024-03-21 10:07:59,444][12480] Decorrelating experience for 256 frames... [2024-03-21 10:07:59,507][11974] Decorrelating experience for 256 frames... [2024-03-21 10:07:59,863][12575] Decorrelating experience for 256 frames... [2024-03-21 10:08:00,733][11975] Decorrelating experience for 256 frames... [2024-03-21 10:08:00,884][11520] Worker 20, sleep for 60.000 sec to decorrelate experience collection [2024-03-21 10:08:01,232][11055] Heartbeat connected on RolloutWorker_w30 [2024-03-21 10:08:03,152][11055] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1733591040. Throughput: 0: 13253.4. Samples: 596400. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:08:03,298][12197] Decorrelating experience for 256 frames... [2024-03-21 10:08:04,306][12576] Decorrelating experience for 256 frames... [2024-03-21 10:08:04,356][12479] Decorrelating experience for 256 frames... [2024-03-21 10:08:04,503][11842] Worker 28, sleep for 84.000 sec to decorrelate experience collection [2024-03-21 10:08:04,561][12072] Decorrelating experience for 256 frames... [2024-03-21 10:08:05,585][11886] Decorrelating experience for 256 frames... [2024-03-21 10:08:08,152][11055] Fps is (10 sec: 3276.8, 60 sec: 546.1, 300 sec: 546.1). Total num frames: 1733623808. Throughput: 0: 16200.0. Samples: 729000. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:08:08,534][11055] Heartbeat connected on RolloutWorker_w46 [2024-03-21 10:08:09,031][11917] Decorrelating experience for 256 frames... 
[2024-03-21 10:08:09,797][11055] Heartbeat connected on RolloutWorker_w34 [2024-03-21 10:08:09,994][11773] Worker 21, sleep for 63.000 sec to decorrelate experience collection [2024-03-21 10:08:10,028][11055] Heartbeat connected on RolloutWorker_w36 [2024-03-21 10:08:10,083][12356] Decorrelating experience for 256 frames... [2024-03-21 10:08:10,376][12416] Decorrelating experience for 256 frames... [2024-03-21 10:08:11,297][11942] Decorrelating experience for 256 frames... [2024-03-21 10:08:11,470][11741] Worker 19, sleep for 57.000 sec to decorrelate experience collection [2024-03-21 10:08:11,483][11055] Heartbeat connected on RolloutWorker_w47 [2024-03-21 10:08:12,876][11055] Heartbeat connected on RolloutWorker_w41 [2024-03-21 10:08:13,152][11055] Fps is (10 sec: 9830.3, 60 sec: 1638.4, 300 sec: 1512.4). Total num frames: 1733689344. Throughput: 0: 22171.1. Samples: 997700. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:08:14,655][11055] Heartbeat connected on RolloutWorker_w37 [2024-03-21 10:08:15,631][11055] Heartbeat connected on RolloutWorker_w42 [2024-03-21 10:08:15,702][11055] Heartbeat connected on RolloutWorker_w39 [2024-03-21 10:08:16,231][11055] Heartbeat connected on RolloutWorker_w49 [2024-03-21 10:08:16,715][11552] Worker 17, sleep for 51.000 sec to decorrelate experience collection [2024-03-21 10:08:17,281][11055] Heartbeat connected on RolloutWorker_w38 [2024-03-21 10:08:17,934][11807] Worker 24, sleep for 72.000 sec to decorrelate experience collection [2024-03-21 10:08:18,152][11055] Fps is (10 sec: 6553.6, 60 sec: 1638.4, 300 sec: 1404.3). Total num frames: 1733689344. Throughput: 0: 28637.8. Samples: 1288700. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:08:18,532][11055] Heartbeat connected on RolloutWorker_w48 [2024-03-21 10:08:19,867][11055] Heartbeat connected on RolloutWorker_w33 [2024-03-21 10:08:20,191][11055] Heartbeat connected on RolloutWorker_w45 [2024-03-21 10:08:20,348][11055] Heartbeat connected on RolloutWorker_w40 [2024-03-21 10:08:20,581][11808] Worker 25, sleep for 75.000 sec to decorrelate experience collection [2024-03-21 10:08:20,726][11298] Worker 11, sleep for 33.000 sec to decorrelate experience collection [2024-03-21 10:08:20,780][11266] Signal inference workers to stop experience collection... [2024-03-21 10:08:20,800][11584] Worker 18, sleep for 54.000 sec to decorrelate experience collection [2024-03-21 10:08:20,840][11286] InferenceWorker_p0-w0: stopping experience collection [2024-03-21 10:08:21,015][11266] Signal inference workers to resume experience collection... [2024-03-21 10:08:21,015][11286] InferenceWorker_p0-w0: resuming experience collection [2024-03-21 10:08:21,181][11845] Worker 31, sleep for 93.000 sec to decorrelate experience collection [2024-03-21 10:08:21,249][11055] Heartbeat connected on RolloutWorker_w43 [2024-03-21 10:08:21,334][11055] Heartbeat connected on RolloutWorker_w35 [2024-03-21 10:08:21,730][11806] Worker 23, sleep for 69.000 sec to decorrelate experience collection [2024-03-21 10:08:22,071][11455] Worker 16, sleep for 48.000 sec to decorrelate experience collection [2024-03-21 10:08:22,917][11843] Worker 29, sleep for 87.000 sec to decorrelate experience collection [2024-03-21 10:08:23,152][11055] Fps is (10 sec: 19660.8, 60 sec: 4915.2, 300 sec: 3932.2). Total num frames: 1733885952. Throughput: 0: 31993.3. Samples: 1443900. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:08:23,306][11055] Heartbeat connected on RolloutWorker_w32 [2024-03-21 10:08:23,426][11055] Heartbeat connected on RolloutWorker_w44 [2024-03-21 10:08:23,431][11286] Updated weights for policy 0, policy_version 52915 (0.0010) [2024-03-21 10:08:24,255][11297] Worker 10, sleep for 30.000 sec to decorrelate experience collection [2024-03-21 10:08:27,307][11841] Worker 27, sleep for 81.000 sec to decorrelate experience collection [2024-03-21 10:08:27,470][11290] Worker 3, sleep for 9.000 sec to decorrelate experience collection [2024-03-21 10:08:27,909][11288] Worker 1, sleep for 3.000 sec to decorrelate experience collection [2024-03-21 10:08:28,129][11774] Worker 22, sleep for 66.000 sec to decorrelate experience collection [2024-03-21 10:08:28,152][11055] Fps is (10 sec: 29491.2, 60 sec: 6553.6, 300 sec: 4915.2). Total num frames: 1733984256. Throughput: 0: 38613.4. Samples: 1777800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:08:28,244][11289] Worker 2, sleep for 6.000 sec to decorrelate experience collection [2024-03-21 10:08:28,308][11332] Worker 14, sleep for 42.000 sec to decorrelate experience collection [2024-03-21 10:08:28,310][11616] Worker 15, sleep for 45.000 sec to decorrelate experience collection [2024-03-21 10:08:29,175][11844] Worker 30, sleep for 90.000 sec to decorrelate experience collection [2024-03-21 10:08:29,480][11296] Worker 9, sleep for 27.000 sec to decorrelate experience collection [2024-03-21 10:08:29,656][11314] Worker 13, sleep for 39.000 sec to decorrelate experience collection [2024-03-21 10:08:30,813][11809] Worker 26, sleep for 78.000 sec to decorrelate experience collection [2024-03-21 10:08:30,922][11288] Worker 1 awakens! 
[2024-03-21 10:08:30,926][11286] Updated weights for policy 0, policy_version 52925 (0.0009) [2024-03-21 10:08:31,247][11295] Worker 8, sleep for 24.000 sec to decorrelate experience collection [2024-03-21 10:08:31,366][11294] Worker 5, sleep for 15.000 sec to decorrelate experience collection [2024-03-21 10:08:31,487][11307] Worker 12, sleep for 36.000 sec to decorrelate experience collection [2024-03-21 10:08:32,483][11293] Worker 7, sleep for 21.000 sec to decorrelate experience collection [2024-03-21 10:08:33,152][11055] Fps is (10 sec: 52429.4, 60 sec: 13653.4, 300 sec: 9637.7). Total num frames: 1734410240. Throughput: 0: 43322.3. Samples: 2050600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:08:33,279][11292] Worker 6, sleep for 18.000 sec to decorrelate experience collection [2024-03-21 10:08:34,278][11289] Worker 2 awakens! [2024-03-21 10:08:34,453][11291] Worker 4, sleep for 12.000 sec to decorrelate experience collection [2024-03-21 10:08:36,514][11290] Worker 3 awakens! [2024-03-21 10:08:38,152][11055] Fps is (10 sec: 52429.6, 60 sec: 15291.8, 300 sec: 10194.5). Total num frames: 1734508544. Throughput: 0: 45755.6. Samples: 2217700. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:08:38,949][12480] Worker 46, sleep for 138.000 sec to decorrelate experience collection [2024-03-21 10:08:39,925][12070] Worker 36, sleep for 108.000 sec to decorrelate experience collection [2024-03-21 10:08:40,242][11890] Worker 34, sleep for 102.000 sec to decorrelate experience collection [2024-03-21 10:08:40,441][11974] Worker 37, sleep for 111.000 sec to decorrelate experience collection [2024-03-21 10:08:40,606][11286] Updated weights for policy 0, policy_version 52935 (0.0011) [2024-03-21 10:08:40,745][12574] Worker 47, sleep for 141.000 sec to decorrelate experience collection [2024-03-21 10:08:40,939][12038] Worker 41, sleep for 123.000 sec to decorrelate experience collection [2024-03-21 10:08:41,324][12197] Worker 39, sleep for 117.000 sec to decorrelate experience collection [2024-03-21 10:08:42,865][12575] Worker 49, sleep for 147.000 sec to decorrelate experience collection [2024-03-21 10:08:43,038][12321] Worker 42, sleep for 126.000 sec to decorrelate experience collection [2024-03-21 10:08:43,152][11055] Fps is (10 sec: 26214.3, 60 sec: 18022.4, 300 sec: 11382.6). Total num frames: 1734672384. Throughput: 0: 49064.5. Samples: 2543900. 
Policy #0 lag: (min: 0.0, avg: 15.2, max: 28.0) [2024-03-21 10:08:43,282][12576] Worker 48, sleep for 144.000 sec to decorrelate experience collection [2024-03-21 10:08:43,674][12072] Worker 40, sleep for 120.000 sec to decorrelate experience collection [2024-03-21 10:08:44,138][12356] Worker 43, sleep for 129.000 sec to decorrelate experience collection [2024-03-21 10:08:44,230][11975] Worker 38, sleep for 114.000 sec to decorrelate experience collection [2024-03-21 10:08:44,515][11886] Worker 33, sleep for 99.000 sec to decorrelate experience collection [2024-03-21 10:08:44,661][12479] Worker 45, sleep for 135.000 sec to decorrelate experience collection [2024-03-21 10:08:44,974][11942] Worker 32, sleep for 96.000 sec to decorrelate experience collection [2024-03-21 10:08:45,067][11917] Worker 35, sleep for 105.000 sec to decorrelate experience collection [2024-03-21 10:08:45,283][11286] Updated weights for policy 0, policy_version 52945 (0.0009) [2024-03-21 10:08:45,971][12416] Worker 44, sleep for 132.000 sec to decorrelate experience collection [2024-03-21 10:08:46,442][11294] Worker 5 awakens! [2024-03-21 10:08:46,513][11291] Worker 4 awakens! [2024-03-21 10:08:48,152][11055] Fps is (10 sec: 49152.0, 60 sec: 23483.8, 300 sec: 14090.3). Total num frames: 1735000064. Throughput: 0: 47217.9. Samples: 2721200. Policy #0 lag: (min: 0.0, avg: 15.2, max: 28.0) [2024-03-21 10:08:51,370][11292] Worker 6 awakens! [2024-03-21 10:08:53,152][11055] Fps is (10 sec: 52428.3, 60 sec: 26760.5, 300 sec: 15291.7). Total num frames: 1735196672. Throughput: 0: 45715.5. Samples: 2786200. Policy #0 lag: (min: 0.0, avg: 15.2, max: 28.0) [2024-03-21 10:08:53,311][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052955_1735229440.pth... 
[2024-03-21 10:08:53,317][11286] Updated weights for policy 0, policy_version 52955 (0.0006) [2024-03-21 10:08:53,371][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052732_1727922176.pth [2024-03-21 10:08:53,582][11293] Worker 7 awakens! [2024-03-21 10:08:53,826][11298] Worker 11 awakens! [2024-03-21 10:08:54,295][11297] Worker 10 awakens! [2024-03-21 10:08:55,354][11295] Worker 8 awakens! [2024-03-21 10:08:56,577][11296] Worker 9 awakens! [2024-03-21 10:08:58,152][11055] Fps is (10 sec: 36044.6, 60 sec: 29491.2, 300 sec: 16086.1). Total num frames: 1735360512. Throughput: 0: 43437.8. Samples: 2952400. Policy #0 lag: (min: 0.0, avg: 15.2, max: 28.0) [2024-03-21 10:09:00,985][11520] Worker 20 awakens! [2024-03-21 10:09:03,152][11055] Fps is (10 sec: 29491.8, 60 sec: 31675.8, 300 sec: 16526.5). Total num frames: 1735491584. Throughput: 0: 41642.4. Samples: 3162600. Policy #0 lag: (min: 0.0, avg: 15.2, max: 28.0) [2024-03-21 10:09:07,042][11286] Updated weights for policy 0, policy_version 52965 (0.0007) [2024-03-21 10:09:07,533][11307] Worker 12 awakens! [2024-03-21 10:09:07,753][11552] Worker 17 awakens! [2024-03-21 10:09:08,152][11055] Fps is (10 sec: 22937.6, 60 sec: 32768.0, 300 sec: 16657.1). Total num frames: 1735589888. Throughput: 0: 40615.6. Samples: 3271600. Policy #0 lag: (min: 0.0, avg: 15.2, max: 28.0) [2024-03-21 10:09:08,160][11055] Avg episode reward: [(0, '0.902')] [2024-03-21 10:09:08,571][11741] Worker 19 awakens! [2024-03-21 10:09:08,757][11314] Worker 13 awakens! [2024-03-21 10:09:10,170][11455] Worker 16 awakens! [2024-03-21 10:09:10,410][11332] Worker 14 awakens! [2024-03-21 10:09:12,340][11286] Updated weights for policy 0, policy_version 52975 (0.0010) [2024-03-21 10:09:13,003][11773] Worker 21 awakens! [2024-03-21 10:09:13,152][11055] Fps is (10 sec: 42597.1, 60 sec: 37137.0, 300 sec: 18612.2). Total num frames: 1735917568. Throughput: 0: 37439.9. Samples: 3462600. 
Policy #0 lag: (min: 0.0, avg: 15.2, max: 28.0) [2024-03-21 10:09:13,153][11055] Avg episode reward: [(0, '0.993')] [2024-03-21 10:09:13,410][11616] Worker 15 awakens! [2024-03-21 10:09:14,811][11584] Worker 18 awakens! [2024-03-21 10:09:18,152][11055] Fps is (10 sec: 58981.9, 60 sec: 41506.1, 300 sec: 19912.9). Total num frames: 1736179712. Throughput: 0: 36897.7. Samples: 3711000. Policy #0 lag: (min: 0.0, avg: 58.3, max: 71.0) [2024-03-21 10:09:18,153][11055] Avg episode reward: [(0, '0.993')] [2024-03-21 10:09:18,545][11286] Updated weights for policy 0, policy_version 52985 (0.0015) [2024-03-21 10:09:23,152][11055] Fps is (10 sec: 32767.8, 60 sec: 39321.5, 300 sec: 19660.8). Total num frames: 1736245248. Throughput: 0: 36428.7. Samples: 3857000. Policy #0 lag: (min: 0.0, avg: 58.3, max: 71.0) [2024-03-21 10:09:23,153][11055] Avg episode reward: [(0, '0.993')] [2024-03-21 10:09:28,152][11055] Fps is (10 sec: 19660.8, 60 sec: 39867.7, 300 sec: 19894.9). Total num frames: 1736376320. Throughput: 0: 35935.4. Samples: 4161000. Policy #0 lag: (min: 0.0, avg: 58.3, max: 71.0) [2024-03-21 10:09:28,153][11055] Avg episode reward: [(0, '1.180')] [2024-03-21 10:09:28,604][11842] Worker 28 awakens! [2024-03-21 10:09:29,967][11807] Worker 24 awakens! [2024-03-21 10:09:30,831][11806] Worker 23 awakens! [2024-03-21 10:09:31,826][11286] Updated weights for policy 0, policy_version 52995 (0.0011) [2024-03-21 10:09:33,153][11055] Fps is (10 sec: 39321.7, 60 sec: 37136.9, 300 sec: 21016.7). Total num frames: 1736638464. Throughput: 0: 37919.8. Samples: 4427600. Policy #0 lag: (min: 0.0, avg: 58.3, max: 71.0) [2024-03-21 10:09:33,153][11055] Avg episode reward: [(0, '0.920')] [2024-03-21 10:09:34,227][11774] Worker 22 awakens! [2024-03-21 10:09:35,682][11808] Worker 25 awakens! [2024-03-21 10:09:37,095][11266] Signal inference workers to stop experience collection... (50 times) [2024-03-21 10:09:37,165][11266] Signal inference workers to resume experience collection... 
(50 times) [2024-03-21 10:09:37,182][11286] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-03-21 10:09:37,184][11286] Updated weights for policy 0, policy_version 53005 (0.0012) [2024-03-21 10:09:37,234][11286] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-03-21 10:09:38,152][11055] Fps is (10 sec: 58983.8, 60 sec: 40960.0, 300 sec: 22500.7). Total num frames: 1736966144. Throughput: 0: 38935.7. Samples: 4538300. Policy #0 lag: (min: 0.0, avg: 58.3, max: 71.0) [2024-03-21 10:09:38,152][11055] Avg episode reward: [(0, '0.953')] [2024-03-21 10:09:40,826][11286] Updated weights for policy 0, policy_version 53015 (0.0046) [2024-03-21 10:09:43,152][11055] Fps is (10 sec: 75368.3, 60 sec: 45329.1, 300 sec: 24523.2). Total num frames: 1737392128. Throughput: 0: 40620.0. Samples: 4780300. Policy #0 lag: (min: 0.0, avg: 58.3, max: 71.0) [2024-03-21 10:09:43,153][11055] Avg episode reward: [(0, '0.597')] [2024-03-21 10:09:48,152][11055] Fps is (10 sec: 52427.8, 60 sec: 41506.0, 300 sec: 24371.2). Total num frames: 1737490432. Throughput: 0: 41966.5. Samples: 5051100. Policy #0 lag: (min: 0.0, avg: 58.3, max: 71.0) [2024-03-21 10:09:48,153][11055] Avg episode reward: [(0, '0.489')] [2024-03-21 10:09:48,408][11841] Worker 27 awakens! [2024-03-21 10:09:48,814][11286] Updated weights for policy 0, policy_version 53025 (0.0011) [2024-03-21 10:09:48,914][11809] Worker 26 awakens! [2024-03-21 10:09:50,018][11843] Worker 29 awakens! [2024-03-21 10:09:53,152][11055] Fps is (10 sec: 16383.9, 60 sec: 39321.6, 300 sec: 24029.9). Total num frames: 1737555968. Throughput: 0: 43262.2. Samples: 5218400. Policy #0 lag: (min: 2.0, avg: 69.2, max: 119.0) [2024-03-21 10:09:53,153][11055] Avg episode reward: [(0, '0.549')] [2024-03-21 10:09:54,233][11845] Worker 31 awakens! [2024-03-21 10:09:58,152][11055] Fps is (10 sec: 19660.6, 60 sec: 38775.3, 300 sec: 24094.1). Total num frames: 1737687040. Throughput: 0: 46557.8. Samples: 5557700. 
Policy #0 lag: (min: 2.0, avg: 69.2, max: 119.0) [2024-03-21 10:09:58,155][11055] Avg episode reward: [(0, '1.429')] [2024-03-21 10:09:59,238][11844] Worker 30 awakens! [2024-03-21 10:10:01,033][11286] Updated weights for policy 0, policy_version 53035 (0.0011) [2024-03-21 10:10:03,152][11055] Fps is (10 sec: 42598.4, 60 sec: 41506.0, 300 sec: 25090.9). Total num frames: 1737981952. Throughput: 0: 47673.4. Samples: 5856300. Policy #0 lag: (min: 2.0, avg: 69.2, max: 119.0) [2024-03-21 10:10:03,153][11055] Avg episode reward: [(0, '0.572')] [2024-03-21 10:10:05,061][11286] Updated weights for policy 0, policy_version 53045 (0.0047) [2024-03-21 10:10:08,152][11055] Fps is (10 sec: 65537.0, 60 sec: 45875.2, 300 sec: 26396.5). Total num frames: 1738342400. Throughput: 0: 47146.9. Samples: 5978600. Policy #0 lag: (min: 2.0, avg: 69.2, max: 119.0) [2024-03-21 10:10:08,153][11055] Avg episode reward: [(0, '1.080')] [2024-03-21 10:10:09,517][11286] Updated weights for policy 0, policy_version 53055 (0.0020) [2024-03-21 10:10:13,152][11055] Fps is (10 sec: 75366.2, 60 sec: 46967.6, 300 sec: 27808.5). Total num frames: 1738735616. Throughput: 0: 46469.0. Samples: 6252100. Policy #0 lag: (min: 2.0, avg: 69.2, max: 119.0) [2024-03-21 10:10:13,153][11055] Avg episode reward: [(0, '0.881')] [2024-03-21 10:10:14,178][11286] Updated weights for policy 0, policy_version 53065 (0.0031) [2024-03-21 10:10:18,152][11055] Fps is (10 sec: 62259.0, 60 sec: 46421.4, 300 sec: 28284.0). Total num frames: 1738964992. Throughput: 0: 47064.6. Samples: 6545500. Policy #0 lag: (min: 2.0, avg: 69.2, max: 119.0) [2024-03-21 10:10:18,153][11055] Avg episode reward: [(0, '0.693')] [2024-03-21 10:10:21,074][11942] Worker 32 awakens! [2024-03-21 10:10:21,640][11286] Updated weights for policy 0, policy_version 53076 (0.0010) [2024-03-21 10:10:22,338][11890] Worker 34 awakens! [2024-03-21 10:10:23,152][11055] Fps is (10 sec: 45875.6, 60 sec: 49152.2, 300 sec: 28735.0). Total num frames: 1739194368. 
Throughput: 0: 47717.7. Samples: 6685600. Policy #0 lag: (min: 2.0, avg: 69.2, max: 119.0) [2024-03-21 10:10:23,161][11055] Avg episode reward: [(0, '0.693')] [2024-03-21 10:10:23,618][11886] Worker 33 awakens! [2024-03-21 10:10:28,026][12070] Worker 36 awakens! [2024-03-21 10:10:28,152][11055] Fps is (10 sec: 32768.0, 60 sec: 48605.9, 300 sec: 28508.2). Total num frames: 1739292672. Throughput: 0: 50155.5. Samples: 7037300. Policy #0 lag: (min: 132.0, avg: 153.7, max: 170.0) [2024-03-21 10:10:28,162][11055] Avg episode reward: [(0, '1.347')] [2024-03-21 10:10:30,166][11917] Worker 35 awakens! [2024-03-21 10:10:30,606][11266] Signal inference workers to stop experience collection... (100 times) [2024-03-21 10:10:30,678][11286] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-03-21 10:10:30,717][11266] Signal inference workers to resume experience collection... (100 times) [2024-03-21 10:10:30,748][11286] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-03-21 10:10:31,542][11974] Worker 37 awakens! [2024-03-21 10:10:32,101][11286] Updated weights for policy 0, policy_version 53086 (0.0009) [2024-03-21 10:10:33,152][11055] Fps is (10 sec: 39321.3, 60 sec: 49152.2, 300 sec: 29251.5). Total num frames: 1739587584. Throughput: 0: 50900.1. Samples: 7341600. Policy #0 lag: (min: 132.0, avg: 153.7, max: 170.0) [2024-03-21 10:10:33,153][11055] Avg episode reward: [(0, '0.937')] [2024-03-21 10:10:38,152][11055] Fps is (10 sec: 52428.9, 60 sec: 47513.5, 300 sec: 29647.3). Total num frames: 1739816960. Throughput: 0: 50686.7. Samples: 7499300. Policy #0 lag: (min: 132.0, avg: 153.7, max: 170.0) [2024-03-21 10:10:38,161][11055] Avg episode reward: [(0, '1.132')] [2024-03-21 10:10:38,329][11975] Worker 38 awakens! [2024-03-21 10:10:38,426][12197] Worker 39 awakens! 
[2024-03-21 10:10:38,881][11286] Updated weights for policy 0, policy_version 53096 (0.0009) [2024-03-21 10:10:43,152][11055] Fps is (10 sec: 49152.3, 60 sec: 44782.9, 300 sec: 30177.1). Total num frames: 1740079104. Throughput: 0: 49289.1. Samples: 7775700. Policy #0 lag: (min: 132.0, avg: 153.7, max: 170.0) [2024-03-21 10:10:43,153][11055] Avg episode reward: [(0, '0.694')] [2024-03-21 10:10:43,774][12072] Worker 40 awakens! [2024-03-21 10:10:43,981][11286] Updated weights for policy 0, policy_version 53106 (0.0031) [2024-03-21 10:10:44,042][12038] Worker 41 awakens! [2024-03-21 10:10:48,152][11055] Fps is (10 sec: 65535.9, 60 sec: 49698.2, 300 sec: 31278.6). Total num frames: 1740472320. Throughput: 0: 48824.4. Samples: 8053400. Policy #0 lag: (min: 132.0, avg: 153.7, max: 170.0) [2024-03-21 10:10:48,153][11055] Avg episode reward: [(0, '0.985')] [2024-03-21 10:10:48,577][11286] Updated weights for policy 0, policy_version 53116 (0.0010) [2024-03-21 10:10:49,138][12321] Worker 42 awakens! [2024-03-21 10:10:51,907][11286] Updated weights for policy 0, policy_version 53126 (0.0025) [2024-03-21 10:10:53,152][11055] Fps is (10 sec: 78642.6, 60 sec: 55159.5, 300 sec: 32331.1). Total num frames: 1740865536. Throughput: 0: 48957.8. Samples: 8181700. Policy #0 lag: (min: 132.0, avg: 153.7, max: 170.0) [2024-03-21 10:10:53,154][11055] Avg episode reward: [(0, '0.388')] [2024-03-21 10:10:53,238][12356] Worker 43 awakens! [2024-03-21 10:10:53,407][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053128_1740898304.pth... [2024-03-21 10:10:53,579][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052905_1733591040.pth [2024-03-21 10:10:57,050][12480] Worker 46 awakens! [2024-03-21 10:10:58,074][12416] Worker 44 awakens! [2024-03-21 10:10:58,152][11055] Fps is (10 sec: 58982.8, 60 sec: 56251.9, 300 sec: 32483.1). Total num frames: 1741062144. Throughput: 0: 49215.6. Samples: 8466800. 
Policy #0 lag: (min: 3.0, avg: 115.8, max: 225.0) [2024-03-21 10:10:58,153][11055] Avg episode reward: [(0, '0.979')] [2024-03-21 10:10:59,758][12479] Worker 45 awakens! [2024-03-21 10:11:00,818][11286] Updated weights for policy 0, policy_version 53136 (0.0024) [2024-03-21 10:11:01,842][12574] Worker 47 awakens! [2024-03-21 10:11:03,152][11055] Fps is (10 sec: 39321.8, 60 sec: 54613.4, 300 sec: 32628.6). Total num frames: 1741258752. Throughput: 0: 50002.3. Samples: 8795600. Policy #0 lag: (min: 3.0, avg: 115.8, max: 225.0) [2024-03-21 10:11:03,153][11055] Avg episode reward: [(0, '0.979')] [2024-03-21 10:11:07,382][12576] Worker 48 awakens! [2024-03-21 10:11:08,153][11055] Fps is (10 sec: 32767.1, 60 sec: 50790.2, 300 sec: 32494.9). Total num frames: 1741389824. Throughput: 0: 50784.1. Samples: 8970900. Policy #0 lag: (min: 3.0, avg: 115.8, max: 225.0) [2024-03-21 10:11:08,153][11055] Avg episode reward: [(0, '0.758')] [2024-03-21 10:11:09,962][12575] Worker 49 awakens! [2024-03-21 10:11:11,562][11286] Updated weights for policy 0, policy_version 53146 (0.0014) [2024-03-21 10:11:13,152][11055] Fps is (10 sec: 29491.2, 60 sec: 46967.5, 300 sec: 32500.5). Total num frames: 1741553664. Throughput: 0: 50211.2. Samples: 9296800. Policy #0 lag: (min: 3.0, avg: 115.8, max: 225.0) [2024-03-21 10:11:13,153][11055] Avg episode reward: [(0, '1.447')] [2024-03-21 10:11:18,152][11055] Fps is (10 sec: 32768.9, 60 sec: 45875.2, 300 sec: 32505.9). Total num frames: 1741717504. Throughput: 0: 50306.7. Samples: 9605400. Policy #0 lag: (min: 3.0, avg: 115.8, max: 225.0) [2024-03-21 10:11:18,153][11055] Avg episode reward: [(0, '1.123')] [2024-03-21 10:11:20,050][11286] Updated weights for policy 0, policy_version 53156 (0.0032) [2024-03-21 10:11:23,152][11055] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 32896.5). Total num frames: 1741979648. Throughput: 0: 50302.3. Samples: 9762900. 
Policy #0 lag: (min: 3.0, avg: 115.8, max: 225.0) [2024-03-21 10:11:23,153][11055] Avg episode reward: [(0, '0.818')] [2024-03-21 10:11:25,446][11266] Signal inference workers to stop experience collection... (150 times) [2024-03-21 10:11:25,501][11286] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-03-21 10:11:25,724][11266] Signal inference workers to resume experience collection... (150 times) [2024-03-21 10:11:25,725][11286] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-03-21 10:11:25,726][11286] Updated weights for policy 0, policy_version 53166 (0.0020) [2024-03-21 10:11:28,152][11055] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 33272.1). Total num frames: 1742241792. Throughput: 0: 50353.3. Samples: 10041600. Policy #0 lag: (min: 3.0, avg: 115.8, max: 225.0) [2024-03-21 10:11:28,153][11055] Avg episode reward: [(0, '0.818')] [2024-03-21 10:11:31,645][11286] Updated weights for policy 0, policy_version 53176 (0.0010) [2024-03-21 10:11:33,152][11055] Fps is (10 sec: 62259.4, 60 sec: 50244.3, 300 sec: 34004.6). Total num frames: 1742602240. Throughput: 0: 50653.4. Samples: 10332800. Policy #0 lag: (min: 1.0, avg: 33.2, max: 66.0) [2024-03-21 10:11:33,153][11055] Avg episode reward: [(0, '0.854')] [2024-03-21 10:11:35,962][11286] Updated weights for policy 0, policy_version 53186 (0.0019) [2024-03-21 10:11:38,152][11055] Fps is (10 sec: 72088.8, 60 sec: 52428.7, 300 sec: 34709.8). Total num frames: 1742962688. Throughput: 0: 50726.6. Samples: 10464400. Policy #0 lag: (min: 1.0, avg: 33.2, max: 66.0) [2024-03-21 10:11:38,154][11055] Avg episode reward: [(0, '0.886')] [2024-03-21 10:11:39,731][11286] Updated weights for policy 0, policy_version 53196 (0.0049) [2024-03-21 10:11:43,152][11055] Fps is (10 sec: 68812.7, 60 sec: 53521.1, 300 sec: 35270.3). Total num frames: 1743290368. Throughput: 0: 50280.0. Samples: 10729400. 
Policy #0 lag: (min: 1.0, avg: 33.2, max: 66.0) [2024-03-21 10:11:43,153][11055] Avg episode reward: [(0, '0.468')] [2024-03-21 10:11:46,532][11286] Updated weights for policy 0, policy_version 53206 (0.0013) [2024-03-21 10:11:48,152][11055] Fps is (10 sec: 55706.1, 60 sec: 50790.4, 300 sec: 35459.7). Total num frames: 1743519744. Throughput: 0: 50259.9. Samples: 11057300. Policy #0 lag: (min: 1.0, avg: 33.2, max: 66.0) [2024-03-21 10:11:48,153][11055] Avg episode reward: [(0, '0.468')] [2024-03-21 10:11:53,152][11055] Fps is (10 sec: 39320.9, 60 sec: 46967.4, 300 sec: 35412.4). Total num frames: 1743683584. Throughput: 0: 50211.2. Samples: 11230400. Policy #0 lag: (min: 1.0, avg: 33.2, max: 66.0) [2024-03-21 10:11:53,153][11055] Avg episode reward: [(0, '1.428')] [2024-03-21 10:11:54,058][11286] Updated weights for policy 0, policy_version 53216 (0.0020) [2024-03-21 10:11:58,152][11055] Fps is (10 sec: 39322.0, 60 sec: 47513.6, 300 sec: 35592.9). Total num frames: 1743912960. Throughput: 0: 49484.5. Samples: 11523600. Policy #0 lag: (min: 1.0, avg: 33.2, max: 66.0) [2024-03-21 10:11:58,153][11055] Avg episode reward: [(0, '1.428')] [2024-03-21 10:12:03,152][11055] Fps is (10 sec: 39322.1, 60 sec: 46967.4, 300 sec: 35545.0). Total num frames: 1744076800. Throughput: 0: 49846.6. Samples: 11848500. Policy #0 lag: (min: 1.0, avg: 33.2, max: 66.0) [2024-03-21 10:12:03,153][11055] Avg episode reward: [(0, '1.428')] [2024-03-21 10:12:03,888][11286] Updated weights for policy 0, policy_version 53226 (0.0019) [2024-03-21 10:12:08,152][11055] Fps is (10 sec: 26214.4, 60 sec: 46421.6, 300 sec: 35878.2). Total num frames: 1744175104. Throughput: 0: 50106.7. Samples: 12017700. Policy #0 lag: (min: 0.0, avg: 41.2, max: 84.0) [2024-03-21 10:12:08,153][11055] Avg episode reward: [(0, '1.428')] [2024-03-21 10:12:13,152][11055] Fps is (10 sec: 29491.3, 60 sec: 46967.5, 300 sec: 36544.7). Total num frames: 1744371712. Throughput: 0: 50655.6. Samples: 12321100. 
Policy #0 lag: (min: 0.0, avg: 41.2, max: 84.0) [2024-03-21 10:12:13,154][11055] Avg episode reward: [(0, '1.469')] [2024-03-21 10:12:14,696][11286] Updated weights for policy 0, policy_version 53236 (0.0031) [2024-03-21 10:12:16,892][11266] Signal inference workers to stop experience collection... (200 times) [2024-03-21 10:12:16,960][11286] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-03-21 10:12:17,017][11266] Signal inference workers to resume experience collection... (200 times) [2024-03-21 10:12:17,019][11286] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-03-21 10:12:18,152][11055] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 37655.4). Total num frames: 1744699392. Throughput: 0: 49851.1. Samples: 12576100. Policy #0 lag: (min: 0.0, avg: 41.2, max: 84.0) [2024-03-21 10:12:18,153][11055] Avg episode reward: [(0, '1.397')] [2024-03-21 10:12:19,227][11286] Updated weights for policy 0, policy_version 53246 (0.0032) [2024-03-21 10:12:22,828][11286] Updated weights for policy 0, policy_version 53256 (0.0057) [2024-03-21 10:12:23,152][11055] Fps is (10 sec: 72089.7, 60 sec: 51882.7, 300 sec: 38988.4). Total num frames: 1745092608. Throughput: 0: 49958.0. Samples: 12712500. Policy #0 lag: (min: 0.0, avg: 41.2, max: 84.0) [2024-03-21 10:12:23,153][11055] Avg episode reward: [(0, '0.345')] [2024-03-21 10:12:28,152][11055] Fps is (10 sec: 68811.6, 60 sec: 52428.7, 300 sec: 39988.1). Total num frames: 1745387520. Throughput: 0: 49757.6. Samples: 12968500. Policy #0 lag: (min: 0.0, avg: 41.2, max: 84.0) [2024-03-21 10:12:28,153][11055] Avg episode reward: [(0, '1.220')] [2024-03-21 10:12:28,880][11286] Updated weights for policy 0, policy_version 53266 (0.0011) [2024-03-21 10:12:33,152][11055] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 40543.5). Total num frames: 1745551360. Throughput: 0: 49037.8. Samples: 13264000. 
Policy #0 lag: (min: 0.0, avg: 41.2, max: 84.0) [2024-03-21 10:12:33,153][11055] Avg episode reward: [(0, '1.220')] [2024-03-21 10:12:36,542][11286] Updated weights for policy 0, policy_version 53276 (0.0019) [2024-03-21 10:12:38,152][11055] Fps is (10 sec: 42599.2, 60 sec: 47513.7, 300 sec: 41432.1). Total num frames: 1745813504. Throughput: 0: 47846.8. Samples: 13383500. Policy #0 lag: (min: 0.0, avg: 41.2, max: 84.0) [2024-03-21 10:12:38,153][11055] Avg episode reward: [(0, '0.732')] [2024-03-21 10:12:43,152][11055] Fps is (10 sec: 42597.8, 60 sec: 44782.8, 300 sec: 41987.5). Total num frames: 1745977344. Throughput: 0: 48066.5. Samples: 13686600. Policy #0 lag: (min: 0.0, avg: 56.8, max: 105.0) [2024-03-21 10:12:43,153][11055] Avg episode reward: [(0, '0.581')] [2024-03-21 10:12:43,883][11286] Updated weights for policy 0, policy_version 53286 (0.0018) [2024-03-21 10:12:48,152][11055] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 42987.2). Total num frames: 1746272256. Throughput: 0: 47475.6. Samples: 13984900. Policy #0 lag: (min: 0.0, avg: 56.8, max: 105.0) [2024-03-21 10:12:48,153][11055] Avg episode reward: [(0, '0.581')] [2024-03-21 10:12:53,152][11055] Fps is (10 sec: 39322.0, 60 sec: 44783.0, 300 sec: 43320.4). Total num frames: 1746370560. Throughput: 0: 47051.0. Samples: 14135000. Policy #0 lag: (min: 0.0, avg: 56.8, max: 105.0) [2024-03-21 10:12:53,153][11055] Avg episode reward: [(0, '0.535')] [2024-03-21 10:12:53,168][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053295_1746370560.pth... [2024-03-21 10:12:53,283][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000052955_1735229440.pth [2024-03-21 10:12:54,379][11286] Updated weights for policy 0, policy_version 53296 (0.0029) [2024-03-21 10:12:58,152][11055] Fps is (10 sec: 36044.7, 60 sec: 45329.0, 300 sec: 44209.0). Total num frames: 1746632704. Throughput: 0: 46506.7. Samples: 14413900. 
Policy #0 lag: (min: 0.0, avg: 56.8, max: 105.0) [2024-03-21 10:12:58,153][11055] Avg episode reward: [(0, '1.222')] [2024-03-21 10:13:02,019][11286] Updated weights for policy 0, policy_version 53306 (0.0029) [2024-03-21 10:13:03,152][11055] Fps is (10 sec: 42598.6, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 1746796544. Throughput: 0: 46815.5. Samples: 14682800. Policy #0 lag: (min: 0.0, avg: 56.8, max: 105.0) [2024-03-21 10:13:03,153][11055] Avg episode reward: [(0, '0.763')] [2024-03-21 10:13:07,686][11286] Updated weights for policy 0, policy_version 53316 (0.0013) [2024-03-21 10:13:08,152][11055] Fps is (10 sec: 45875.1, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 1747091456. Throughput: 0: 46799.9. Samples: 14818500. Policy #0 lag: (min: 0.0, avg: 56.8, max: 105.0) [2024-03-21 10:13:08,153][11055] Avg episode reward: [(0, '0.844')] [2024-03-21 10:13:08,724][11266] Signal inference workers to stop experience collection... (250 times) [2024-03-21 10:13:08,727][11266] Signal inference workers to resume experience collection... (250 times) [2024-03-21 10:13:08,789][11286] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-03-21 10:13:08,790][11286] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-03-21 10:13:12,711][11286] Updated weights for policy 0, policy_version 53326 (0.0011) [2024-03-21 10:13:13,152][11055] Fps is (10 sec: 58982.0, 60 sec: 50244.2, 300 sec: 46430.6). Total num frames: 1747386368. Throughput: 0: 47233.4. Samples: 15094000. Policy #0 lag: (min: 0.0, avg: 56.8, max: 105.0) [2024-03-21 10:13:13,153][11055] Avg episode reward: [(0, '1.164')] [2024-03-21 10:13:17,965][11286] Updated weights for policy 0, policy_version 53336 (0.0014) [2024-03-21 10:13:18,152][11055] Fps is (10 sec: 62258.7, 60 sec: 50244.2, 300 sec: 46874.9). Total num frames: 1747714048. Throughput: 0: 47068.8. Samples: 15382100. 
Policy #0 lag: (min: 1.0, avg: 44.5, max: 87.0) [2024-03-21 10:13:18,153][11055] Avg episode reward: [(0, '1.506')] [2024-03-21 10:13:23,152][11055] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1747845120. Throughput: 0: 47466.6. Samples: 15519500. Policy #0 lag: (min: 1.0, avg: 44.5, max: 87.0) [2024-03-21 10:13:23,153][11055] Avg episode reward: [(0, '1.442')] [2024-03-21 10:13:27,976][11286] Updated weights for policy 0, policy_version 53346 (0.0029) [2024-03-21 10:13:28,152][11055] Fps is (10 sec: 32768.3, 60 sec: 44236.9, 300 sec: 46208.4). Total num frames: 1748041728. Throughput: 0: 46729.0. Samples: 15789400. Policy #0 lag: (min: 1.0, avg: 44.5, max: 87.0) [2024-03-21 10:13:28,153][11055] Avg episode reward: [(0, '1.054')] [2024-03-21 10:13:33,152][11055] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 46652.7). Total num frames: 1748271104. Throughput: 0: 46213.3. Samples: 16064500. Policy #0 lag: (min: 1.0, avg: 44.5, max: 87.0) [2024-03-21 10:13:33,153][11055] Avg episode reward: [(0, '0.958')] [2024-03-21 10:13:35,009][11286] Updated weights for policy 0, policy_version 53356 (0.0019) [2024-03-21 10:13:38,152][11055] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 47097.1). Total num frames: 1748566016. Throughput: 0: 46008.9. Samples: 16205400. Policy #0 lag: (min: 1.0, avg: 44.5, max: 87.0) [2024-03-21 10:13:38,153][11055] Avg episode reward: [(0, '1.237')] [2024-03-21 10:13:40,250][11286] Updated weights for policy 0, policy_version 53366 (0.0038) [2024-03-21 10:13:43,152][11055] Fps is (10 sec: 52428.5, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 1748795392. Throughput: 0: 45773.2. Samples: 16473700. Policy #0 lag: (min: 1.0, avg: 44.5, max: 87.0) [2024-03-21 10:13:43,153][11055] Avg episode reward: [(0, '0.991')] [2024-03-21 10:13:48,152][11055] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 46541.7). Total num frames: 1748926464. Throughput: 0: 45906.7. Samples: 16748600. 
Policy #0 lag: (min: 1.0, avg: 44.5, max: 87.0) [2024-03-21 10:13:48,153][11055] Avg episode reward: [(0, '1.525')] [2024-03-21 10:13:49,688][11286] Updated weights for policy 0, policy_version 53376 (0.0011) [2024-03-21 10:13:53,152][11055] Fps is (10 sec: 29491.3, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 1749090304. Throughput: 0: 46293.3. Samples: 16901700. Policy #0 lag: (min: 0.0, avg: 46.9, max: 92.0) [2024-03-21 10:13:53,153][11055] Avg episode reward: [(0, '0.938')] [2024-03-21 10:13:57,238][11286] Updated weights for policy 0, policy_version 53386 (0.0010) [2024-03-21 10:13:58,152][11055] Fps is (10 sec: 45874.9, 60 sec: 45875.1, 300 sec: 47097.0). Total num frames: 1749385216. Throughput: 0: 46211.1. Samples: 17173500. Policy #0 lag: (min: 0.0, avg: 46.9, max: 92.0) [2024-03-21 10:13:58,153][11055] Avg episode reward: [(0, '0.726')] [2024-03-21 10:14:01,806][11286] Updated weights for policy 0, policy_version 53396 (0.0012) [2024-03-21 10:14:03,152][11055] Fps is (10 sec: 65535.5, 60 sec: 49151.9, 300 sec: 47985.7). Total num frames: 1749745664. Throughput: 0: 45286.6. Samples: 17420000. Policy #0 lag: (min: 0.0, avg: 46.9, max: 92.0) [2024-03-21 10:14:03,153][11055] Avg episode reward: [(0, '0.999')] [2024-03-21 10:14:05,156][11266] Signal inference workers to stop experience collection... (300 times) [2024-03-21 10:14:05,203][11286] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-03-21 10:14:05,404][11266] Signal inference workers to resume experience collection... (300 times) [2024-03-21 10:14:05,404][11286] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-03-21 10:14:07,702][11286] Updated weights for policy 0, policy_version 53406 (0.0011) [2024-03-21 10:14:08,152][11055] Fps is (10 sec: 65536.6, 60 sec: 49152.0, 300 sec: 47874.6). Total num frames: 1750040576. Throughput: 0: 45466.7. Samples: 17565500. 
Policy #0 lag: (min: 0.0, avg: 46.9, max: 92.0) [2024-03-21 10:14:08,153][11055] Avg episode reward: [(0, '0.845')] [2024-03-21 10:14:13,152][11055] Fps is (10 sec: 49152.0, 60 sec: 47513.6, 300 sec: 47652.4). Total num frames: 1750237184. Throughput: 0: 45577.6. Samples: 17840400. Policy #0 lag: (min: 0.0, avg: 46.9, max: 92.0) [2024-03-21 10:14:13,153][11055] Avg episode reward: [(0, '1.210')] [2024-03-21 10:14:14,686][11286] Updated weights for policy 0, policy_version 53416 (0.0026) [2024-03-21 10:14:18,152][11055] Fps is (10 sec: 39321.3, 60 sec: 45329.1, 300 sec: 48096.8). Total num frames: 1750433792. Throughput: 0: 45735.5. Samples: 18122600. Policy #0 lag: (min: 0.0, avg: 46.9, max: 92.0) [2024-03-21 10:14:18,153][11055] Avg episode reward: [(0, '1.590')] [2024-03-21 10:14:23,152][11055] Fps is (10 sec: 36045.5, 60 sec: 45875.2, 300 sec: 48207.9). Total num frames: 1750597632. Throughput: 0: 46206.7. Samples: 18284700. Policy #0 lag: (min: 1.0, avg: 58.1, max: 106.0) [2024-03-21 10:14:23,153][11055] Avg episode reward: [(0, '1.688')] [2024-03-21 10:14:23,470][11286] Updated weights for policy 0, policy_version 53426 (0.0046) [2024-03-21 10:14:28,152][11055] Fps is (10 sec: 39322.1, 60 sec: 46421.4, 300 sec: 48096.8). Total num frames: 1750827008. Throughput: 0: 46462.4. Samples: 18564500. Policy #0 lag: (min: 1.0, avg: 58.1, max: 106.0) [2024-03-21 10:14:28,153][11055] Avg episode reward: [(0, '1.688')] [2024-03-21 10:14:33,152][11055] Fps is (10 sec: 36044.3, 60 sec: 44782.9, 300 sec: 47430.3). Total num frames: 1750958080. Throughput: 0: 47268.8. Samples: 18875700. Policy #0 lag: (min: 1.0, avg: 58.1, max: 106.0) [2024-03-21 10:14:33,153][11055] Avg episode reward: [(0, '1.729')] [2024-03-21 10:14:33,497][11286] Updated weights for policy 0, policy_version 53436 (0.0016) [2024-03-21 10:14:38,152][11055] Fps is (10 sec: 36044.5, 60 sec: 43690.7, 300 sec: 46763.8). Total num frames: 1751187456. Throughput: 0: 47086.7. Samples: 19020600. 
Policy #0 lag: (min: 1.0, avg: 58.1, max: 106.0) [2024-03-21 10:14:38,153][11055] Avg episode reward: [(0, '0.857')] [2024-03-21 10:14:39,268][11286] Updated weights for policy 0, policy_version 53446 (0.0011) [2024-03-21 10:14:42,569][11286] Updated weights for policy 0, policy_version 53456 (0.0010) [2024-03-21 10:14:43,152][11055] Fps is (10 sec: 68813.1, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 1751646208. Throughput: 0: 46806.7. Samples: 19279800. Policy #0 lag: (min: 1.0, avg: 58.1, max: 106.0) [2024-03-21 10:14:43,153][11055] Avg episode reward: [(0, '1.178')] [2024-03-21 10:14:48,155][11055] Fps is (10 sec: 75347.3, 60 sec: 50242.1, 300 sec: 48762.8). Total num frames: 1751941120. Throughput: 0: 47435.2. Samples: 19554700. Policy #0 lag: (min: 1.0, avg: 58.1, max: 106.0) [2024-03-21 10:14:48,155][11055] Avg episode reward: [(0, '1.397')] [2024-03-21 10:14:48,960][11286] Updated weights for policy 0, policy_version 53466 (0.0024) [2024-03-21 10:14:53,123][11266] Signal inference workers to stop experience collection... (350 times) [2024-03-21 10:14:53,152][11055] Fps is (10 sec: 55705.7, 60 sec: 51882.7, 300 sec: 49207.6). Total num frames: 1752203264. Throughput: 0: 47306.6. Samples: 19694300. Policy #0 lag: (min: 1.0, avg: 58.1, max: 106.0) [2024-03-21 10:14:53,153][11055] Avg episode reward: [(0, '1.397')] [2024-03-21 10:14:53,180][11286] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-03-21 10:14:53,346][11266] Signal inference workers to resume experience collection... (350 times) [2024-03-21 10:14:53,347][11286] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-03-21 10:14:53,602][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053475_1752268800.pth... 
[2024-03-21 10:14:53,716][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053128_1740898304.pth [2024-03-21 10:14:55,047][11286] Updated weights for policy 0, policy_version 53476 (0.0021) [2024-03-21 10:14:58,153][11055] Fps is (10 sec: 55717.2, 60 sec: 51882.3, 300 sec: 49207.5). Total num frames: 1752498176. Throughput: 0: 47766.3. Samples: 19989900. Policy #0 lag: (min: 2.0, avg: 40.9, max: 70.0) [2024-03-21 10:14:58,153][11055] Avg episode reward: [(0, '1.397')] [2024-03-21 10:15:01,874][11286] Updated weights for policy 0, policy_version 53486 (0.0018) [2024-03-21 10:15:03,152][11055] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 48763.2). Total num frames: 1752727552. Throughput: 0: 48111.2. Samples: 20287600. Policy #0 lag: (min: 2.0, avg: 40.9, max: 70.0) [2024-03-21 10:15:03,153][11055] Avg episode reward: [(0, '0.815')] [2024-03-21 10:15:08,152][11055] Fps is (10 sec: 39323.1, 60 sec: 47513.5, 300 sec: 47985.7). Total num frames: 1752891392. Throughput: 0: 48133.2. Samples: 20450700. Policy #0 lag: (min: 2.0, avg: 40.9, max: 70.0) [2024-03-21 10:15:08,153][11055] Avg episode reward: [(0, '0.815')] [2024-03-21 10:15:09,067][11286] Updated weights for policy 0, policy_version 53496 (0.0011) [2024-03-21 10:15:13,152][11055] Fps is (10 sec: 36044.9, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 1753088000. Throughput: 0: 48479.9. Samples: 20746100. Policy #0 lag: (min: 2.0, avg: 40.9, max: 70.0) [2024-03-21 10:15:13,153][11055] Avg episode reward: [(0, '0.815')] [2024-03-21 10:15:18,158][11055] Fps is (10 sec: 29475.0, 60 sec: 45871.0, 300 sec: 47429.4). Total num frames: 1753186304. Throughput: 0: 48305.2. Samples: 21049700. 
Policy #0 lag: (min: 2.0, avg: 40.9, max: 70.0) [2024-03-21 10:15:18,158][11055] Avg episode reward: [(0, '1.000')] [2024-03-21 10:15:19,910][11286] Updated weights for policy 0, policy_version 53506 (0.0026) [2024-03-21 10:15:23,152][11055] Fps is (10 sec: 29491.0, 60 sec: 46421.2, 300 sec: 47763.5). Total num frames: 1753382912. Throughput: 0: 48393.3. Samples: 21198300. Policy #0 lag: (min: 2.0, avg: 40.9, max: 70.0) [2024-03-21 10:15:23,153][11055] Avg episode reward: [(0, '0.659')] [2024-03-21 10:15:26,327][11286] Updated weights for policy 0, policy_version 53516 (0.0013) [2024-03-21 10:15:28,152][11055] Fps is (10 sec: 59015.8, 60 sec: 49152.0, 300 sec: 48096.8). Total num frames: 1753776128. Throughput: 0: 48922.4. Samples: 21481300. Policy #0 lag: (min: 2.0, avg: 40.9, max: 70.0) [2024-03-21 10:15:28,152][11055] Avg episode reward: [(0, '0.659')] [2024-03-21 10:15:31,282][11286] Updated weights for policy 0, policy_version 53526 (0.0025) [2024-03-21 10:15:33,152][11055] Fps is (10 sec: 58982.8, 60 sec: 50244.3, 300 sec: 47985.7). Total num frames: 1753972736. Throughput: 0: 49127.2. Samples: 21765300. Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:15:33,153][11055] Avg episode reward: [(0, '0.659')] [2024-03-21 10:15:37,876][11286] Updated weights for policy 0, policy_version 53536 (0.0011) [2024-03-21 10:15:38,152][11055] Fps is (10 sec: 49151.8, 60 sec: 51336.6, 300 sec: 48096.8). Total num frames: 1754267648. Throughput: 0: 48802.3. Samples: 21890400. Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:15:38,153][11055] Avg episode reward: [(0, '1.094')] [2024-03-21 10:15:43,152][11055] Fps is (10 sec: 58982.2, 60 sec: 48605.9, 300 sec: 47763.5). Total num frames: 1754562560. Throughput: 0: 47882.7. Samples: 22144600. 
Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:15:43,153][11055] Avg episode reward: [(0, '1.466')] [2024-03-21 10:15:44,369][11286] Updated weights for policy 0, policy_version 53546 (0.0015) [2024-03-21 10:15:48,152][11055] Fps is (10 sec: 55705.0, 60 sec: 48061.7, 300 sec: 47319.2). Total num frames: 1754824704. Throughput: 0: 47688.8. Samples: 22433600. Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:15:48,153][11055] Avg episode reward: [(0, '0.708')] [2024-03-21 10:15:48,918][11266] Signal inference workers to stop experience collection... (400 times) [2024-03-21 10:15:48,966][11286] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-03-21 10:15:49,179][11266] Signal inference workers to resume experience collection... (400 times) [2024-03-21 10:15:49,179][11286] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-03-21 10:15:49,431][11286] Updated weights for policy 0, policy_version 53556 (0.0012) [2024-03-21 10:15:53,152][11055] Fps is (10 sec: 49151.6, 60 sec: 47513.5, 300 sec: 47430.3). Total num frames: 1755054080. Throughput: 0: 47346.6. Samples: 22581300. Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:15:53,153][11055] Avg episode reward: [(0, '1.522')] [2024-03-21 10:15:57,281][11286] Updated weights for policy 0, policy_version 53566 (0.0011) [2024-03-21 10:15:58,152][11055] Fps is (10 sec: 45875.6, 60 sec: 46421.7, 300 sec: 47541.4). Total num frames: 1755283456. Throughput: 0: 47171.1. Samples: 22868800. Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:15:58,153][11055] Avg episode reward: [(0, '1.522')] [2024-03-21 10:16:03,152][11055] Fps is (10 sec: 32767.9, 60 sec: 44236.7, 300 sec: 47430.3). Total num frames: 1755381760. Throughput: 0: 44205.4. Samples: 23038700. 
Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:16:03,153][11055] Avg episode reward: [(0, '1.522')] [2024-03-21 10:16:08,152][11055] Fps is (10 sec: 9830.4, 60 sec: 41506.2, 300 sec: 46874.9). Total num frames: 1755381760. Throughput: 0: 42817.9. Samples: 23125100. Policy #0 lag: (min: 2.0, avg: 37.5, max: 81.0) [2024-03-21 10:16:08,153][11055] Avg episode reward: [(0, '1.522')] [2024-03-21 10:16:13,152][11055] Fps is (10 sec: 16384.2, 60 sec: 40960.0, 300 sec: 46874.9). Total num frames: 1755545600. Throughput: 0: 40179.9. Samples: 23289400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 10:16:13,153][11055] Avg episode reward: [(0, '1.522')] [2024-03-21 10:16:14,338][11286] Updated weights for policy 0, policy_version 53576 (0.0021) [2024-03-21 10:16:18,156][11055] Fps is (10 sec: 29479.1, 60 sec: 41507.2, 300 sec: 46429.9). Total num frames: 1755676672. Throughput: 0: 37007.7. Samples: 23430800. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 10:16:18,157][11055] Avg episode reward: [(0, '0.885')] [2024-03-21 10:16:23,152][11055] Fps is (10 sec: 19660.6, 60 sec: 39321.6, 300 sec: 45764.1). Total num frames: 1755742208. Throughput: 0: 35726.5. Samples: 23498100. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 10:16:23,153][11055] Avg episode reward: [(0, '1.756')] [2024-03-21 10:16:28,152][11055] Fps is (10 sec: 13112.6, 60 sec: 33860.2, 300 sec: 44764.4). Total num frames: 1755807744. Throughput: 0: 33462.3. Samples: 23650400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 10:16:28,153][11055] Avg episode reward: [(0, '1.565')] [2024-03-21 10:16:30,358][11286] Updated weights for policy 0, policy_version 53586 (0.0017) [2024-03-21 10:16:33,152][11055] Fps is (10 sec: 16384.1, 60 sec: 32221.8, 300 sec: 43875.8). Total num frames: 1755906048. Throughput: 0: 30660.0. Samples: 23813300. 
Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 10:16:33,153][11055] Avg episode reward: [(0, '1.009')] [2024-03-21 10:16:38,152][11055] Fps is (10 sec: 26214.2, 60 sec: 30037.3, 300 sec: 43320.4). Total num frames: 1756069888. Throughput: 0: 29202.3. Samples: 23895400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 10:16:38,153][11055] Avg episode reward: [(0, '1.541')] [2024-03-21 10:16:42,905][11286] Updated weights for policy 0, policy_version 53596 (0.0075) [2024-03-21 10:16:43,155][11055] Fps is (10 sec: 32759.7, 60 sec: 27851.6, 300 sec: 43097.9). Total num frames: 1756233728. Throughput: 0: 26305.1. Samples: 24052600. Policy #0 lag: (min: 0.0, avg: 34.9, max: 86.0) [2024-03-21 10:16:43,156][11055] Avg episode reward: [(0, '0.737')] [2024-03-21 10:16:48,152][11055] Fps is (10 sec: 22937.8, 60 sec: 24576.0, 300 sec: 42765.0). Total num frames: 1756299264. Throughput: 0: 26529.0. Samples: 24232500. Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 10:16:48,153][11055] Avg episode reward: [(0, '1.472')] [2024-03-21 10:16:52,306][11286] Updated weights for policy 0, policy_version 53606 (0.0024) [2024-03-21 10:16:53,152][11055] Fps is (10 sec: 32776.3, 60 sec: 25122.2, 300 sec: 42876.1). Total num frames: 1756561408. Throughput: 0: 26591.1. Samples: 24321700. Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 10:16:53,153][11055] Avg episode reward: [(0, '1.460')] [2024-03-21 10:16:53,163][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053606_1756561408.pth... [2024-03-21 10:16:53,283][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053295_1746370560.pth [2024-03-21 10:16:58,152][11055] Fps is (10 sec: 45874.9, 60 sec: 24576.0, 300 sec: 42987.2). Total num frames: 1756758016. Throughput: 0: 27264.4. Samples: 24516300. 
Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 10:16:58,153][11055] Avg episode reward: [(0, '1.660')] [2024-03-21 10:17:02,730][11286] Updated weights for policy 0, policy_version 53616 (0.0018) [2024-03-21 10:17:03,152][11055] Fps is (10 sec: 36044.3, 60 sec: 25668.2, 300 sec: 43209.3). Total num frames: 1756921856. Throughput: 0: 29387.0. Samples: 24753100. Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 10:17:03,153][11055] Avg episode reward: [(0, '1.279')] [2024-03-21 10:17:07,419][11286] Updated weights for policy 0, policy_version 53626 (0.0020) [2024-03-21 10:17:08,152][11055] Fps is (10 sec: 45875.7, 60 sec: 30583.5, 300 sec: 43542.6). Total num frames: 1757216768. Throughput: 0: 30220.1. Samples: 24858000. Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 10:17:08,153][11055] Avg episode reward: [(0, '1.425')] [2024-03-21 10:17:13,152][11055] Fps is (10 sec: 55706.1, 60 sec: 32221.8, 300 sec: 43320.4). Total num frames: 1757478912. Throughput: 0: 32822.1. Samples: 25127400. Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 10:17:13,153][11055] Avg episode reward: [(0, '0.711')] [2024-03-21 10:17:13,801][11286] Updated weights for policy 0, policy_version 53636 (0.0013) [2024-03-21 10:17:16,968][11266] Signal inference workers to stop experience collection... (450 times) [2024-03-21 10:17:16,971][11266] Signal inference workers to resume experience collection... (450 times) [2024-03-21 10:17:17,019][11286] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-03-21 10:17:17,019][11286] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-03-21 10:17:18,152][11055] Fps is (10 sec: 62258.6, 60 sec: 36047.2, 300 sec: 43209.3). Total num frames: 1757839360. Throughput: 0: 35546.7. Samples: 25412900. 
Policy #0 lag: (min: 0.0, avg: 41.8, max: 86.0) [2024-03-21 10:17:18,153][11055] Avg episode reward: [(0, '1.450')] [2024-03-21 10:17:18,562][11286] Updated weights for policy 0, policy_version 53646 (0.0020) [2024-03-21 10:17:23,152][11055] Fps is (10 sec: 55705.3, 60 sec: 38229.3, 300 sec: 42876.1). Total num frames: 1758035968. Throughput: 0: 37137.7. Samples: 25566600. Policy #0 lag: (min: 0.0, avg: 51.0, max: 101.0) [2024-03-21 10:17:23,153][11055] Avg episode reward: [(0, '1.450')] [2024-03-21 10:17:26,381][11286] Updated weights for policy 0, policy_version 53656 (0.0017) [2024-03-21 10:17:28,152][11055] Fps is (10 sec: 36044.9, 60 sec: 39867.7, 300 sec: 42876.1). Total num frames: 1758199808. Throughput: 0: 39473.4. Samples: 25828800. Policy #0 lag: (min: 0.0, avg: 51.0, max: 101.0) [2024-03-21 10:17:28,153][11055] Avg episode reward: [(0, '1.157')] [2024-03-21 10:17:33,152][11055] Fps is (10 sec: 16384.2, 60 sec: 38229.4, 300 sec: 41987.5). Total num frames: 1758199808. Throughput: 0: 41751.1. Samples: 26111300. Policy #0 lag: (min: 0.0, avg: 51.0, max: 101.0) [2024-03-21 10:17:33,153][11055] Avg episode reward: [(0, '1.639')] [2024-03-21 10:17:38,152][11055] Fps is (10 sec: 9830.4, 60 sec: 37137.1, 300 sec: 41765.3). Total num frames: 1758298112. Throughput: 0: 42720.0. Samples: 26244100. Policy #0 lag: (min: 0.0, avg: 51.0, max: 101.0) [2024-03-21 10:17:38,153][11055] Avg episode reward: [(0, '1.453')] [2024-03-21 10:17:42,632][11286] Updated weights for policy 0, policy_version 53666 (0.0013) [2024-03-21 10:17:43,152][11055] Fps is (10 sec: 36044.9, 60 sec: 38777.2, 300 sec: 41654.2). Total num frames: 1758560256. Throughput: 0: 43935.6. Samples: 26493400. 
Policy #0 lag: (min: 0.0, avg: 51.0, max: 101.0) [2024-03-21 10:17:43,153][11055] Avg episode reward: [(0, '0.881')] [2024-03-21 10:17:47,650][11286] Updated weights for policy 0, policy_version 53676 (0.0016) [2024-03-21 10:17:48,152][11055] Fps is (10 sec: 58982.6, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 1758887936. Throughput: 0: 44018.0. Samples: 26733900. Policy #0 lag: (min: 0.0, avg: 51.0, max: 101.0) [2024-03-21 10:17:48,153][11055] Avg episode reward: [(0, '1.096')] [2024-03-21 10:17:52,730][11286] Updated weights for policy 0, policy_version 53686 (0.0010) [2024-03-21 10:17:53,152][11055] Fps is (10 sec: 62258.4, 60 sec: 43690.6, 300 sec: 42542.8). Total num frames: 1759182848. Throughput: 0: 44855.4. Samples: 26876500. Policy #0 lag: (min: 0.0, avg: 51.0, max: 101.0) [2024-03-21 10:17:53,153][11055] Avg episode reward: [(0, '1.502')] [2024-03-21 10:17:57,336][11286] Updated weights for policy 0, policy_version 53696 (0.0017) [2024-03-21 10:17:58,152][11055] Fps is (10 sec: 68812.4, 60 sec: 46967.5, 300 sec: 43320.4). Total num frames: 1759576064. Throughput: 0: 44788.9. Samples: 27142900. Policy #0 lag: (min: 1.0, avg: 46.8, max: 98.0) [2024-03-21 10:17:58,153][11055] Avg episode reward: [(0, '1.255')] [2024-03-21 10:18:00,688][11286] Updated weights for policy 0, policy_version 53706 (0.0022) [2024-03-21 10:18:03,152][11055] Fps is (10 sec: 68813.4, 60 sec: 49152.1, 300 sec: 43320.4). Total num frames: 1759870976. Throughput: 0: 44651.1. Samples: 27422200. Policy #0 lag: (min: 1.0, avg: 46.8, max: 98.0) [2024-03-21 10:18:03,153][11055] Avg episode reward: [(0, '1.175')] [2024-03-21 10:18:08,152][11055] Fps is (10 sec: 55706.5, 60 sec: 48605.9, 300 sec: 43209.4). Total num frames: 1760133120. Throughput: 0: 44473.6. Samples: 27567900. 
Policy #0 lag: (min: 1.0, avg: 46.8, max: 98.0) [2024-03-21 10:18:08,152][11055] Avg episode reward: [(0, '1.175')] [2024-03-21 10:18:08,181][11286] Updated weights for policy 0, policy_version 53716 (0.0011) [2024-03-21 10:18:13,152][11055] Fps is (10 sec: 42598.2, 60 sec: 46967.5, 300 sec: 42653.9). Total num frames: 1760296960. Throughput: 0: 44573.3. Samples: 27834600. Policy #0 lag: (min: 1.0, avg: 46.8, max: 98.0) [2024-03-21 10:18:13,153][11055] Avg episode reward: [(0, '1.175')] [2024-03-21 10:18:18,152][11055] Fps is (10 sec: 22937.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1760362496. Throughput: 0: 44719.9. Samples: 28123700. Policy #0 lag: (min: 1.0, avg: 46.8, max: 98.0) [2024-03-21 10:18:18,153][11055] Avg episode reward: [(0, '1.351')] [2024-03-21 10:18:22,530][11266] Signal inference workers to stop experience collection... (500 times) [2024-03-21 10:18:22,589][11286] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-03-21 10:18:22,833][11266] Signal inference workers to resume experience collection... (500 times) [2024-03-21 10:18:22,834][11286] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-03-21 10:18:23,152][11055] Fps is (10 sec: 16384.1, 60 sec: 40414.0, 300 sec: 42098.6). Total num frames: 1760460800. Throughput: 0: 45048.9. Samples: 28271300. Policy #0 lag: (min: 1.0, avg: 46.8, max: 98.0) [2024-03-21 10:18:23,153][11055] Avg episode reward: [(0, '1.111')] [2024-03-21 10:18:25,741][11286] Updated weights for policy 0, policy_version 53726 (0.0012) [2024-03-21 10:18:28,152][11055] Fps is (10 sec: 29491.3, 60 sec: 40960.0, 300 sec: 41987.5). Total num frames: 1760657408. Throughput: 0: 45966.6. Samples: 28561900. 
Policy #0 lag: (min: 1.0, avg: 46.8, max: 98.0) [2024-03-21 10:18:28,153][11055] Avg episode reward: [(0, '1.345')] [2024-03-21 10:18:30,656][11286] Updated weights for policy 0, policy_version 53736 (0.0024) [2024-03-21 10:18:33,152][11055] Fps is (10 sec: 52428.3, 60 sec: 46421.3, 300 sec: 42098.5). Total num frames: 1760985088. Throughput: 0: 46606.6. Samples: 28831200. Policy #0 lag: (min: 0.0, avg: 33.2, max: 77.0) [2024-03-21 10:18:33,153][11055] Avg episode reward: [(0, '1.345')] [2024-03-21 10:18:35,732][11286] Updated weights for policy 0, policy_version 53746 (0.0015) [2024-03-21 10:18:38,152][11055] Fps is (10 sec: 58981.9, 60 sec: 49151.9, 300 sec: 42209.6). Total num frames: 1761247232. Throughput: 0: 46460.0. Samples: 28967200. Policy #0 lag: (min: 0.0, avg: 33.2, max: 77.0) [2024-03-21 10:18:38,153][11055] Avg episode reward: [(0, '1.533')] [2024-03-21 10:18:41,316][11286] Updated weights for policy 0, policy_version 53756 (0.0015) [2024-03-21 10:18:43,152][11055] Fps is (10 sec: 55706.0, 60 sec: 49698.1, 300 sec: 42765.0). Total num frames: 1761542144. Throughput: 0: 46826.7. Samples: 29250100. Policy #0 lag: (min: 0.0, avg: 33.2, max: 77.0) [2024-03-21 10:18:43,153][11055] Avg episode reward: [(0, '1.533')] [2024-03-21 10:18:47,873][11286] Updated weights for policy 0, policy_version 53766 (0.0021) [2024-03-21 10:18:48,152][11055] Fps is (10 sec: 58983.0, 60 sec: 49152.0, 300 sec: 43209.3). Total num frames: 1761837056. Throughput: 0: 46215.6. Samples: 29501900. Policy #0 lag: (min: 0.0, avg: 33.2, max: 77.0) [2024-03-21 10:18:48,153][11055] Avg episode reward: [(0, '1.274')] [2024-03-21 10:18:53,152][11055] Fps is (10 sec: 45875.0, 60 sec: 46967.5, 300 sec: 42765.0). Total num frames: 1762000896. Throughput: 0: 46093.2. Samples: 29642100. 
Policy #0 lag: (min: 0.0, avg: 33.2, max: 77.0) [2024-03-21 10:18:53,153][11055] Avg episode reward: [(0, '1.469')] [2024-03-21 10:18:53,163][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053772_1762000896.pth... [2024-03-21 10:18:53,325][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053475_1752268800.pth [2024-03-21 10:18:56,828][11286] Updated weights for policy 0, policy_version 53776 (0.0011) [2024-03-21 10:18:58,152][11055] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 42431.8). Total num frames: 1762263040. Throughput: 0: 46993.4. Samples: 29949300. Policy #0 lag: (min: 0.0, avg: 33.2, max: 77.0) [2024-03-21 10:18:58,153][11055] Avg episode reward: [(0, '1.469')] [2024-03-21 10:18:59,893][11286] Updated weights for policy 0, policy_version 53786 (0.0010) [2024-03-21 10:19:03,152][11055] Fps is (10 sec: 72089.9, 60 sec: 47513.6, 300 sec: 42987.2). Total num frames: 1762721792. Throughput: 0: 45737.8. Samples: 30181900. Policy #0 lag: (min: 0.0, avg: 33.2, max: 77.0) [2024-03-21 10:19:03,153][11055] Avg episode reward: [(0, '1.354')] [2024-03-21 10:19:03,755][11266] Signal inference workers to stop experience collection... (550 times) [2024-03-21 10:19:03,822][11266] Signal inference workers to resume experience collection... (550 times) [2024-03-21 10:19:03,902][11286] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-03-21 10:19:03,953][11286] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-03-21 10:19:04,158][11286] Updated weights for policy 0, policy_version 53796 (0.0020) [2024-03-21 10:19:08,152][11055] Fps is (10 sec: 58982.6, 60 sec: 45329.0, 300 sec: 42765.0). Total num frames: 1762852864. Throughput: 0: 45648.9. Samples: 30325500. 
Policy #0 lag: (min: 0.0, avg: 60.0, max: 108.0) [2024-03-21 10:19:08,153][11055] Avg episode reward: [(0, '1.354')] [2024-03-21 10:19:13,153][11055] Fps is (10 sec: 32766.6, 60 sec: 45874.9, 300 sec: 42765.0). Total num frames: 1763049472. Throughput: 0: 45739.6. Samples: 30620200. Policy #0 lag: (min: 0.0, avg: 60.0, max: 108.0) [2024-03-21 10:19:13,154][11055] Avg episode reward: [(0, '0.588')] [2024-03-21 10:19:13,993][11286] Updated weights for policy 0, policy_version 53806 (0.0017) [2024-03-21 10:19:18,152][11055] Fps is (10 sec: 42598.2, 60 sec: 48605.9, 300 sec: 42987.2). Total num frames: 1763278848. Throughput: 0: 46286.7. Samples: 30914100. Policy #0 lag: (min: 0.0, avg: 60.0, max: 108.0) [2024-03-21 10:19:18,153][11055] Avg episode reward: [(0, '0.588')] [2024-03-21 10:19:23,152][11055] Fps is (10 sec: 36046.4, 60 sec: 49152.0, 300 sec: 42653.9). Total num frames: 1763409920. Throughput: 0: 46486.8. Samples: 31059100. Policy #0 lag: (min: 0.0, avg: 60.0, max: 108.0) [2024-03-21 10:19:23,153][11055] Avg episode reward: [(0, '1.536')] [2024-03-21 10:19:25,028][11286] Updated weights for policy 0, policy_version 53816 (0.0010) [2024-03-21 10:19:28,152][11055] Fps is (10 sec: 36044.8, 60 sec: 49698.1, 300 sec: 42987.2). Total num frames: 1763639296. Throughput: 0: 46953.3. Samples: 31363000. Policy #0 lag: (min: 0.0, avg: 60.0, max: 108.0) [2024-03-21 10:19:28,153][11055] Avg episode reward: [(0, '1.637')] [2024-03-21 10:19:32,245][11286] Updated weights for policy 0, policy_version 53826 (0.0017) [2024-03-21 10:19:33,152][11055] Fps is (10 sec: 42597.9, 60 sec: 47513.6, 300 sec: 42876.1). Total num frames: 1763835904. Throughput: 0: 47133.3. Samples: 31622900. Policy #0 lag: (min: 0.0, avg: 60.0, max: 108.0) [2024-03-21 10:19:33,153][11055] Avg episode reward: [(0, '1.469')] [2024-03-21 10:19:38,152][11055] Fps is (10 sec: 39321.9, 60 sec: 46421.5, 300 sec: 41987.5). Total num frames: 1764032512. Throughput: 0: 47333.4. Samples: 31772100. 
Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:19:38,153][11055] Avg episode reward: [(0, '1.235')] [2024-03-21 10:19:38,482][11286] Updated weights for policy 0, policy_version 53836 (0.0011) [2024-03-21 10:19:42,624][11286] Updated weights for policy 0, policy_version 53846 (0.0034) [2024-03-21 10:19:43,152][11055] Fps is (10 sec: 62259.9, 60 sec: 48605.9, 300 sec: 42432.2). Total num frames: 1764458496. Throughput: 0: 46600.0. Samples: 32046300. Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:19:43,153][11055] Avg episode reward: [(0, '1.270')] [2024-03-21 10:19:47,244][11286] Updated weights for policy 0, policy_version 53856 (0.0028) [2024-03-21 10:19:48,152][11055] Fps is (10 sec: 72089.4, 60 sec: 48605.9, 300 sec: 42542.9). Total num frames: 1764753408. Throughput: 0: 47784.5. Samples: 32332200. Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:19:48,161][11055] Avg episode reward: [(0, '1.270')] [2024-03-21 10:19:53,152][11055] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 42209.7). Total num frames: 1764950016. Throughput: 0: 48066.7. Samples: 32488500. Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:19:53,161][11055] Avg episode reward: [(0, '1.162')] [2024-03-21 10:19:57,948][11286] Updated weights for policy 0, policy_version 53866 (0.0020) [2024-03-21 10:19:58,152][11055] Fps is (10 sec: 32767.5, 60 sec: 46967.4, 300 sec: 41876.4). Total num frames: 1765081088. Throughput: 0: 48178.1. Samples: 32788200. Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:19:58,153][11055] Avg episode reward: [(0, '1.162')] [2024-03-21 10:20:01,452][11266] Signal inference workers to stop experience collection... (600 times) [2024-03-21 10:20:01,512][11286] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-03-21 10:20:01,777][11266] Signal inference workers to resume experience collection... 
(600 times) [2024-03-21 10:20:01,777][11286] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-03-21 10:20:03,152][11055] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 42320.7). Total num frames: 1765376000. Throughput: 0: 47802.3. Samples: 33065200. Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:20:03,163][11055] Avg episode reward: [(0, '1.286')] [2024-03-21 10:20:04,253][11286] Updated weights for policy 0, policy_version 53876 (0.0018) [2024-03-21 10:20:08,152][11055] Fps is (10 sec: 62260.6, 60 sec: 47513.7, 300 sec: 42765.0). Total num frames: 1765703680. Throughput: 0: 47433.4. Samples: 33193600. Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:20:08,152][11055] Avg episode reward: [(0, '1.128')] [2024-03-21 10:20:08,216][11286] Updated weights for policy 0, policy_version 53886 (0.0044) [2024-03-21 10:20:13,152][11055] Fps is (10 sec: 39321.7, 60 sec: 45329.4, 300 sec: 42654.7). Total num frames: 1765769216. Throughput: 0: 47546.7. Samples: 33502600. Policy #0 lag: (min: 0.0, avg: 31.2, max: 68.0) [2024-03-21 10:20:13,153][11055] Avg episode reward: [(0, '1.520')] [2024-03-21 10:20:18,152][11055] Fps is (10 sec: 32767.6, 60 sec: 45875.2, 300 sec: 42876.1). Total num frames: 1766031360. Throughput: 0: 47773.4. Samples: 33772700. Policy #0 lag: (min: 0.0, avg: 45.5, max: 87.0) [2024-03-21 10:20:18,153][11055] Avg episode reward: [(0, '1.149')] [2024-03-21 10:20:19,754][11286] Updated weights for policy 0, policy_version 53896 (0.0036) [2024-03-21 10:20:23,152][11055] Fps is (10 sec: 58982.8, 60 sec: 49152.0, 300 sec: 42653.9). Total num frames: 1766359040. Throughput: 0: 47993.4. Samples: 33931800. Policy #0 lag: (min: 0.0, avg: 45.5, max: 87.0) [2024-03-21 10:20:23,152][11055] Avg episode reward: [(0, '1.326')] [2024-03-21 10:20:23,167][11286] Updated weights for policy 0, policy_version 53906 (0.0012) [2024-03-21 10:20:28,152][11055] Fps is (10 sec: 52428.4, 60 sec: 48605.8, 300 sec: 42653.9). 
Total num frames: 1766555648. Throughput: 0: 47786.5. Samples: 34196700. Policy #0 lag: (min: 0.0, avg: 45.5, max: 87.0) [2024-03-21 10:20:28,153][11055] Avg episode reward: [(0, '1.743')] [2024-03-21 10:20:31,726][11286] Updated weights for policy 0, policy_version 53916 (0.0011) [2024-03-21 10:20:33,152][11055] Fps is (10 sec: 39321.2, 60 sec: 48605.9, 300 sec: 42320.7). Total num frames: 1766752256. Throughput: 0: 47777.7. Samples: 34482200. Policy #0 lag: (min: 0.0, avg: 45.5, max: 87.0) [2024-03-21 10:20:33,153][11055] Avg episode reward: [(0, '1.373')] [2024-03-21 10:20:37,432][11286] Updated weights for policy 0, policy_version 53926 (0.0015) [2024-03-21 10:20:38,152][11055] Fps is (10 sec: 55705.9, 60 sec: 51336.5, 300 sec: 42542.9). Total num frames: 1767112704. Throughput: 0: 47486.6. Samples: 34625400. Policy #0 lag: (min: 0.0, avg: 45.5, max: 87.0) [2024-03-21 10:20:38,153][11055] Avg episode reward: [(0, '1.022')] [2024-03-21 10:20:43,152][11055] Fps is (10 sec: 52428.1, 60 sec: 46967.3, 300 sec: 42209.6). Total num frames: 1767276544. Throughput: 0: 46797.8. Samples: 34894100. Policy #0 lag: (min: 0.0, avg: 45.5, max: 87.0) [2024-03-21 10:20:43,153][11055] Avg episode reward: [(0, '1.276')] [2024-03-21 10:20:44,130][11286] Updated weights for policy 0, policy_version 53936 (0.0012) [2024-03-21 10:20:48,152][11055] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 42098.6). Total num frames: 1767473152. Throughput: 0: 47515.6. Samples: 35203400. Policy #0 lag: (min: 0.0, avg: 45.5, max: 87.0) [2024-03-21 10:20:48,153][11055] Avg episode reward: [(0, '1.276')] [2024-03-21 10:20:53,152][11055] Fps is (10 sec: 39321.9, 60 sec: 45329.0, 300 sec: 41987.5). Total num frames: 1767669760. Throughput: 0: 48186.5. Samples: 35362000. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 10:20:53,153][11055] Avg episode reward: [(0, '0.887')] [2024-03-21 10:20:53,428][11266] Signal inference workers to stop experience collection... 
(650 times) [2024-03-21 10:20:53,430][11266] Signal inference workers to resume experience collection... (650 times) [2024-03-21 10:20:53,432][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053946_1767702528.pth... [2024-03-21 10:20:53,449][11286] Updated weights for policy 0, policy_version 53946 (0.0009) [2024-03-21 10:20:53,483][11286] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-03-21 10:20:53,483][11286] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-03-21 10:20:53,545][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053606_1756561408.pth [2024-03-21 10:20:58,152][11055] Fps is (10 sec: 39321.3, 60 sec: 46421.4, 300 sec: 42320.7). Total num frames: 1767866368. Throughput: 0: 47666.6. Samples: 35647600. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 10:20:58,153][11055] Avg episode reward: [(0, '0.792')] [2024-03-21 10:20:59,471][11286] Updated weights for policy 0, policy_version 53956 (0.0023) [2024-03-21 10:21:03,152][11055] Fps is (10 sec: 58983.2, 60 sec: 48059.8, 300 sec: 43653.6). Total num frames: 1768259584. Throughput: 0: 47873.4. Samples: 35927000. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 10:21:03,153][11055] Avg episode reward: [(0, '1.553')] [2024-03-21 10:21:06,112][11286] Updated weights for policy 0, policy_version 53966 (0.0012) [2024-03-21 10:21:08,152][11055] Fps is (10 sec: 58983.0, 60 sec: 45875.2, 300 sec: 43764.7). Total num frames: 1768456192. Throughput: 0: 47868.9. Samples: 36085900. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 10:21:08,153][11055] Avg episode reward: [(0, '1.553')] [2024-03-21 10:21:12,546][11286] Updated weights for policy 0, policy_version 53976 (0.0021) [2024-03-21 10:21:13,152][11055] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 44209.7). Total num frames: 1768718336. Throughput: 0: 48466.9. Samples: 36377700. 
Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 10:21:13,152][11055] Avg episode reward: [(0, '1.040')] [2024-03-21 10:21:16,282][11286] Updated weights for policy 0, policy_version 53986 (0.0026) [2024-03-21 10:21:18,152][11055] Fps is (10 sec: 65535.4, 60 sec: 51336.5, 300 sec: 45319.8). Total num frames: 1769111552. Throughput: 0: 48071.1. Samples: 36645400. Policy #0 lag: (min: 0.0, avg: 37.0, max: 77.0) [2024-03-21 10:21:18,153][11055] Avg episode reward: [(0, '1.524')] [2024-03-21 10:21:23,152][11055] Fps is (10 sec: 52428.7, 60 sec: 48059.8, 300 sec: 45542.0). Total num frames: 1769242624. Throughput: 0: 48033.5. Samples: 36786900. Policy #0 lag: (min: 0.0, avg: 41.0, max: 89.0) [2024-03-21 10:21:23,153][11055] Avg episode reward: [(0, '1.524')] [2024-03-21 10:21:25,614][11286] Updated weights for policy 0, policy_version 53996 (0.0011) [2024-03-21 10:21:28,152][11055] Fps is (10 sec: 29491.4, 60 sec: 47513.7, 300 sec: 45764.1). Total num frames: 1769406464. Throughput: 0: 48575.8. Samples: 37080000. Policy #0 lag: (min: 0.0, avg: 41.0, max: 89.0) [2024-03-21 10:21:28,153][11055] Avg episode reward: [(0, '1.524')] [2024-03-21 10:21:32,393][11286] Updated weights for policy 0, policy_version 54006 (0.0011) [2024-03-21 10:21:33,152][11055] Fps is (10 sec: 42598.0, 60 sec: 48605.9, 300 sec: 46097.4). Total num frames: 1769668608. Throughput: 0: 47762.2. Samples: 37352700. Policy #0 lag: (min: 0.0, avg: 41.0, max: 89.0) [2024-03-21 10:21:33,153][11055] Avg episode reward: [(0, '0.839')] [2024-03-21 10:21:38,152][11055] Fps is (10 sec: 36044.7, 60 sec: 44236.9, 300 sec: 45875.6). Total num frames: 1769766912. Throughput: 0: 47533.5. Samples: 37501000. Policy #0 lag: (min: 0.0, avg: 41.0, max: 89.0) [2024-03-21 10:21:38,153][11055] Avg episode reward: [(0, '0.957')] [2024-03-21 10:21:43,152][11055] Fps is (10 sec: 29491.6, 60 sec: 44783.1, 300 sec: 46319.5). Total num frames: 1769963520. Throughput: 0: 47802.4. Samples: 37798700. 
Policy #0 lag: (min: 0.0, avg: 41.0, max: 89.0) [2024-03-21 10:21:43,153][11055] Avg episode reward: [(0, '1.243')] [2024-03-21 10:21:45,912][11286] Updated weights for policy 0, policy_version 54016 (0.0016) [2024-03-21 10:21:48,152][11055] Fps is (10 sec: 32768.0, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 1770094592. Throughput: 0: 48380.0. Samples: 38104100. Policy #0 lag: (min: 0.0, avg: 41.0, max: 89.0) [2024-03-21 10:21:48,153][11055] Avg episode reward: [(0, '1.243')] [2024-03-21 10:21:49,569][11266] Signal inference workers to stop experience collection... (700 times) [2024-03-21 10:21:49,648][11286] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-03-21 10:21:49,795][11266] Signal inference workers to resume experience collection... (700 times) [2024-03-21 10:21:49,796][11286] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-03-21 10:21:50,511][11286] Updated weights for policy 0, policy_version 54026 (0.0014) [2024-03-21 10:21:53,152][11055] Fps is (10 sec: 49151.5, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 1770455040. Throughput: 0: 47306.6. Samples: 38214700. Policy #0 lag: (min: 0.0, avg: 41.0, max: 89.0) [2024-03-21 10:21:53,153][11055] Avg episode reward: [(0, '1.347')] [2024-03-21 10:21:56,520][11286] Updated weights for policy 0, policy_version 54036 (0.0011) [2024-03-21 10:21:58,152][11055] Fps is (10 sec: 65536.9, 60 sec: 48059.9, 300 sec: 46875.0). Total num frames: 1770749952. Throughput: 0: 47444.5. Samples: 38512700. Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:21:58,153][11055] Avg episode reward: [(0, '0.937')] [2024-03-21 10:22:00,970][11286] Updated weights for policy 0, policy_version 54046 (0.0016) [2024-03-21 10:22:03,152][11055] Fps is (10 sec: 68813.4, 60 sec: 48059.8, 300 sec: 47208.1). Total num frames: 1771143168. Throughput: 0: 47513.5. Samples: 38783500. 
Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:22:03,153][11055] Avg episode reward: [(0, '0.937')] [2024-03-21 10:22:05,859][11286] Updated weights for policy 0, policy_version 54056 (0.0010) [2024-03-21 10:22:08,152][11055] Fps is (10 sec: 68812.1, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 1771438080. Throughput: 0: 47482.2. Samples: 38923600. Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:22:08,153][11055] Avg episode reward: [(0, '0.798')] [2024-03-21 10:22:09,974][11286] Updated weights for policy 0, policy_version 54066 (0.0028) [2024-03-21 10:22:13,152][11055] Fps is (10 sec: 58982.2, 60 sec: 50244.2, 300 sec: 47097.1). Total num frames: 1771732992. Throughput: 0: 47262.2. Samples: 39206800. Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:22:13,153][11055] Avg episode reward: [(0, '0.798')] [2024-03-21 10:22:17,203][11286] Updated weights for policy 0, policy_version 54076 (0.0043) [2024-03-21 10:22:18,152][11055] Fps is (10 sec: 58982.7, 60 sec: 48606.0, 300 sec: 47430.3). Total num frames: 1772027904. Throughput: 0: 48093.5. Samples: 39516900. Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:22:18,153][11055] Avg episode reward: [(0, '0.936')] [2024-03-21 10:22:22,696][11286] Updated weights for policy 0, policy_version 54086 (0.0012) [2024-03-21 10:22:23,152][11055] Fps is (10 sec: 55705.5, 60 sec: 50790.4, 300 sec: 47763.5). Total num frames: 1772290048. Throughput: 0: 47982.3. Samples: 39660200. Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:22:23,153][11055] Avg episode reward: [(0, '0.936')] [2024-03-21 10:22:28,152][11055] Fps is (10 sec: 42598.4, 60 sec: 50790.5, 300 sec: 48318.9). Total num frames: 1772453888. Throughput: 0: 48155.5. Samples: 39965700. Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:22:28,153][11055] Avg episode reward: [(0, '0.936')] [2024-03-21 10:22:33,152][11055] Fps is (10 sec: 19660.7, 60 sec: 46967.5, 300 sec: 48096.8). 
Total num frames: 1772486656. Throughput: 0: 48160.0. Samples: 40271300. Policy #0 lag: (min: 2.0, avg: 31.3, max: 67.0) [2024-03-21 10:22:33,153][11055] Avg episode reward: [(0, '0.653')] [2024-03-21 10:22:37,891][11286] Updated weights for policy 0, policy_version 54096 (0.0016) [2024-03-21 10:22:38,152][11055] Fps is (10 sec: 16384.0, 60 sec: 47513.7, 300 sec: 47652.5). Total num frames: 1772617728. Throughput: 0: 48711.2. Samples: 40406700. Policy #0 lag: (min: 0.0, avg: 36.3, max: 83.0) [2024-03-21 10:22:38,153][11055] Avg episode reward: [(0, '0.700')] [2024-03-21 10:22:38,264][11266] Signal inference workers to stop experience collection... (750 times) [2024-03-21 10:22:38,264][11266] Signal inference workers to resume experience collection... (750 times) [2024-03-21 10:22:38,314][11286] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-03-21 10:22:38,357][11286] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-03-21 10:22:43,152][11055] Fps is (10 sec: 29491.3, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 1772781568. Throughput: 0: 47937.6. Samples: 40669900. Policy #0 lag: (min: 0.0, avg: 36.3, max: 83.0) [2024-03-21 10:22:43,153][11055] Avg episode reward: [(0, '1.431')] [2024-03-21 10:22:46,675][11286] Updated weights for policy 0, policy_version 54106 (0.0011) [2024-03-21 10:22:48,152][11055] Fps is (10 sec: 45874.8, 60 sec: 49698.2, 300 sec: 47097.1). Total num frames: 1773076480. Throughput: 0: 47371.1. Samples: 40915200. Policy #0 lag: (min: 0.0, avg: 36.3, max: 83.0) [2024-03-21 10:22:48,153][11055] Avg episode reward: [(0, '1.322')] [2024-03-21 10:22:52,715][11286] Updated weights for policy 0, policy_version 54116 (0.0013) [2024-03-21 10:22:53,152][11055] Fps is (10 sec: 52429.0, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 1773305856. Throughput: 0: 47022.2. Samples: 41039600. 
Policy #0 lag: (min: 0.0, avg: 36.3, max: 83.0) [2024-03-21 10:22:53,153][11055] Avg episode reward: [(0, '0.649')] [2024-03-21 10:22:53,165][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054117_1773305856.pth... [2024-03-21 10:22:53,322][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053772_1762000896.pth [2024-03-21 10:22:58,152][11055] Fps is (10 sec: 45875.7, 60 sec: 46421.3, 300 sec: 46319.5). Total num frames: 1773535232. Throughput: 0: 47355.7. Samples: 41337800. Policy #0 lag: (min: 0.0, avg: 36.3, max: 83.0) [2024-03-21 10:22:58,152][11055] Avg episode reward: [(0, '1.172')] [2024-03-21 10:22:58,474][11286] Updated weights for policy 0, policy_version 54126 (0.0016) [2024-03-21 10:23:03,152][11055] Fps is (10 sec: 49152.0, 60 sec: 44236.8, 300 sec: 46319.5). Total num frames: 1773797376. Throughput: 0: 46157.7. Samples: 41594000. Policy #0 lag: (min: 0.0, avg: 36.3, max: 83.0) [2024-03-21 10:23:03,153][11055] Avg episode reward: [(0, '0.639')] [2024-03-21 10:23:04,185][11286] Updated weights for policy 0, policy_version 54136 (0.0011) [2024-03-21 10:23:08,152][11055] Fps is (10 sec: 58981.9, 60 sec: 44783.0, 300 sec: 46874.9). Total num frames: 1774125056. Throughput: 0: 45626.7. Samples: 41713400. Policy #0 lag: (min: 0.0, avg: 36.3, max: 83.0) [2024-03-21 10:23:08,153][11055] Avg episode reward: [(0, '1.082')] [2024-03-21 10:23:11,072][11286] Updated weights for policy 0, policy_version 54146 (0.0015) [2024-03-21 10:23:13,152][11055] Fps is (10 sec: 58982.3, 60 sec: 44236.8, 300 sec: 47541.4). Total num frames: 1774387200. Throughput: 0: 44839.9. Samples: 41983500. Policy #0 lag: (min: 0.0, avg: 38.6, max: 72.0) [2024-03-21 10:23:13,153][11055] Avg episode reward: [(0, '0.965')] [2024-03-21 10:23:18,152][11055] Fps is (10 sec: 36044.7, 60 sec: 40960.0, 300 sec: 47541.4). Total num frames: 1774485504. Throughput: 0: 44126.7. Samples: 42257000. 
Policy #0 lag: (min: 0.0, avg: 38.6, max: 72.0) [2024-03-21 10:23:18,153][11055] Avg episode reward: [(0, '0.806')] [2024-03-21 10:23:18,887][11286] Updated weights for policy 0, policy_version 54156 (0.0011) [2024-03-21 10:23:23,152][11055] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 48096.8). Total num frames: 1774845952. Throughput: 0: 43504.4. Samples: 42364400. Policy #0 lag: (min: 0.0, avg: 38.6, max: 72.0) [2024-03-21 10:23:23,153][11055] Avg episode reward: [(0, '1.448')] [2024-03-21 10:23:23,752][11286] Updated weights for policy 0, policy_version 54166 (0.0015) [2024-03-21 10:23:24,133][11266] Signal inference workers to stop experience collection... (800 times) [2024-03-21 10:23:24,133][11266] Signal inference workers to resume experience collection... (800 times) [2024-03-21 10:23:24,327][11286] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-03-21 10:23:24,328][11286] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-03-21 10:23:28,152][11055] Fps is (10 sec: 52429.1, 60 sec: 42598.4, 300 sec: 47541.4). Total num frames: 1775009792. Throughput: 0: 43909.0. Samples: 42645800. Policy #0 lag: (min: 0.0, avg: 38.6, max: 72.0) [2024-03-21 10:23:28,161][11055] Avg episode reward: [(0, '1.470')] [2024-03-21 10:23:33,152][11055] Fps is (10 sec: 29491.2, 60 sec: 44236.9, 300 sec: 47097.1). Total num frames: 1775140864. Throughput: 0: 45251.2. Samples: 42951500. Policy #0 lag: (min: 0.0, avg: 38.6, max: 72.0) [2024-03-21 10:23:33,153][11055] Avg episode reward: [(0, '0.560')] [2024-03-21 10:23:34,352][11286] Updated weights for policy 0, policy_version 54176 (0.0011) [2024-03-21 10:23:38,152][11055] Fps is (10 sec: 45874.9, 60 sec: 47513.5, 300 sec: 47208.1). Total num frames: 1775468544. Throughput: 0: 45371.1. Samples: 43081300. 
Policy #0 lag: (min: 0.0, avg: 38.6, max: 72.0) [2024-03-21 10:23:38,153][11055] Avg episode reward: [(0, '1.670')] [2024-03-21 10:23:39,793][11286] Updated weights for policy 0, policy_version 54186 (0.0012) [2024-03-21 10:23:43,152][11055] Fps is (10 sec: 72089.3, 60 sec: 51336.6, 300 sec: 47541.4). Total num frames: 1775861760. Throughput: 0: 44766.6. Samples: 43352300. Policy #0 lag: (min: 2.0, avg: 40.1, max: 83.0) [2024-03-21 10:23:43,153][11055] Avg episode reward: [(0, '0.854')] [2024-03-21 10:23:45,222][11286] Updated weights for policy 0, policy_version 54197 (0.0016) [2024-03-21 10:23:48,152][11055] Fps is (10 sec: 55705.9, 60 sec: 49152.0, 300 sec: 47541.4). Total num frames: 1776025600. Throughput: 0: 45802.3. Samples: 43655100. Policy #0 lag: (min: 2.0, avg: 40.1, max: 83.0) [2024-03-21 10:23:48,153][11055] Avg episode reward: [(0, '0.867')] [2024-03-21 10:23:53,152][11055] Fps is (10 sec: 26214.3, 60 sec: 46967.4, 300 sec: 46986.0). Total num frames: 1776123904. Throughput: 0: 46899.9. Samples: 43823900. Policy #0 lag: (min: 2.0, avg: 40.1, max: 83.0) [2024-03-21 10:23:53,153][11055] Avg episode reward: [(0, '0.867')] [2024-03-21 10:23:54,263][11286] Updated weights for policy 0, policy_version 54207 (0.0018) [2024-03-21 10:23:58,152][11055] Fps is (10 sec: 42598.3, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 1776451584. Throughput: 0: 47026.7. Samples: 44099700. Policy #0 lag: (min: 2.0, avg: 40.1, max: 83.0) [2024-03-21 10:23:58,153][11055] Avg episode reward: [(0, '1.150')] [2024-03-21 10:24:01,447][11286] Updated weights for policy 0, policy_version 54217 (0.0015) [2024-03-21 10:24:03,152][11055] Fps is (10 sec: 55705.9, 60 sec: 48059.7, 300 sec: 46874.9). Total num frames: 1776680960. Throughput: 0: 47355.6. Samples: 44388000. 
Policy #0 lag: (min: 2.0, avg: 40.1, max: 83.0) [2024-03-21 10:24:03,153][11055] Avg episode reward: [(0, '1.274')] [2024-03-21 10:24:05,439][11286] Updated weights for policy 0, policy_version 54227 (0.0010) [2024-03-21 10:24:08,152][11055] Fps is (10 sec: 58982.1, 60 sec: 48605.8, 300 sec: 47430.4). Total num frames: 1777041408. Throughput: 0: 48037.7. Samples: 44526100. Policy #0 lag: (min: 2.0, avg: 40.1, max: 83.0) [2024-03-21 10:24:08,153][11055] Avg episode reward: [(0, '1.234')] [2024-03-21 10:24:13,152][11055] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 46652.8). Total num frames: 1777041408. Throughput: 0: 48339.9. Samples: 44821100. Policy #0 lag: (min: 2.0, avg: 40.1, max: 83.0) [2024-03-21 10:24:13,153][11055] Avg episode reward: [(0, '1.480')] [2024-03-21 10:24:17,613][11286] Updated weights for policy 0, policy_version 54237 (0.0021) [2024-03-21 10:24:18,152][11055] Fps is (10 sec: 19660.9, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 1777238016. Throughput: 0: 48100.0. Samples: 45116000. Policy #0 lag: (min: 1.0, avg: 33.0, max: 72.0) [2024-03-21 10:24:18,153][11055] Avg episode reward: [(0, '1.118')] [2024-03-21 10:24:18,512][11266] Signal inference workers to stop experience collection... (850 times) [2024-03-21 10:24:18,586][11286] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-03-21 10:24:18,770][11266] Signal inference workers to resume experience collection... (850 times) [2024-03-21 10:24:18,770][11286] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-03-21 10:24:20,858][11286] Updated weights for policy 0, policy_version 54247 (0.0029) [2024-03-21 10:24:23,152][11055] Fps is (10 sec: 62258.7, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 1777664000. Throughput: 0: 47762.1. Samples: 45230600. 
Policy #0 lag: (min: 1.0, avg: 33.0, max: 72.0) [2024-03-21 10:24:23,153][11055] Avg episode reward: [(0, '1.118')] [2024-03-21 10:24:26,313][11286] Updated weights for policy 0, policy_version 54257 (0.0014) [2024-03-21 10:24:28,152][11055] Fps is (10 sec: 68812.6, 60 sec: 48605.8, 300 sec: 47763.6). Total num frames: 1777926144. Throughput: 0: 48422.2. Samples: 45531300. Policy #0 lag: (min: 1.0, avg: 33.0, max: 72.0) [2024-03-21 10:24:28,153][11055] Avg episode reward: [(0, '1.118')] [2024-03-21 10:24:33,152][11055] Fps is (10 sec: 42599.0, 60 sec: 49152.0, 300 sec: 47652.5). Total num frames: 1778089984. Throughput: 0: 48580.0. Samples: 45841200. Policy #0 lag: (min: 1.0, avg: 33.0, max: 72.0) [2024-03-21 10:24:33,153][11055] Avg episode reward: [(0, '1.584')] [2024-03-21 10:24:34,683][11286] Updated weights for policy 0, policy_version 54267 (0.0009) [2024-03-21 10:24:38,152][11055] Fps is (10 sec: 55705.4, 60 sec: 50244.2, 300 sec: 47541.4). Total num frames: 1778483200. Throughput: 0: 48240.0. Samples: 45994700. Policy #0 lag: (min: 1.0, avg: 33.0, max: 72.0) [2024-03-21 10:24:38,153][11055] Avg episode reward: [(0, '1.407')] [2024-03-21 10:24:39,743][11286] Updated weights for policy 0, policy_version 54277 (0.0012) [2024-03-21 10:24:43,152][11055] Fps is (10 sec: 75365.7, 60 sec: 49698.1, 300 sec: 47763.5). Total num frames: 1778843648. Throughput: 0: 48017.7. Samples: 46260500. Policy #0 lag: (min: 1.0, avg: 33.0, max: 72.0) [2024-03-21 10:24:43,153][11055] Avg episode reward: [(0, '1.407')] [2024-03-21 10:24:43,434][11286] Updated weights for policy 0, policy_version 54287 (0.0012) [2024-03-21 10:24:48,152][11055] Fps is (10 sec: 45875.7, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 1778941952. Throughput: 0: 48937.8. Samples: 46590200. Policy #0 lag: (min: 1.0, avg: 33.0, max: 72.0) [2024-03-21 10:24:48,153][11055] Avg episode reward: [(0, '1.407')] [2024-03-21 10:24:53,152][11055] Fps is (10 sec: 13107.1, 60 sec: 47513.6, 300 sec: 47097.1). 
Total num frames: 1778974720. Throughput: 0: 49711.0. Samples: 46763100. Policy #0 lag: (min: 0.0, avg: 37.0, max: 85.0) [2024-03-21 10:24:53,153][11055] Avg episode reward: [(0, '1.617')] [2024-03-21 10:24:53,165][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054290_1778974720.pth... [2024-03-21 10:24:53,339][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000053946_1767702528.pth [2024-03-21 10:24:55,990][11286] Updated weights for policy 0, policy_version 54297 (0.0016) [2024-03-21 10:24:58,152][11055] Fps is (10 sec: 36044.6, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 1779302400. Throughput: 0: 49891.1. Samples: 47066200. Policy #0 lag: (min: 0.0, avg: 37.0, max: 85.0) [2024-03-21 10:24:58,153][11055] Avg episode reward: [(0, '1.617')] [2024-03-21 10:25:03,152][11055] Fps is (10 sec: 52429.4, 60 sec: 46967.5, 300 sec: 46763.8). Total num frames: 1779499008. Throughput: 0: 50073.3. Samples: 47369300. Policy #0 lag: (min: 0.0, avg: 37.0, max: 85.0) [2024-03-21 10:25:03,153][11055] Avg episode reward: [(0, '1.109')] [2024-03-21 10:25:03,239][11286] Updated weights for policy 0, policy_version 54307 (0.0011) [2024-03-21 10:25:03,854][11266] Signal inference workers to stop experience collection... (900 times) [2024-03-21 10:25:03,965][11286] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-03-21 10:25:04,105][11266] Signal inference workers to resume experience collection... (900 times) [2024-03-21 10:25:04,106][11286] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-03-21 10:25:06,169][11286] Updated weights for policy 0, policy_version 54317 (0.0024) [2024-03-21 10:25:08,152][11055] Fps is (10 sec: 72089.8, 60 sec: 49698.2, 300 sec: 48318.9). Total num frames: 1780023296. Throughput: 0: 50306.8. Samples: 47494400. 
Policy #0 lag: (min: 0.0, avg: 37.0, max: 85.0) [2024-03-21 10:25:08,153][11055] Avg episode reward: [(0, '1.075')] [2024-03-21 10:25:13,152][11055] Fps is (10 sec: 55704.9, 60 sec: 50244.2, 300 sec: 47541.4). Total num frames: 1780056064. Throughput: 0: 50326.5. Samples: 47796000. Policy #0 lag: (min: 0.0, avg: 37.0, max: 85.0) [2024-03-21 10:25:13,153][11055] Avg episode reward: [(0, '0.744')] [2024-03-21 10:25:15,880][11286] Updated weights for policy 0, policy_version 54327 (0.0013) [2024-03-21 10:25:18,152][11055] Fps is (10 sec: 39321.5, 60 sec: 52974.9, 300 sec: 47652.4). Total num frames: 1780416512. Throughput: 0: 49853.3. Samples: 48084600. Policy #0 lag: (min: 0.0, avg: 37.0, max: 85.0) [2024-03-21 10:25:18,153][11055] Avg episode reward: [(0, '1.352')] [2024-03-21 10:25:18,918][11286] Updated weights for policy 0, policy_version 54337 (0.0040) [2024-03-21 10:25:23,152][11055] Fps is (10 sec: 65536.2, 60 sec: 50790.4, 300 sec: 47985.7). Total num frames: 1780711424. Throughput: 0: 49100.0. Samples: 48204200. Policy #0 lag: (min: 0.0, avg: 37.0, max: 85.0) [2024-03-21 10:25:23,153][11055] Avg episode reward: [(0, '0.543')] [2024-03-21 10:25:24,742][11286] Updated weights for policy 0, policy_version 54347 (0.0015) [2024-03-21 10:25:28,152][11055] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 47985.7). Total num frames: 1780908032. Throughput: 0: 50391.2. Samples: 48528100. Policy #0 lag: (min: 0.0, avg: 39.8, max: 80.0) [2024-03-21 10:25:28,153][11055] Avg episode reward: [(0, '0.800')] [2024-03-21 10:25:31,872][11286] Updated weights for policy 0, policy_version 54357 (0.0010) [2024-03-21 10:25:33,152][11055] Fps is (10 sec: 49152.2, 60 sec: 51882.6, 300 sec: 47763.5). Total num frames: 1781202944. Throughput: 0: 49651.0. Samples: 48824500. 
Policy #0 lag: (min: 0.0, avg: 39.8, max: 80.0) [2024-03-21 10:25:33,153][11055] Avg episode reward: [(0, '1.369')] [2024-03-21 10:25:37,978][11286] Updated weights for policy 0, policy_version 54367 (0.0012) [2024-03-21 10:25:38,152][11055] Fps is (10 sec: 58982.4, 60 sec: 50244.3, 300 sec: 48207.9). Total num frames: 1781497856. Throughput: 0: 49353.5. Samples: 48984000. Policy #0 lag: (min: 0.0, avg: 39.8, max: 80.0) [2024-03-21 10:25:38,153][11055] Avg episode reward: [(0, '1.369')] [2024-03-21 10:25:43,152][11055] Fps is (10 sec: 45875.5, 60 sec: 46967.5, 300 sec: 48096.8). Total num frames: 1781661696. Throughput: 0: 48960.0. Samples: 49269400. Policy #0 lag: (min: 0.0, avg: 39.8, max: 80.0) [2024-03-21 10:25:43,153][11055] Avg episode reward: [(0, '1.451')] [2024-03-21 10:25:47,480][11286] Updated weights for policy 0, policy_version 54377 (0.0011) [2024-03-21 10:25:48,152][11055] Fps is (10 sec: 36044.8, 60 sec: 48605.8, 300 sec: 48096.8). Total num frames: 1781858304. Throughput: 0: 48466.7. Samples: 49550300. Policy #0 lag: (min: 0.0, avg: 39.8, max: 80.0) [2024-03-21 10:25:48,153][11055] Avg episode reward: [(0, '0.840')] [2024-03-21 10:25:50,744][11266] Signal inference workers to stop experience collection... (950 times) [2024-03-21 10:25:50,831][11286] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-03-21 10:25:50,993][11266] Signal inference workers to resume experience collection... (950 times) [2024-03-21 10:25:50,993][11286] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-03-21 10:25:53,152][11055] Fps is (10 sec: 45875.1, 60 sec: 52428.9, 300 sec: 48318.9). Total num frames: 1782120448. Throughput: 0: 48602.2. Samples: 49681500. 
Policy #0 lag: (min: 0.0, avg: 39.8, max: 80.0) [2024-03-21 10:25:53,153][11055] Avg episode reward: [(0, '1.114')] [2024-03-21 10:25:53,249][11286] Updated weights for policy 0, policy_version 54387 (0.0022) [2024-03-21 10:25:58,152][11055] Fps is (10 sec: 36044.8, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 1782218752. Throughput: 0: 47435.7. Samples: 49930600. Policy #0 lag: (min: 0.0, avg: 39.8, max: 80.0) [2024-03-21 10:25:58,153][11055] Avg episode reward: [(0, '1.667')] [2024-03-21 10:26:03,152][11055] Fps is (10 sec: 26214.4, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 1782382592. Throughput: 0: 46486.7. Samples: 50176500. Policy #0 lag: (min: 0.0, avg: 27.9, max: 69.0) [2024-03-21 10:26:03,153][11055] Avg episode reward: [(0, '0.761')] [2024-03-21 10:26:06,859][11286] Updated weights for policy 0, policy_version 54397 (0.0011) [2024-03-21 10:26:08,152][11055] Fps is (10 sec: 32768.1, 60 sec: 42052.3, 300 sec: 46874.9). Total num frames: 1782546432. Throughput: 0: 47120.1. Samples: 50324600. Policy #0 lag: (min: 0.0, avg: 27.9, max: 69.0) [2024-03-21 10:26:08,153][11055] Avg episode reward: [(0, '0.709')] [2024-03-21 10:26:12,130][11286] Updated weights for policy 0, policy_version 54407 (0.0016) [2024-03-21 10:26:13,152][11055] Fps is (10 sec: 49152.0, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 1782874112. Throughput: 0: 45968.9. Samples: 50596700. Policy #0 lag: (min: 0.0, avg: 27.9, max: 69.0) [2024-03-21 10:26:13,153][11055] Avg episode reward: [(0, '0.554')] [2024-03-21 10:26:18,152][11055] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 46763.8). Total num frames: 1783037952. Throughput: 0: 45420.1. Samples: 50868400. Policy #0 lag: (min: 0.0, avg: 27.9, max: 69.0) [2024-03-21 10:26:18,162][11055] Avg episode reward: [(0, '1.274')] [2024-03-21 10:26:20,683][11286] Updated weights for policy 0, policy_version 54417 (0.0019) [2024-03-21 10:26:23,152][11055] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 47208.1). 
Total num frames: 1783332864. Throughput: 0: 45002.0. Samples: 51009100. Policy #0 lag: (min: 0.0, avg: 27.9, max: 69.0) [2024-03-21 10:26:23,153][11055] Avg episode reward: [(0, '0.842')] [2024-03-21 10:26:24,775][11286] Updated weights for policy 0, policy_version 54427 (0.0025) [2024-03-21 10:26:28,152][11055] Fps is (10 sec: 55705.8, 60 sec: 44783.0, 300 sec: 47208.2). Total num frames: 1783595008. Throughput: 0: 44008.9. Samples: 51249800. Policy #0 lag: (min: 0.0, avg: 27.9, max: 69.0) [2024-03-21 10:26:28,153][11055] Avg episode reward: [(0, '0.994')] [2024-03-21 10:26:30,051][11286] Updated weights for policy 0, policy_version 54437 (0.0013) [2024-03-21 10:26:33,152][11055] Fps is (10 sec: 55706.1, 60 sec: 44782.9, 300 sec: 47874.6). Total num frames: 1783889920. Throughput: 0: 43693.3. Samples: 51516500. Policy #0 lag: (min: 0.0, avg: 27.9, max: 69.0) [2024-03-21 10:26:33,153][11055] Avg episode reward: [(0, '1.394')] [2024-03-21 10:26:36,506][11286] Updated weights for policy 0, policy_version 54447 (0.0019) [2024-03-21 10:26:38,152][11055] Fps is (10 sec: 62258.5, 60 sec: 45329.0, 300 sec: 48318.9). Total num frames: 1784217600. Throughput: 0: 44131.1. Samples: 51667400. Policy #0 lag: (min: 1.0, avg: 40.3, max: 76.0) [2024-03-21 10:26:38,153][11055] Avg episode reward: [(0, '1.237')] [2024-03-21 10:26:43,152][11055] Fps is (10 sec: 52428.7, 60 sec: 45875.1, 300 sec: 48541.1). Total num frames: 1784414208. Throughput: 0: 44946.6. Samples: 51953200. Policy #0 lag: (min: 1.0, avg: 40.3, max: 76.0) [2024-03-21 10:26:43,153][11055] Avg episode reward: [(0, '1.467')] [2024-03-21 10:26:43,220][11286] Updated weights for policy 0, policy_version 54457 (0.0019) [2024-03-21 10:26:47,077][11266] Signal inference workers to stop experience collection... (1000 times) [2024-03-21 10:26:47,077][11266] Signal inference workers to resume experience collection... 
(1000 times) [2024-03-21 10:26:47,193][11286] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-03-21 10:26:47,194][11286] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-03-21 10:26:47,429][11286] Updated weights for policy 0, policy_version 54467 (0.0010) [2024-03-21 10:26:48,152][11055] Fps is (10 sec: 62259.6, 60 sec: 49698.1, 300 sec: 48763.2). Total num frames: 1784840192. Throughput: 0: 45751.1. Samples: 52235300. Policy #0 lag: (min: 1.0, avg: 40.3, max: 76.0) [2024-03-21 10:26:48,153][11055] Avg episode reward: [(0, '1.467')] [2024-03-21 10:26:53,152][11055] Fps is (10 sec: 55705.9, 60 sec: 47513.6, 300 sec: 48207.8). Total num frames: 1784971264. Throughput: 0: 45593.3. Samples: 52376300. Policy #0 lag: (min: 1.0, avg: 40.3, max: 76.0) [2024-03-21 10:26:53,153][11055] Avg episode reward: [(0, '1.157')] [2024-03-21 10:26:53,163][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054473_1784971264.pth... [2024-03-21 10:26:53,286][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054117_1773305856.pth [2024-03-21 10:26:58,152][11055] Fps is (10 sec: 22937.6, 60 sec: 47513.6, 300 sec: 47208.1). Total num frames: 1785069568. Throughput: 0: 45848.9. Samples: 52659900. Policy #0 lag: (min: 1.0, avg: 40.3, max: 76.0) [2024-03-21 10:26:58,153][11055] Avg episode reward: [(0, '1.499')] [2024-03-21 10:27:02,525][11286] Updated weights for policy 0, policy_version 54477 (0.0019) [2024-03-21 10:27:03,152][11055] Fps is (10 sec: 16383.9, 60 sec: 45875.2, 300 sec: 46430.6). Total num frames: 1785135104. Throughput: 0: 45733.2. Samples: 52926400. Policy #0 lag: (min: 1.0, avg: 40.3, max: 76.0) [2024-03-21 10:27:03,153][11055] Avg episode reward: [(0, '0.952')] [2024-03-21 10:27:08,152][11055] Fps is (10 sec: 16383.9, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1785233408. Throughput: 0: 45277.9. Samples: 53046600. 
Policy #0 lag: (min: 1.0, avg: 40.3, max: 76.0) [2024-03-21 10:27:08,153][11055] Avg episode reward: [(0, '0.793')] [2024-03-21 10:27:12,674][11286] Updated weights for policy 0, policy_version 54487 (0.0015) [2024-03-21 10:27:13,152][11055] Fps is (10 sec: 29491.4, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 1785430016. Throughput: 0: 45226.6. Samples: 53285000. Policy #0 lag: (min: 0.0, avg: 25.2, max: 69.0) [2024-03-21 10:27:13,153][11055] Avg episode reward: [(0, '0.884')] [2024-03-21 10:27:18,152][11055] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1785659392. Throughput: 0: 44768.9. Samples: 53531100. Policy #0 lag: (min: 0.0, avg: 25.2, max: 69.0) [2024-03-21 10:27:18,153][11055] Avg episode reward: [(0, '0.968')] [2024-03-21 10:27:19,426][11286] Updated weights for policy 0, policy_version 54497 (0.0017) [2024-03-21 10:27:23,152][11055] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 45430.9). Total num frames: 1785856000. Throughput: 0: 44175.6. Samples: 53655300. Policy #0 lag: (min: 0.0, avg: 25.2, max: 69.0) [2024-03-21 10:27:23,153][11055] Avg episode reward: [(0, '1.461')] [2024-03-21 10:27:27,683][11286] Updated weights for policy 0, policy_version 54507 (0.0020) [2024-03-21 10:27:28,152][11055] Fps is (10 sec: 45874.8, 60 sec: 42052.2, 300 sec: 46208.4). Total num frames: 1786118144. Throughput: 0: 44475.6. Samples: 53954600. Policy #0 lag: (min: 0.0, avg: 25.2, max: 69.0) [2024-03-21 10:27:28,153][11055] Avg episode reward: [(0, '0.929')] [2024-03-21 10:27:32,479][11286] Updated weights for policy 0, policy_version 54517 (0.0025) [2024-03-21 10:27:33,152][11055] Fps is (10 sec: 62259.1, 60 sec: 43144.5, 300 sec: 46986.0). Total num frames: 1786478592. Throughput: 0: 43702.2. Samples: 54201900. Policy #0 lag: (min: 0.0, avg: 25.2, max: 69.0) [2024-03-21 10:27:33,153][11055] Avg episode reward: [(0, '0.781')] [2024-03-21 10:27:38,152][11055] Fps is (10 sec: 58982.7, 60 sec: 41506.2, 300 sec: 47208.1). 
Total num frames: 1786707968. Throughput: 0: 43693.3. Samples: 54342500. Policy #0 lag: (min: 0.0, avg: 25.2, max: 69.0) [2024-03-21 10:27:38,153][11055] Avg episode reward: [(0, '0.781')] [2024-03-21 10:27:38,265][11286] Updated weights for policy 0, policy_version 54527 (0.0011) [2024-03-21 10:27:41,295][11286] Updated weights for policy 0, policy_version 54537 (0.0014) [2024-03-21 10:27:43,152][11055] Fps is (10 sec: 65536.2, 60 sec: 45329.1, 300 sec: 47652.4). Total num frames: 1787133952. Throughput: 0: 43140.0. Samples: 54601200. Policy #0 lag: (min: 0.0, avg: 25.2, max: 69.0) [2024-03-21 10:27:43,153][11055] Avg episode reward: [(0, '0.827')] [2024-03-21 10:27:47,665][11266] Signal inference workers to stop experience collection... (1050 times) [2024-03-21 10:27:47,778][11286] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-03-21 10:27:47,932][11266] Signal inference workers to resume experience collection... (1050 times) [2024-03-21 10:27:47,933][11286] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-03-21 10:27:48,152][11055] Fps is (10 sec: 65536.2, 60 sec: 42052.3, 300 sec: 47652.4). Total num frames: 1787363328. Throughput: 0: 44166.7. Samples: 54913900. Policy #0 lag: (min: 1.0, avg: 41.3, max: 71.0) [2024-03-21 10:27:48,153][11055] Avg episode reward: [(0, '0.973')] [2024-03-21 10:27:48,256][11286] Updated weights for policy 0, policy_version 54547 (0.0010) [2024-03-21 10:27:53,152][11055] Fps is (10 sec: 36044.6, 60 sec: 42052.2, 300 sec: 47319.2). Total num frames: 1787494400. Throughput: 0: 44973.3. Samples: 55070400. Policy #0 lag: (min: 1.0, avg: 41.3, max: 71.0) [2024-03-21 10:27:53,153][11055] Avg episode reward: [(0, '0.973')] [2024-03-21 10:27:55,962][11286] Updated weights for policy 0, policy_version 54557 (0.0015) [2024-03-21 10:27:58,152][11055] Fps is (10 sec: 55705.2, 60 sec: 47513.5, 300 sec: 47874.6). Total num frames: 1787920384. Throughput: 0: 46055.5. Samples: 55357500. 
Policy #0 lag: (min: 1.0, avg: 41.3, max: 71.0) [2024-03-21 10:27:58,153][11055] Avg episode reward: [(0, '1.182')] [2024-03-21 10:28:00,324][11286] Updated weights for policy 0, policy_version 54567 (0.0012) [2024-03-21 10:28:03,152][11055] Fps is (10 sec: 65536.4, 60 sec: 50244.3, 300 sec: 47541.4). Total num frames: 1788149760. Throughput: 0: 46782.2. Samples: 55636300. Policy #0 lag: (min: 1.0, avg: 41.3, max: 71.0) [2024-03-21 10:28:03,153][11055] Avg episode reward: [(0, '1.182')] [2024-03-21 10:28:08,152][11055] Fps is (10 sec: 42598.6, 60 sec: 51882.7, 300 sec: 47319.2). Total num frames: 1788346368. Throughput: 0: 47511.1. Samples: 55793300. Policy #0 lag: (min: 1.0, avg: 41.3, max: 71.0) [2024-03-21 10:28:08,153][11055] Avg episode reward: [(0, '1.152')] [2024-03-21 10:28:10,701][11286] Updated weights for policy 0, policy_version 54577 (0.0014) [2024-03-21 10:28:13,152][11055] Fps is (10 sec: 42598.2, 60 sec: 52428.8, 300 sec: 47763.5). Total num frames: 1788575744. Throughput: 0: 47702.3. Samples: 56101200. Policy #0 lag: (min: 1.0, avg: 41.3, max: 71.0) [2024-03-21 10:28:13,153][11055] Avg episode reward: [(0, '0.644')] [2024-03-21 10:28:18,152][11055] Fps is (10 sec: 29491.2, 60 sec: 49698.1, 300 sec: 46763.8). Total num frames: 1788641280. Throughput: 0: 49153.4. Samples: 56413800. Policy #0 lag: (min: 1.0, avg: 41.3, max: 71.0) [2024-03-21 10:28:18,153][11055] Avg episode reward: [(0, '0.863')] [2024-03-21 10:28:18,970][11286] Updated weights for policy 0, policy_version 54587 (0.0010) [2024-03-21 10:28:23,152][11055] Fps is (10 sec: 29491.2, 60 sec: 50244.3, 300 sec: 46986.0). Total num frames: 1788870656. Throughput: 0: 49153.3. Samples: 56554400. Policy #0 lag: (min: 1.0, avg: 30.1, max: 66.0) [2024-03-21 10:28:23,153][11055] Avg episode reward: [(0, '0.757')] [2024-03-21 10:28:27,580][11286] Updated weights for policy 0, policy_version 54597 (0.0011) [2024-03-21 10:28:28,152][11055] Fps is (10 sec: 42598.4, 60 sec: 49152.0, 300 sec: 47208.1). 
Total num frames: 1789067264. Throughput: 0: 50175.6. Samples: 56859100. Policy #0 lag: (min: 1.0, avg: 30.1, max: 66.0) [2024-03-21 10:28:28,153][11055] Avg episode reward: [(0, '1.466')] [2024-03-21 10:28:33,152][11055] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 46986.0). Total num frames: 1789329408. Throughput: 0: 50111.1. Samples: 57168900. Policy #0 lag: (min: 1.0, avg: 30.1, max: 66.0) [2024-03-21 10:28:33,153][11055] Avg episode reward: [(0, '1.466')] [2024-03-21 10:28:33,261][11286] Updated weights for policy 0, policy_version 54607 (0.0011) [2024-03-21 10:28:38,152][11055] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 1789493248. Throughput: 0: 50028.9. Samples: 57321700. Policy #0 lag: (min: 1.0, avg: 30.1, max: 66.0) [2024-03-21 10:28:38,153][11055] Avg episode reward: [(0, '1.403')] [2024-03-21 10:28:39,969][11286] Updated weights for policy 0, policy_version 54617 (0.0012) [2024-03-21 10:28:41,210][11266] Signal inference workers to stop experience collection... (1100 times) [2024-03-21 10:28:41,272][11286] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-03-21 10:28:41,474][11266] Signal inference workers to resume experience collection... (1100 times) [2024-03-21 10:28:41,474][11286] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-03-21 10:28:43,152][11055] Fps is (10 sec: 65536.3, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 1789984768. Throughput: 0: 49589.0. Samples: 57589000. Policy #0 lag: (min: 1.0, avg: 30.1, max: 66.0) [2024-03-21 10:28:43,153][11055] Avg episode reward: [(0, '0.834')] [2024-03-21 10:28:43,247][11286] Updated weights for policy 0, policy_version 54627 (0.0014) [2024-03-21 10:28:48,152][11055] Fps is (10 sec: 81920.1, 60 sec: 49152.0, 300 sec: 48096.8). Total num frames: 1790312448. Throughput: 0: 49724.4. Samples: 57873900. 
Policy #0 lag: (min: 1.0, avg: 30.1, max: 66.0) [2024-03-21 10:28:48,153][11055] Avg episode reward: [(0, '0.834')] [2024-03-21 10:28:48,226][11286] Updated weights for policy 0, policy_version 54637 (0.0014) [2024-03-21 10:28:53,152][11055] Fps is (10 sec: 58982.0, 60 sec: 51336.5, 300 sec: 47874.6). Total num frames: 1790574592. Throughput: 0: 49680.0. Samples: 58028900. Policy #0 lag: (min: 1.0, avg: 30.1, max: 66.0) [2024-03-21 10:28:53,153][11055] Avg episode reward: [(0, '0.834')] [2024-03-21 10:28:53,166][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054644_1790574592.pth... [2024-03-21 10:28:53,290][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054290_1778974720.pth [2024-03-21 10:28:54,533][11286] Updated weights for policy 0, policy_version 54647 (0.0010) [2024-03-21 10:28:58,152][11055] Fps is (10 sec: 62259.4, 60 sec: 50244.3, 300 sec: 48318.9). Total num frames: 1790935040. Throughput: 0: 49249.0. Samples: 58317400. Policy #0 lag: (min: 1.0, avg: 37.0, max: 63.0) [2024-03-21 10:28:58,153][11055] Avg episode reward: [(0, '0.674')] [2024-03-21 10:29:00,878][11286] Updated weights for policy 0, policy_version 54657 (0.0016) [2024-03-21 10:29:03,152][11055] Fps is (10 sec: 49152.3, 60 sec: 48605.9, 300 sec: 47541.4). Total num frames: 1791066112. Throughput: 0: 48815.6. Samples: 58610500. Policy #0 lag: (min: 1.0, avg: 37.0, max: 63.0) [2024-03-21 10:29:03,153][11055] Avg episode reward: [(0, '1.375')] [2024-03-21 10:29:07,130][11286] Updated weights for policy 0, policy_version 54667 (0.0016) [2024-03-21 10:29:08,152][11055] Fps is (10 sec: 39321.5, 60 sec: 49698.1, 300 sec: 48430.0). Total num frames: 1791328256. Throughput: 0: 49020.1. Samples: 58760300. Policy #0 lag: (min: 1.0, avg: 37.0, max: 63.0) [2024-03-21 10:29:08,153][11055] Avg episode reward: [(0, '1.280')] [2024-03-21 10:29:13,152][11055] Fps is (10 sec: 36044.7, 60 sec: 47513.6, 300 sec: 48096.8). 
Total num frames: 1791426560. Throughput: 0: 49082.2. Samples: 59067800. Policy #0 lag: (min: 1.0, avg: 37.0, max: 63.0) [2024-03-21 10:29:13,153][11055] Avg episode reward: [(0, '0.450')] [2024-03-21 10:29:18,152][11055] Fps is (10 sec: 9830.3, 60 sec: 46421.3, 300 sec: 46652.7). Total num frames: 1791426560. Throughput: 0: 49166.6. Samples: 59381400. Policy #0 lag: (min: 1.0, avg: 37.0, max: 63.0) [2024-03-21 10:29:18,153][11055] Avg episode reward: [(0, '0.501')] [2024-03-21 10:29:22,812][11286] Updated weights for policy 0, policy_version 54677 (0.0014) [2024-03-21 10:29:23,152][11055] Fps is (10 sec: 26214.1, 60 sec: 46967.4, 300 sec: 46652.7). Total num frames: 1791688704. Throughput: 0: 48931.0. Samples: 59523600. Policy #0 lag: (min: 1.0, avg: 37.0, max: 63.0) [2024-03-21 10:29:23,153][11055] Avg episode reward: [(0, '1.568')] [2024-03-21 10:29:27,224][11286] Updated weights for policy 0, policy_version 54687 (0.0023) [2024-03-21 10:29:28,152][11055] Fps is (10 sec: 55706.1, 60 sec: 48605.9, 300 sec: 47097.0). Total num frames: 1791983616. Throughput: 0: 48942.2. Samples: 59791400. Policy #0 lag: (min: 1.0, avg: 37.0, max: 63.0) [2024-03-21 10:29:28,153][11055] Avg episode reward: [(0, '0.954')] [2024-03-21 10:29:32,796][11286] Updated weights for policy 0, policy_version 54697 (0.0018) [2024-03-21 10:29:33,152][11055] Fps is (10 sec: 65536.9, 60 sec: 50244.3, 300 sec: 46986.0). Total num frames: 1792344064. Throughput: 0: 49164.4. Samples: 60086300. Policy #0 lag: (min: 2.0, avg: 36.1, max: 92.0) [2024-03-21 10:29:33,153][11055] Avg episode reward: [(0, '0.954')] [2024-03-21 10:29:36,096][11266] Signal inference workers to stop experience collection... (1150 times) [2024-03-21 10:29:36,097][11266] Signal inference workers to resume experience collection... 
(1150 times) [2024-03-21 10:29:36,169][11286] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-03-21 10:29:36,169][11286] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-03-21 10:29:37,310][11286] Updated weights for policy 0, policy_version 54707 (0.0015) [2024-03-21 10:29:38,152][11055] Fps is (10 sec: 68812.9, 60 sec: 52975.0, 300 sec: 46874.9). Total num frames: 1792671744. Throughput: 0: 49195.6. Samples: 60242700. Policy #0 lag: (min: 2.0, avg: 36.1, max: 92.0) [2024-03-21 10:29:38,153][11055] Avg episode reward: [(0, '1.834')] [2024-03-21 10:29:43,152][11055] Fps is (10 sec: 52428.8, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 1792868352. Throughput: 0: 48908.8. Samples: 60518300. Policy #0 lag: (min: 2.0, avg: 36.1, max: 92.0) [2024-03-21 10:29:43,153][11055] Avg episode reward: [(0, '0.586')] [2024-03-21 10:29:44,115][11286] Updated weights for policy 0, policy_version 54717 (0.0012) [2024-03-21 10:29:48,152][11055] Fps is (10 sec: 49151.9, 60 sec: 47513.6, 300 sec: 48096.8). Total num frames: 1793163264. Throughput: 0: 48773.3. Samples: 60805300. Policy #0 lag: (min: 2.0, avg: 36.1, max: 92.0) [2024-03-21 10:29:48,153][11055] Avg episode reward: [(0, '0.660')] [2024-03-21 10:29:50,333][11286] Updated weights for policy 0, policy_version 54727 (0.0011) [2024-03-21 10:29:53,152][11055] Fps is (10 sec: 68812.6, 60 sec: 49698.1, 300 sec: 48318.9). Total num frames: 1793556480. Throughput: 0: 48382.2. Samples: 60937500. Policy #0 lag: (min: 2.0, avg: 36.1, max: 92.0) [2024-03-21 10:29:53,153][11055] Avg episode reward: [(0, '1.009')] [2024-03-21 10:29:53,697][11286] Updated weights for policy 0, policy_version 54737 (0.0016) [2024-03-21 10:29:58,152][11055] Fps is (10 sec: 68812.7, 60 sec: 48605.8, 300 sec: 48652.1). Total num frames: 1793851392. Throughput: 0: 47091.1. Samples: 61186900. 
Policy #0 lag: (min: 2.0, avg: 36.1, max: 92.0) [2024-03-21 10:29:58,153][11055] Avg episode reward: [(0, '1.536')] [2024-03-21 10:30:00,376][11286] Updated weights for policy 0, policy_version 54747 (0.0011) [2024-03-21 10:30:03,152][11055] Fps is (10 sec: 39321.6, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 1793949696. Throughput: 0: 46662.3. Samples: 61481200. Policy #0 lag: (min: 2.0, avg: 36.1, max: 92.0) [2024-03-21 10:30:03,153][11055] Avg episode reward: [(0, '1.390')] [2024-03-21 10:30:08,152][11055] Fps is (10 sec: 13107.3, 60 sec: 44236.8, 300 sec: 47208.2). Total num frames: 1793982464. Throughput: 0: 46973.5. Samples: 61637400. Policy #0 lag: (min: 0.0, avg: 40.4, max: 85.0) [2024-03-21 10:30:08,153][11055] Avg episode reward: [(0, '1.102')] [2024-03-21 10:30:13,152][11055] Fps is (10 sec: 19660.9, 60 sec: 45329.1, 300 sec: 46541.7). Total num frames: 1794146304. Throughput: 0: 46882.3. Samples: 61901100. Policy #0 lag: (min: 0.0, avg: 40.4, max: 85.0) [2024-03-21 10:30:13,153][11055] Avg episode reward: [(0, '1.541')] [2024-03-21 10:30:15,619][11286] Updated weights for policy 0, policy_version 54757 (0.0014) [2024-03-21 10:30:18,152][11055] Fps is (10 sec: 45874.7, 60 sec: 50244.3, 300 sec: 46541.7). Total num frames: 1794441216. Throughput: 0: 45026.6. Samples: 62112500. Policy #0 lag: (min: 0.0, avg: 40.4, max: 85.0) [2024-03-21 10:30:18,153][11055] Avg episode reward: [(0, '1.261')] [2024-03-21 10:30:20,913][11286] Updated weights for policy 0, policy_version 54767 (0.0013) [2024-03-21 10:30:23,152][11055] Fps is (10 sec: 52428.7, 60 sec: 49698.2, 300 sec: 46652.7). Total num frames: 1794670592. Throughput: 0: 44473.3. Samples: 62244000. Policy #0 lag: (min: 0.0, avg: 40.4, max: 85.0) [2024-03-21 10:30:23,153][11055] Avg episode reward: [(0, '1.572')] [2024-03-21 10:30:28,152][11055] Fps is (10 sec: 32768.1, 60 sec: 46421.3, 300 sec: 45986.3). Total num frames: 1794768896. Throughput: 0: 44920.0. Samples: 62539700. 
Policy #0 lag: (min: 0.0, avg: 40.4, max: 85.0) [2024-03-21 10:30:28,153][11055] Avg episode reward: [(0, '0.851')] [2024-03-21 10:30:30,283][11286] Updated weights for policy 0, policy_version 54777 (0.0025) [2024-03-21 10:30:32,506][11266] Signal inference workers to stop experience collection... (1200 times) [2024-03-21 10:30:32,555][11286] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-03-21 10:30:32,572][11266] Signal inference workers to resume experience collection... (1200 times) [2024-03-21 10:30:32,596][11286] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-03-21 10:30:33,152][11055] Fps is (10 sec: 36044.7, 60 sec: 44782.9, 300 sec: 45875.2). Total num frames: 1795031040. Throughput: 0: 44033.3. Samples: 62786800. Policy #0 lag: (min: 0.0, avg: 40.4, max: 85.0) [2024-03-21 10:30:33,153][11055] Avg episode reward: [(0, '0.637')] [2024-03-21 10:30:38,152][11055] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 45764.1). Total num frames: 1795162112. Throughput: 0: 44146.7. Samples: 62924100. Policy #0 lag: (min: 0.0, avg: 40.4, max: 85.0) [2024-03-21 10:30:38,153][11055] Avg episode reward: [(0, '1.187')] [2024-03-21 10:30:41,035][11286] Updated weights for policy 0, policy_version 54787 (0.0012) [2024-03-21 10:30:43,152][11055] Fps is (10 sec: 32767.9, 60 sec: 41506.1, 300 sec: 45764.1). Total num frames: 1795358720. Throughput: 0: 44751.1. Samples: 63200700. Policy #0 lag: (min: 0.0, avg: 23.2, max: 62.0) [2024-03-21 10:30:43,153][11055] Avg episode reward: [(0, '0.615')] [2024-03-21 10:30:46,777][11286] Updated weights for policy 0, policy_version 54797 (0.0009) [2024-03-21 10:30:48,152][11055] Fps is (10 sec: 52428.5, 60 sec: 42052.3, 300 sec: 45986.3). Total num frames: 1795686400. Throughput: 0: 43833.3. Samples: 63453700. 
Policy #0 lag: (min: 0.0, avg: 23.2, max: 62.0) [2024-03-21 10:30:48,153][11055] Avg episode reward: [(0, '1.205')] [2024-03-21 10:30:51,035][11286] Updated weights for policy 0, policy_version 54807 (0.0011) [2024-03-21 10:30:53,152][11055] Fps is (10 sec: 75366.8, 60 sec: 42598.4, 300 sec: 47097.1). Total num frames: 1796112384. Throughput: 0: 43328.9. Samples: 63587200. Policy #0 lag: (min: 0.0, avg: 23.2, max: 62.0) [2024-03-21 10:30:53,153][11055] Avg episode reward: [(0, '0.970')] [2024-03-21 10:30:53,216][11266] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054814_1796145152.pth... [2024-03-21 10:30:53,301][11266] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054473_1784971264.pth [2024-03-21 10:30:57,696][11286] Updated weights for policy 0, policy_version 54817 (0.0009) [2024-03-21 10:30:58,152][11055] Fps is (10 sec: 55706.0, 60 sec: 39867.8, 300 sec: 46986.0). Total num frames: 1796243456. Throughput: 0: 42726.7. Samples: 63823800. Policy #0 lag: (min: 0.0, avg: 23.2, max: 62.0) [2024-03-21 10:30:58,153][11055] Avg episode reward: [(0, '0.970')] [2024-03-21 10:31:03,152][11055] Fps is (10 sec: 22937.6, 60 sec: 39867.8, 300 sec: 46763.8). Total num frames: 1796341760. Throughput: 0: 43651.2. Samples: 64076800. Policy #0 lag: (min: 0.0, avg: 23.2, max: 62.0) [2024-03-21 10:31:03,153][11055] Avg episode reward: [(0, '0.970')] [2024-03-21 10:31:07,282][11286] Updated weights for policy 0, policy_version 54827 (0.0011) [2024-03-21 10:31:08,152][11055] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 46541.7). Total num frames: 1796603904. Throughput: 0: 43495.6. Samples: 64201300. Policy #0 lag: (min: 0.0, avg: 23.2, max: 62.0) [2024-03-21 10:31:08,153][11055] Avg episode reward: [(0, '0.658')] [2024-03-21 10:31:13,152][11055] Fps is (10 sec: 45874.7, 60 sec: 44236.7, 300 sec: 46652.7). Total num frames: 1796800512. Throughput: 0: 42228.8. Samples: 64440000. 
Policy #0 lag: (min: 0.0, avg: 23.2, max: 62.0) [2024-03-21 10:31:13,153][11055] Avg episode reward: [(0, '0.658')] [2024-03-21 10:31:14,593][11286] Updated weights for policy 0, policy_version 54837 (0.0010) [2024-03-21 10:31:18,152][11055] Fps is (10 sec: 52429.0, 60 sec: 44783.0, 300 sec: 46763.9). Total num frames: 1797128192. Throughput: 0: 42144.5. Samples: 64683300. Policy #0 lag: (min: 1.0, avg: 40.7, max: 99.0) [2024-03-21 10:31:18,153][11055] Avg episode reward: [(0, '0.755')] [2024-03-21 10:31:19,508][11286] Updated weights for policy 0, policy_version 54847 (0.0029) [2024-03-21 10:31:23,152][11055] Fps is (10 sec: 58983.1, 60 sec: 45329.1, 300 sec: 46763.8). Total num frames: 1797390336. Throughput: 0: 41673.3. Samples: 64799400. Policy #0 lag: (min: 1.0, avg: 40.7, max: 99.0) [2024-03-21 10:31:23,153][11055] Avg episode reward: [(0, '0.755')] [2024-03-21 10:31:28,099][11286] Updated weights for policy 0, policy_version 54857 (0.0011) [2024-03-21 10:31:28,152][11055] Fps is (10 sec: 42598.1, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 1797554176. Throughput: 0: 40422.3. Samples: 65019700. Policy #0 lag: (min: 1.0, avg: 40.7, max: 99.0) [2024-03-21 10:31:28,153][11055] Avg episode reward: [(0, '1.646')] [2024-03-21 10:31:29,507][11266] Signal inference workers to stop experience collection... (1250 times) [2024-03-21 10:31:29,599][11286] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-03-21 10:31:29,830][11266] Signal inference workers to resume experience collection... (1250 times) [2024-03-21 10:31:29,834][11286] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-03-21 10:31:33,152][11055] Fps is (10 sec: 32767.8, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1797718016. Throughput: 0: 39513.3. Samples: 65231800. 
Policy #0 lag: (min: 1.0, avg: 40.7, max: 99.0) [2024-03-21 10:31:33,153][11055] Avg episode reward: [(0, '1.323')] [2024-03-21 10:46:30,409][14687] Saving configuration to /workspace/metta/train_dir/p2.objt_atn.4/config.json... [2024-03-21 10:46:30,425][14687] Rollout worker 0 uses device cpu [2024-03-21 10:46:30,425][14687] Rollout worker 1 uses device cpu [2024-03-21 10:46:30,425][14687] Rollout worker 2 uses device cpu [2024-03-21 10:46:30,425][14687] Rollout worker 3 uses device cpu [2024-03-21 10:46:30,425][14687] Rollout worker 4 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 5 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 6 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 7 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 8 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 9 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 10 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 11 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 12 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 13 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 14 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 15 uses device cpu [2024-03-21 10:46:30,426][14687] Rollout worker 16 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 17 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 18 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 19 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 20 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 21 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 22 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 23 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 24 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 25 uses device cpu [2024-03-21 10:46:30,427][14687] 
Rollout worker 26 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 27 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 28 uses device cpu [2024-03-21 10:46:30,427][14687] Rollout worker 29 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 30 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 31 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 32 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 33 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 34 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 35 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 36 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 37 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 38 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 39 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 40 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 41 uses device cpu [2024-03-21 10:46:30,428][14687] Rollout worker 42 uses device cpu [2024-03-21 10:46:30,429][14687] Rollout worker 43 uses device cpu [2024-03-21 10:46:30,429][14687] Rollout worker 44 uses device cpu [2024-03-21 10:46:30,429][14687] Rollout worker 45 uses device cpu [2024-03-21 10:46:30,429][14687] Rollout worker 46 uses device cpu [2024-03-21 10:46:30,429][14687] Rollout worker 47 uses device cpu [2024-03-21 10:46:30,429][14687] Rollout worker 48 uses device cpu [2024-03-21 10:46:30,429][14687] Rollout worker 49 uses device cpu [2024-03-21 10:46:34,649][14687] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:46:34,650][14687] InferenceWorker_p0-w0: min num requests: 16 [2024-03-21 10:46:34,711][14687] Starting all processes... [2024-03-21 10:46:34,711][14687] Starting process learner_proc0 [2024-03-21 10:46:34,794][14687] Starting all processes... 
[2024-03-21 10:46:34,797][14687] Starting process inference_proc0-0 [2024-03-21 10:46:34,797][14687] Starting process rollout_proc0 [2024-03-21 10:46:34,798][14687] Starting process rollout_proc1 [2024-03-21 10:46:34,798][14687] Starting process rollout_proc2 [2024-03-21 10:46:34,799][14687] Starting process rollout_proc3 [2024-03-21 10:46:34,800][14687] Starting process rollout_proc4 [2024-03-21 10:46:34,801][14687] Starting process rollout_proc5 [2024-03-21 10:46:34,802][14687] Starting process rollout_proc6 [2024-03-21 10:46:34,802][14687] Starting process rollout_proc7 [2024-03-21 10:46:34,802][14687] Starting process rollout_proc8 [2024-03-21 10:46:34,802][14687] Starting process rollout_proc9 [2024-03-21 10:46:34,803][14687] Starting process rollout_proc10 [2024-03-21 10:46:34,803][14687] Starting process rollout_proc11 [2024-03-21 10:46:34,803][14687] Starting process rollout_proc12 [2024-03-21 10:46:34,803][14687] Starting process rollout_proc13 [2024-03-21 10:46:34,803][14687] Starting process rollout_proc14 [2024-03-21 10:46:34,803][14687] Starting process rollout_proc15 [2024-03-21 10:46:34,804][14687] Starting process rollout_proc16 [2024-03-21 10:46:34,804][14687] Starting process rollout_proc17 [2024-03-21 10:46:34,804][14687] Starting process rollout_proc18 [2024-03-21 10:46:34,804][14687] Starting process rollout_proc19 [2024-03-21 10:46:34,804][14687] Starting process rollout_proc20 [2024-03-21 10:46:34,805][14687] Starting process rollout_proc21 [2024-03-21 10:46:34,808][14687] Starting process rollout_proc22 [2024-03-21 10:46:34,808][14687] Starting process rollout_proc23 [2024-03-21 10:46:34,809][14687] Starting process rollout_proc24 [2024-03-21 10:46:34,811][14687] Starting process rollout_proc25 [2024-03-21 10:46:34,812][14687] Starting process rollout_proc26 [2024-03-21 10:46:34,816][14687] Starting process rollout_proc27 [2024-03-21 10:46:34,816][14687] Starting process rollout_proc28 [2024-03-21 10:46:34,818][14687] Starting process 
rollout_proc29 [2024-03-21 10:46:34,818][14687] Starting process rollout_proc30 [2024-03-21 10:46:34,827][14687] Starting process rollout_proc31 [2024-03-21 10:46:34,859][14687] Starting process rollout_proc32 [2024-03-21 10:46:34,878][14687] Starting process rollout_proc33 [2024-03-21 10:46:34,879][14687] Starting process rollout_proc34 [2024-03-21 10:46:34,879][14687] Starting process rollout_proc35 [2024-03-21 10:46:34,880][14687] Starting process rollout_proc36 [2024-03-21 10:46:34,918][14687] Starting process rollout_proc37 [2024-03-21 10:46:34,919][14687] Starting process rollout_proc38 [2024-03-21 10:46:34,920][14687] Starting process rollout_proc39 [2024-03-21 10:46:34,928][14687] Starting process rollout_proc40 [2024-03-21 10:46:34,937][14687] Starting process rollout_proc41 [2024-03-21 10:46:34,986][14687] Starting process rollout_proc42 [2024-03-21 10:46:35,002][14687] Starting process rollout_proc43 [2024-03-21 10:46:35,002][14687] Starting process rollout_proc44 [2024-03-21 10:46:35,005][14687] Starting process rollout_proc45 [2024-03-21 10:46:35,042][14687] Starting process rollout_proc46 [2024-03-21 10:46:35,042][14687] Starting process rollout_proc47 [2024-03-21 10:46:35,047][14687] Starting process rollout_proc48 [2024-03-21 10:46:35,047][14687] Starting process rollout_proc49 [2024-03-21 10:46:38,038][14926] Worker 7 uses CPU cores [7] [2024-03-21 10:46:38,070][15477] Worker 30 uses CPU cores [30] [2024-03-21 10:46:38,227][15186] Worker 17 uses CPU cores [17] [2024-03-21 10:46:38,235][15510] Worker 33 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,257][14918] Worker 0 uses CPU cores [0] [2024-03-21 10:46:38,282][14922] Worker 3 uses CPU cores [3] [2024-03-21 10:46:38,379][15159] Worker 19 uses CPU cores [19] [2024-03-21 10:46:38,391][15328] Worker 23 uses CPU cores [23] [2024-03-21 10:46:38,398][15545] Worker 37 uses CPU cores [0, 1, 2, 3, 
4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,411][14898] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:46:38,411][14898] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-03-21 10:46:38,420][14898] Num visible devices: 1 [2024-03-21 10:46:38,486][14925] Worker 8 uses CPU cores [8] [2024-03-21 10:46:38,494][14920] Worker 1 uses CPU cores [1] [2024-03-21 10:46:38,495][14898] Starting seed is not provided [2024-03-21 10:46:38,495][14898] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:46:38,496][14898] Initializing actor-critic model on device cuda:0 [2024-03-21 10:46:38,496][14898] RunningMeanStd input shape: (20,) [2024-03-21 10:46:38,496][14898] RunningMeanStd input shape: (24, 11, 11) [2024-03-21 10:46:38,497][14898] RunningMeanStd input shape: (1, 11, 11) [2024-03-21 10:46:38,497][14898] RunningMeanStd input shape: (2,) [2024-03-21 10:46:38,497][14898] RunningMeanStd input shape: (1,) [2024-03-21 10:46:38,497][14898] RunningMeanStd input shape: (1,) [2024-03-21 10:46:38,501][15218] Worker 20 uses CPU cores [20] [2024-03-21 10:46:38,562][14943] Worker 13 uses CPU cores [13] [2024-03-21 10:46:38,562][14929] Worker 11 uses CPU cores [11] [2024-03-21 10:46:38,590][14923] Worker 4 uses CPU cores [4] [2024-03-21 10:46:38,597][15349] Worker 27 uses CPU cores [27] [2024-03-21 10:46:38,602][14928] Worker 10 uses CPU cores [10] [2024-03-21 10:46:38,614][15120] Worker 18 uses CPU cores [18] [2024-03-21 10:46:38,637][15590] Worker 38 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,674][14927] Worker 6 uses CPU cores [6] [2024-03-21 10:46:38,678][14965] Worker 14 uses CPU cores [14] [2024-03-21 10:46:38,691][15511] Worker 34 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,706][14937] Worker 12 uses CPU cores [12] [2024-03-21 10:46:38,746][15673] Worker 41 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,760][15736] Worker 43 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,768][15640] Worker 39 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,775][15509] Worker 32 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,799][15543] Worker 35 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,854][15219] Worker 21 uses CPU cores [21] [2024-03-21 10:46:38,859][14949] Worker 15 uses CPU cores [15] [2024-03-21 10:46:38,859][14921] Worker 2 uses CPU cores [2] [2024-03-21 10:46:38,876][15544] Worker 36 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,882][14924] Worker 5 uses CPU cores [5] [2024-03-21 10:46:38,893][16176] Worker 46 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,932][15737] Worker 44 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:38,942][15257] Worker 26 uses CPU cores [26] [2024-03-21 10:46:38,979][15893] Worker 42 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:39,012][14898] Created Actor Critic model with architecture: [2024-03-21 10:46:39,012][14898] PredictingActorCritic( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (global_vars): RunningMeanStdInPlace() (griddly_obs): RunningMeanStdInPlace() (kinship): RunningMeanStdInPlace() (last_action): RunningMeanStdInPlace() (last_reward): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): GriddlyEncoder( (object_embedding): Sequential( (0): Linear(in_features=52, out_features=64, bias=True) (1): ELU(alpha=1.0) (2): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) (3): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) (4): Sequential( (0): Linear(in_features=64, out_features=64, bias=True) (1): ELU(alpha=1.0) ) ) (encoder_head): Sequential( (0): Linear(in_features=7767, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (3): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (4): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) (5): Sequential( (0): Linear(in_features=512, out_features=512, bias=True) (1): ELU(alpha=1.0) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): GriddlyDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) [2024-03-21 10:46:39,015][16112] Worker 49 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:39,021][15405] Worker 28 
uses CPU cores [28] [2024-03-21 10:46:39,021][14919] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:46:39,027][14919] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-03-21 10:46:39,034][15476] Worker 31 uses CPU cores [31] [2024-03-21 10:46:39,046][14919] Num visible devices: 1 [2024-03-21 10:46:39,051][15185] Worker 16 uses CPU cores [16] [2024-03-21 10:46:39,100][14932] Worker 9 uses CPU cores [9] [2024-03-21 10:46:39,104][15252] Worker 29 uses CPU cores [29] [2024-03-21 10:46:39,109][16080] Worker 45 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:39,139][15475] Worker 25 uses CPU cores [25] [2024-03-21 10:46:39,159][16208] Worker 47 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:39,175][15641] Worker 40 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:39,184][15251] Worker 22 uses CPU cores [22] [2024-03-21 10:46:39,185][15474] Worker 24 uses CPU cores [24] [2024-03-21 10:46:39,208][16175] Worker 48 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2024-03-21 10:46:39,215][14898] Using optimizer [2024-03-21 10:46:39,420][14898] Loading state from checkpoint /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054814_1796145152.pth... [2024-03-21 10:46:39,441][14898] Loading model from checkpoint [2024-03-21 10:46:39,442][14898] Loaded experiment state at self.train_step=54814, self.env_steps=1796145152 [2024-03-21 10:46:39,442][14898] Initialized policy 0 weights for model version 54814 [2024-03-21 10:46:39,443][14898] LearnerWorker_p0 finished initialization! 
[2024-03-21 10:46:39,444][14898] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-03-21 10:46:39,485][14919] RunningMeanStd input shape: (20,) [2024-03-21 10:46:39,486][14919] RunningMeanStd input shape: (24, 11, 11) [2024-03-21 10:46:39,486][14919] RunningMeanStd input shape: (1, 11, 11) [2024-03-21 10:46:39,486][14919] RunningMeanStd input shape: (2,) [2024-03-21 10:46:39,486][14919] RunningMeanStd input shape: (1,) [2024-03-21 10:46:39,486][14919] RunningMeanStd input shape: (1,) [2024-03-21 10:46:39,754][14687] Inference worker 0-0 is ready! [2024-03-21 10:46:39,754][14687] All inference workers are ready! Signal rollout workers to start! [2024-03-21 10:46:42,805][14687] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 1796145152. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:46:47,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:46:48,260][15185] Decorrelating experience for 0 frames... [2024-03-21 10:46:48,307][15218] Decorrelating experience for 0 frames... [2024-03-21 10:46:48,374][15252] Decorrelating experience for 0 frames... [2024-03-21 10:46:48,471][15328] Decorrelating experience for 0 frames... [2024-03-21 10:46:48,927][15477] Decorrelating experience for 0 frames... [2024-03-21 10:46:51,092][14926] Decorrelating experience for 0 frames... [2024-03-21 10:46:51,260][15159] Decorrelating experience for 0 frames... [2024-03-21 10:46:52,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:46:53,756][15474] Decorrelating experience for 0 frames... [2024-03-21 10:46:53,824][14927] Decorrelating experience for 0 frames... [2024-03-21 10:46:53,827][14920] Decorrelating experience for 0 frames... 
[2024-03-21 10:46:54,045][15120] Decorrelating experience for 0 frames... [2024-03-21 10:46:54,182][14918] Decorrelating experience for 0 frames... [2024-03-21 10:46:54,647][14687] Heartbeat connected on Batcher_0 [2024-03-21 10:46:54,648][14687] Heartbeat connected on LearnerWorker_p0 [2024-03-21 10:46:54,681][14687] Heartbeat connected on InferenceWorker_p0-w0 [2024-03-21 10:46:54,968][14949] Decorrelating experience for 0 frames... [2024-03-21 10:46:55,283][15186] Decorrelating experience for 0 frames... [2024-03-21 10:46:55,523][15476] Decorrelating experience for 0 frames... [2024-03-21 10:46:55,617][14965] Decorrelating experience for 0 frames... [2024-03-21 10:46:56,573][15475] Decorrelating experience for 0 frames... [2024-03-21 10:46:56,621][15219] Decorrelating experience for 0 frames... [2024-03-21 10:46:56,716][14929] Decorrelating experience for 0 frames... [2024-03-21 10:46:56,878][14928] Decorrelating experience for 0 frames... [2024-03-21 10:46:56,960][14924] Decorrelating experience for 0 frames... [2024-03-21 10:46:57,209][14921] Decorrelating experience for 0 frames... [2024-03-21 10:46:57,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:46:57,910][15185] Decorrelating experience for 256 frames... [2024-03-21 10:46:57,967][14922] Decorrelating experience for 0 frames... [2024-03-21 10:46:58,065][15257] Decorrelating experience for 0 frames... [2024-03-21 10:46:58,271][14923] Decorrelating experience for 0 frames... [2024-03-21 10:46:58,333][15251] Decorrelating experience for 0 frames... [2024-03-21 10:46:58,824][15477] Decorrelating experience for 256 frames... [2024-03-21 10:46:58,899][14925] Decorrelating experience for 0 frames... [2024-03-21 10:46:59,166][14932] Decorrelating experience for 0 frames... [2024-03-21 10:47:01,485][15405] Decorrelating experience for 0 frames... 
[2024-03-21 10:47:02,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:02,915][15218] Decorrelating experience for 256 frames... [2024-03-21 10:47:03,003][14937] Decorrelating experience for 0 frames... [2024-03-21 10:47:03,418][15349] Decorrelating experience for 0 frames... [2024-03-21 10:47:03,682][16080] Decorrelating experience for 0 frames... [2024-03-21 10:47:03,952][15510] Decorrelating experience for 0 frames... [2024-03-21 10:47:04,229][14943] Decorrelating experience for 0 frames... [2024-03-21 10:47:04,355][15186] Decorrelating experience for 256 frames... [2024-03-21 10:47:04,471][16175] Decorrelating experience for 0 frames... [2024-03-21 10:47:05,754][15509] Decorrelating experience for 0 frames... [2024-03-21 10:47:05,816][15640] Decorrelating experience for 0 frames... [2024-03-21 10:47:06,008][15328] Decorrelating experience for 256 frames... [2024-03-21 10:47:06,207][15544] Decorrelating experience for 0 frames... [2024-03-21 10:47:06,478][15737] Decorrelating experience for 0 frames... [2024-03-21 10:47:06,486][14920] Decorrelating experience for 256 frames... [2024-03-21 10:47:06,696][14687] Heartbeat connected on RolloutWorker_w16 [2024-03-21 10:47:06,744][16208] Decorrelating experience for 0 frames... [2024-03-21 10:47:06,977][14918] Decorrelating experience for 256 frames... [2024-03-21 10:47:07,786][16112] Decorrelating experience for 0 frames... [2024-03-21 10:47:07,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 12.0. Samples: 300. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:07,871][14687] Heartbeat connected on RolloutWorker_w30 [2024-03-21 10:47:08,558][15252] Decorrelating experience for 256 frames... [2024-03-21 10:47:08,690][14926] Decorrelating experience for 256 frames... 
[2024-03-21 10:47:08,769][15120] Decorrelating experience for 256 frames... [2024-03-21 10:47:08,956][15673] Decorrelating experience for 0 frames... [2024-03-21 10:47:09,144][15219] Decorrelating experience for 256 frames... [2024-03-21 10:47:09,514][15736] Decorrelating experience for 0 frames... [2024-03-21 10:47:10,030][15159] Decorrelating experience for 256 frames... [2024-03-21 10:47:10,741][15641] Decorrelating experience for 0 frames... [2024-03-21 10:47:10,817][15543] Decorrelating experience for 0 frames... [2024-03-21 10:47:11,471][15893] Decorrelating experience for 0 frames... [2024-03-21 10:47:11,627][14687] Heartbeat connected on RolloutWorker_w20 [2024-03-21 10:47:11,746][15475] Decorrelating experience for 256 frames... [2024-03-21 10:47:12,005][15590] Decorrelating experience for 0 frames... [2024-03-21 10:47:12,299][15251] Decorrelating experience for 256 frames... [2024-03-21 10:47:12,379][15545] Decorrelating experience for 0 frames... [2024-03-21 10:47:12,486][14927] Decorrelating experience for 256 frames... [2024-03-21 10:47:12,492][14921] Decorrelating experience for 256 frames... [2024-03-21 10:47:12,598][14687] Heartbeat connected on RolloutWorker_w17 [2024-03-21 10:47:12,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 310.0. Samples: 9300. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:13,246][15405] Decorrelating experience for 256 frames... [2024-03-21 10:47:13,399][15511] Decorrelating experience for 0 frames... [2024-03-21 10:47:13,776][14922] Decorrelating experience for 256 frames... [2024-03-21 10:47:13,913][16176] Decorrelating experience for 0 frames... [2024-03-21 10:47:14,139][15474] Decorrelating experience for 256 frames... [2024-03-21 10:47:14,381][14924] Decorrelating experience for 256 frames... [2024-03-21 10:47:14,838][15349] Decorrelating experience for 256 frames... 
[2024-03-21 10:47:16,135][14687] Heartbeat connected on RolloutWorker_w1 [2024-03-21 10:47:16,151][14923] Decorrelating experience for 256 frames... [2024-03-21 10:47:17,112][14687] Heartbeat connected on RolloutWorker_w18 [2024-03-21 10:47:17,150][14687] Heartbeat connected on RolloutWorker_w0 [2024-03-21 10:47:17,466][15476] Decorrelating experience for 256 frames... [2024-03-21 10:47:17,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 1197.1. Samples: 41900. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:19,091][14932] Decorrelating experience for 256 frames... [2024-03-21 10:47:19,466][14925] Decorrelating experience for 256 frames... [2024-03-21 10:47:19,698][15257] Decorrelating experience for 256 frames... [2024-03-21 10:47:19,846][14687] Heartbeat connected on RolloutWorker_w19 [2024-03-21 10:47:20,018][14949] Decorrelating experience for 256 frames... [2024-03-21 10:47:20,347][14687] Heartbeat connected on RolloutWorker_w21 [2024-03-21 10:47:20,371][14687] Heartbeat connected on RolloutWorker_w25 [2024-03-21 10:47:20,933][14687] Heartbeat connected on RolloutWorker_w7 [2024-03-21 10:47:21,351][14687] Heartbeat connected on RolloutWorker_w23 [2024-03-21 10:47:21,489][14687] Heartbeat connected on RolloutWorker_w29 [2024-03-21 10:47:22,061][14687] Heartbeat connected on RolloutWorker_w2 [2024-03-21 10:47:22,366][14965] Decorrelating experience for 256 frames... [2024-03-21 10:47:22,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 2797.5. Samples: 111900. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:22,941][14928] Decorrelating experience for 256 frames... [2024-03-21 10:47:22,963][14687] Heartbeat connected on RolloutWorker_w3 [2024-03-21 10:47:23,358][14929] Decorrelating experience for 256 frames... 
[2024-03-21 10:47:23,395][14687] Heartbeat connected on RolloutWorker_w22 [2024-03-21 10:47:25,613][14687] Heartbeat connected on RolloutWorker_w5 [2024-03-21 10:47:26,080][14937] Decorrelating experience for 256 frames... [2024-03-21 10:47:26,745][14687] Heartbeat connected on RolloutWorker_w31 [2024-03-21 10:47:27,269][14943] Decorrelating experience for 256 frames... [2024-03-21 10:47:27,345][16080] Decorrelating experience for 256 frames... [2024-03-21 10:47:27,536][14687] Heartbeat connected on RolloutWorker_w4 [2024-03-21 10:47:27,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 4022.2. Samples: 181000. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:27,934][14687] Heartbeat connected on RolloutWorker_w6 [2024-03-21 10:47:28,293][16175] Decorrelating experience for 256 frames... [2024-03-21 10:47:28,408][14687] Heartbeat connected on RolloutWorker_w26 [2024-03-21 10:47:28,410][15640] Decorrelating experience for 256 frames... [2024-03-21 10:47:28,817][14687] Heartbeat connected on RolloutWorker_w9 [2024-03-21 10:47:28,917][14687] Heartbeat connected on RolloutWorker_w27 [2024-03-21 10:47:28,998][14687] Heartbeat connected on RolloutWorker_w8 [2024-03-21 10:47:29,508][14687] Heartbeat connected on RolloutWorker_w24 [2024-03-21 10:47:30,326][14687] Heartbeat connected on RolloutWorker_w28 [2024-03-21 10:47:30,693][15737] Decorrelating experience for 256 frames... [2024-03-21 10:47:30,730][16208] Decorrelating experience for 256 frames... [2024-03-21 10:47:32,392][15510] Decorrelating experience for 256 frames... [2024-03-21 10:47:32,692][15893] Decorrelating experience for 256 frames... [2024-03-21 10:47:32,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 8004.5. Samples: 360200. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:33,917][14687] Heartbeat connected on RolloutWorker_w10 [2024-03-21 10:47:34,958][15641] Decorrelating experience for 256 frames... [2024-03-21 10:47:35,003][14687] Heartbeat connected on RolloutWorker_w11 [2024-03-21 10:47:35,030][14687] Heartbeat connected on RolloutWorker_w15 [2024-03-21 10:47:35,565][15185] Worker 16, sleep for 48.000 sec to decorrelate experience collection [2024-03-21 10:47:35,901][14687] Heartbeat connected on RolloutWorker_w14 [2024-03-21 10:47:36,835][15544] Decorrelating experience for 256 frames... [2024-03-21 10:47:37,805][14687] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 1796145152. Throughput: 0: 13293.4. Samples: 598200. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-03-21 10:47:38,300][14687] Heartbeat connected on RolloutWorker_w12 [2024-03-21 10:47:39,517][14687] Heartbeat connected on RolloutWorker_w13 [2024-03-21 10:47:39,715][16112] Decorrelating experience for 256 frames... [2024-03-21 10:47:39,998][15218] Worker 20, sleep for 60.000 sec to decorrelate experience collection [2024-03-21 10:47:40,659][15673] Decorrelating experience for 256 frames... [2024-03-21 10:47:40,799][15509] Decorrelating experience for 256 frames... [2024-03-21 10:47:41,286][14687] Heartbeat connected on RolloutWorker_w47 [2024-03-21 10:47:41,319][15477] Worker 30, sleep for 90.000 sec to decorrelate experience collection [2024-03-21 10:47:42,022][14898] Signal inference workers to stop experience collection... [2024-03-21 10:47:42,060][14919] InferenceWorker_p0-w0: stopping experience collection [2024-03-21 10:47:42,201][15736] Decorrelating experience for 256 frames... [2024-03-21 10:47:42,212][15590] Decorrelating experience for 256 frames... [2024-03-21 10:47:42,294][14898] Signal inference workers to resume experience collection... 
[2024-03-21 10:47:42,294][14919] InferenceWorker_p0-w0: resuming experience collection [2024-03-21 10:47:42,805][14687] Fps is (10 sec: 3276.8, 60 sec: 546.1, 300 sec: 546.1). Total num frames: 1796177920. Throughput: 0: 16215.6. Samples: 729700. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:47:42,843][16176] Decorrelating experience for 256 frames... [2024-03-21 10:47:42,992][15545] Decorrelating experience for 256 frames... [2024-03-21 10:47:43,273][15511] Decorrelating experience for 256 frames... [2024-03-21 10:47:43,630][14687] Heartbeat connected on RolloutWorker_w45 [2024-03-21 10:47:43,831][15543] Decorrelating experience for 256 frames... [2024-03-21 10:47:44,272][14687] Heartbeat connected on RolloutWorker_w39 [2024-03-21 10:47:44,295][14687] Heartbeat connected on RolloutWorker_w48 [2024-03-21 10:47:47,719][14687] Heartbeat connected on RolloutWorker_w44 [2024-03-21 10:47:47,805][14687] Fps is (10 sec: 6553.6, 60 sec: 1092.3, 300 sec: 1008.2). Total num frames: 1796210688. Throughput: 0: 21880.1. Samples: 984600. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:47:47,951][14687] Heartbeat connected on RolloutWorker_w33 [2024-03-21 10:47:48,390][15186] Worker 17, sleep for 51.000 sec to decorrelate experience collection [2024-03-21 10:47:48,636][14687] Heartbeat connected on RolloutWorker_w36 [2024-03-21 10:47:49,705][14687] Heartbeat connected on RolloutWorker_w42 [2024-03-21 10:47:50,287][14920] Worker 1, sleep for 3.000 sec to decorrelate experience collection [2024-03-21 10:47:51,124][14687] Heartbeat connected on RolloutWorker_w38 [2024-03-21 10:47:51,278][14687] Heartbeat connected on RolloutWorker_w37 [2024-03-21 10:47:51,279][15252] Worker 29, sleep for 87.000 sec to decorrelate experience collection [2024-03-21 10:47:51,763][14687] Heartbeat connected on RolloutWorker_w40 [2024-03-21 10:47:52,309][15120] Worker 18, sleep for 54.000 sec to decorrelate experience collection [2024-03-21 10:47:52,620][14687] Heartbeat connected on RolloutWorker_w49 [2024-03-21 10:47:52,805][14687] Fps is (10 sec: 13107.3, 60 sec: 2730.7, 300 sec: 2340.6). Total num frames: 1796308992. Throughput: 0: 28544.6. Samples: 1284800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:47:53,288][15251] Worker 22, sleep for 66.000 sec to decorrelate experience collection [2024-03-21 10:47:53,306][14920] Worker 1 awakens! 
[2024-03-21 10:47:54,359][14687] Heartbeat connected on RolloutWorker_w41 [2024-03-21 10:47:54,365][14687] Heartbeat connected on RolloutWorker_w46 [2024-03-21 10:47:55,108][14687] Heartbeat connected on RolloutWorker_w32 [2024-03-21 10:47:55,732][14687] Heartbeat connected on RolloutWorker_w43 [2024-03-21 10:47:56,418][14687] Heartbeat connected on RolloutWorker_w35 [2024-03-21 10:47:56,990][15219] Worker 21, sleep for 63.000 sec to decorrelate experience collection [2024-03-21 10:47:57,017][14687] Heartbeat connected on RolloutWorker_w34 [2024-03-21 10:47:57,379][15476] Worker 31, sleep for 93.000 sec to decorrelate experience collection [2024-03-21 10:47:57,805][14687] Fps is (10 sec: 22937.6, 60 sec: 4915.2, 300 sec: 3932.2). Total num frames: 1796440064. Throughput: 0: 31444.4. Samples: 1424300. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:47:58,026][14919] Updated weights for policy 0, policy_version 54824 (0.0040) [2024-03-21 10:47:59,075][15159] Worker 19, sleep for 57.000 sec to decorrelate experience collection [2024-03-21 10:47:59,374][15474] Worker 24, sleep for 72.000 sec to decorrelate experience collection [2024-03-21 10:47:59,386][15328] Worker 23, sleep for 69.000 sec to decorrelate experience collection [2024-03-21 10:47:59,599][15475] Worker 25, sleep for 75.000 sec to decorrelate experience collection [2024-03-21 10:48:00,658][14921] Worker 2, sleep for 6.000 sec to decorrelate experience collection [2024-03-21 10:48:00,821][14922] Worker 3, sleep for 9.000 sec to decorrelate experience collection [2024-03-21 10:48:01,015][15405] Worker 28, sleep for 84.000 sec to decorrelate experience collection [2024-03-21 10:48:01,403][14926] Worker 7, sleep for 21.000 sec to decorrelate experience collection [2024-03-21 10:48:01,792][14924] Worker 5, sleep for 15.000 sec to decorrelate experience collection [2024-03-21 10:48:01,929][14932] Worker 9, sleep for 27.000 sec to decorrelate experience collection [2024-03-21 10:48:02,677][15257] 
Worker 26, sleep for 78.000 sec to decorrelate experience collection [2024-03-21 10:48:02,805][14687] Fps is (10 sec: 39320.3, 60 sec: 9284.2, 300 sec: 6963.2). Total num frames: 1796702208. Throughput: 0: 38268.7. Samples: 1764000. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:48:02,913][15349] Worker 27, sleep for 81.000 sec to decorrelate experience collection [2024-03-21 10:48:03,382][14925] Worker 8, sleep for 24.000 sec to decorrelate experience collection [2024-03-21 10:48:03,790][14919] Updated weights for policy 0, policy_version 54834 (0.0011) [2024-03-21 10:48:06,502][14949] Worker 15, sleep for 45.000 sec to decorrelate experience collection [2024-03-21 10:48:06,620][14927] Worker 6, sleep for 18.000 sec to decorrelate experience collection [2024-03-21 10:48:06,690][14921] Worker 2 awakens! [2024-03-21 10:48:06,814][14929] Worker 11, sleep for 33.000 sec to decorrelate experience collection [2024-03-21 10:48:07,187][14923] Worker 4, sleep for 12.000 sec to decorrelate experience collection [2024-03-21 10:48:07,805][14687] Fps is (10 sec: 45875.5, 60 sec: 12561.1, 300 sec: 8866.7). Total num frames: 1796898816. Throughput: 0: 44346.8. Samples: 2107500. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:48:07,944][14928] Worker 10, sleep for 30.000 sec to decorrelate experience collection [2024-03-21 10:48:08,581][14965] Worker 14, sleep for 42.000 sec to decorrelate experience collection [2024-03-21 10:48:09,867][14922] Worker 3 awakens! [2024-03-21 10:48:10,483][14943] Worker 13, sleep for 39.000 sec to decorrelate experience collection [2024-03-21 10:48:10,883][14937] Worker 12, sleep for 36.000 sec to decorrelate experience collection [2024-03-21 10:48:12,805][14687] Fps is (10 sec: 36046.2, 60 sec: 15291.8, 300 sec: 10194.5). Total num frames: 1797062656. Throughput: 0: 49842.4. Samples: 2423900. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-03-21 10:48:13,446][16208] Worker 47, sleep for 141.000 sec to decorrelate experience collection [2024-03-21 10:48:13,765][16080] Worker 45, sleep for 135.000 sec to decorrelate experience collection [2024-03-21 10:48:14,448][16175] Worker 48, sleep for 144.000 sec to decorrelate experience collection [2024-03-21 10:48:14,767][14919] Updated weights for policy 0, policy_version 54844 (0.0041) [2024-03-21 10:48:15,267][15640] Worker 39, sleep for 117.000 sec to decorrelate experience collection [2024-03-21 10:48:15,993][15544] Worker 36, sleep for 108.000 sec to decorrelate experience collection [2024-03-21 10:48:16,671][15510] Worker 33, sleep for 99.000 sec to decorrelate experience collection [2024-03-21 10:48:16,715][15737] Worker 44, sleep for 132.000 sec to decorrelate experience collection [2024-03-21 10:48:16,866][14924] Worker 5 awakens! [2024-03-21 10:48:16,921][15545] Worker 37, sleep for 111.000 sec to decorrelate experience collection [2024-03-21 10:48:17,443][15590] Worker 38, sleep for 114.000 sec to decorrelate experience collection [2024-03-21 10:48:17,595][15893] Worker 42, sleep for 126.000 sec to decorrelate experience collection [2024-03-21 10:48:17,805][14687] Fps is (10 sec: 39321.3, 60 sec: 19114.7, 300 sec: 12072.4). Total num frames: 1797292032. Throughput: 0: 49457.8. Samples: 2585800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 30.0) [2024-03-21 10:48:18,200][15641] Worker 40, sleep for 120.000 sec to decorrelate experience collection [2024-03-21 10:48:19,182][16112] Worker 49, sleep for 147.000 sec to decorrelate experience collection [2024-03-21 10:48:19,246][14923] Worker 4 awakens! 
[2024-03-21 10:48:19,519][15509] Worker 32, sleep for 96.000 sec to decorrelate experience collection [2024-03-21 10:48:19,613][14919] Updated weights for policy 0, policy_version 54854 (0.0019) [2024-03-21 10:48:19,676][15673] Worker 41, sleep for 123.000 sec to decorrelate experience collection [2024-03-21 10:48:19,697][16176] Worker 46, sleep for 138.000 sec to decorrelate experience collection [2024-03-21 10:48:19,948][15543] Worker 35, sleep for 105.000 sec to decorrelate experience collection [2024-03-21 10:48:20,425][15511] Worker 34, sleep for 102.000 sec to decorrelate experience collection [2024-03-21 10:48:21,029][15736] Worker 43, sleep for 129.000 sec to decorrelate experience collection [2024-03-21 10:48:22,504][14926] Worker 7 awakens! [2024-03-21 10:48:22,805][14687] Fps is (10 sec: 62258.9, 60 sec: 25668.3, 300 sec: 15401.0). Total num frames: 1797685248. Throughput: 0: 48013.4. Samples: 2758800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 30.0) [2024-03-21 10:48:23,666][15185] Worker 16 awakens! [2024-03-21 10:48:24,706][14927] Worker 6 awakens! [2024-03-21 10:48:26,237][14919] Updated weights for policy 0, policy_version 54864 (0.0006) [2024-03-21 10:48:27,445][14925] Worker 8 awakens! [2024-03-21 10:48:27,805][14687] Fps is (10 sec: 52429.0, 60 sec: 27852.8, 300 sec: 15915.9). Total num frames: 1797816320. Throughput: 0: 46855.6. Samples: 2838200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 30.0) [2024-03-21 10:48:27,814][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054865_1797816320.pth... [2024-03-21 10:48:27,871][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054644_1790574592.pth [2024-03-21 10:48:29,029][14932] Worker 9 awakens! [2024-03-21 10:48:32,805][14687] Fps is (10 sec: 19660.8, 60 sec: 28945.1, 300 sec: 15788.2). Total num frames: 1797881856. Throughput: 0: 45575.6. Samples: 3035500. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 30.0) [2024-03-21 10:48:37,805][14687] Fps is (10 sec: 26214.0, 60 sec: 32221.8, 300 sec: 16811.4). Total num frames: 1798078464. Throughput: 0: 43248.7. Samples: 3231000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 30.0) [2024-03-21 10:48:37,814][14687] Avg episode reward: [(0, '0.939')] [2024-03-21 10:48:37,983][14919] Updated weights for policy 0, policy_version 54874 (0.0007) [2024-03-21 10:48:38,045][14928] Worker 10 awakens! [2024-03-21 10:48:39,395][15186] Worker 17 awakens! [2024-03-21 10:48:39,914][14929] Worker 11 awakens! [2024-03-21 10:48:40,099][15218] Worker 20 awakens! [2024-03-21 10:48:42,805][14687] Fps is (10 sec: 45875.2, 60 sec: 36044.8, 300 sec: 18295.5). Total num frames: 1798340608. Throughput: 0: 42220.0. Samples: 3324200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 30.0) [2024-03-21 10:48:42,812][14687] Avg episode reward: [(0, '0.939')] [2024-03-21 10:48:45,152][14919] Updated weights for policy 0, policy_version 54884 (0.0009) [2024-03-21 10:48:46,363][15120] Worker 18 awakens! [2024-03-21 10:48:46,945][14937] Worker 12 awakens! [2024-03-21 10:48:47,805][14687] Fps is (10 sec: 42598.9, 60 sec: 38229.3, 300 sec: 18874.4). Total num frames: 1798504448. Throughput: 0: 39398.0. Samples: 3536900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 30.0) [2024-03-21 10:48:47,806][14687] Avg episode reward: [(0, '0.499')] [2024-03-21 10:48:49,582][14943] Worker 13 awakens! [2024-03-21 10:48:50,681][14965] Worker 14 awakens! [2024-03-21 10:48:51,602][14949] Worker 15 awakens! [2024-03-21 10:48:52,805][14687] Fps is (10 sec: 36044.5, 60 sec: 39867.7, 300 sec: 19660.8). Total num frames: 1798701056. Throughput: 0: 37704.4. Samples: 3804200. Policy #0 lag: (min: 52.0, avg: 69.8, max: 77.0) [2024-03-21 10:48:52,806][14687] Avg episode reward: [(0, '0.499')] [2024-03-21 10:48:56,174][15159] Worker 19 awakens! 
[2024-03-21 10:48:56,603][14919] Updated weights for policy 0, policy_version 54894 (0.0009) [2024-03-21 10:48:57,805][14687] Fps is (10 sec: 26214.6, 60 sec: 38775.5, 300 sec: 19418.1). Total num frames: 1798766592. Throughput: 0: 36862.2. Samples: 4082700. Policy #0 lag: (min: 52.0, avg: 69.8, max: 77.0) [2024-03-21 10:48:57,805][14687] Avg episode reward: [(0, '0.879')] [2024-03-21 10:48:59,389][15251] Worker 22 awakens! [2024-03-21 10:49:00,097][15219] Worker 21 awakens! [2024-03-21 10:49:01,949][14919] Updated weights for policy 0, policy_version 54904 (0.0031) [2024-03-21 10:49:02,805][14687] Fps is (10 sec: 45875.1, 60 sec: 40960.2, 300 sec: 21533.3). Total num frames: 1799159808. Throughput: 0: 35746.6. Samples: 4194400. Policy #0 lag: (min: 52.0, avg: 69.8, max: 77.0) [2024-03-21 10:49:02,806][14687] Avg episode reward: [(0, '0.650')] [2024-03-21 10:49:07,805][14687] Fps is (10 sec: 55705.4, 60 sec: 40413.8, 300 sec: 21920.7). Total num frames: 1799323648. Throughput: 0: 37606.7. Samples: 4451100. Policy #0 lag: (min: 52.0, avg: 69.8, max: 77.0) [2024-03-21 10:49:07,805][14687] Avg episode reward: [(0, '0.732')] [2024-03-21 10:49:08,486][15328] Worker 23 awakens! [2024-03-21 10:49:08,938][14919] Updated weights for policy 0, policy_version 54914 (0.0011) [2024-03-21 10:49:11,327][15477] Worker 30 awakens! [2024-03-21 10:49:11,477][15474] Worker 24 awakens! [2024-03-21 10:49:11,770][14898] Signal inference workers to stop experience collection... (50 times) [2024-03-21 10:49:11,813][14919] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-03-21 10:49:12,023][14898] Signal inference workers to resume experience collection... (50 times) [2024-03-21 10:49:12,024][14919] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-03-21 10:49:12,805][14687] Fps is (10 sec: 49152.2, 60 sec: 43144.5, 300 sec: 23374.5). Total num frames: 1799651328. Throughput: 0: 41486.6. Samples: 4705100. 
Policy #0 lag: (min: 52.0, avg: 69.8, max: 77.0) [2024-03-21 10:49:12,805][14687] Avg episode reward: [(0, '0.740')] [2024-03-21 10:49:14,114][14919] Updated weights for policy 0, policy_version 54924 (0.0011) [2024-03-21 10:49:14,702][15475] Worker 25 awakens! [2024-03-21 10:49:17,805][14687] Fps is (10 sec: 55705.0, 60 sec: 43144.5, 300 sec: 24100.4). Total num frames: 1799880704. Throughput: 0: 40311.0. Samples: 4849500. Policy #0 lag: (min: 52.0, avg: 69.8, max: 77.0) [2024-03-21 10:49:17,806][14687] Avg episode reward: [(0, '1.289')] [2024-03-21 10:49:18,364][15252] Worker 29 awakens! [2024-03-21 10:49:20,774][15257] Worker 26 awakens! [2024-03-21 10:49:22,805][14687] Fps is (10 sec: 29490.9, 60 sec: 37683.1, 300 sec: 23756.8). Total num frames: 1799946240. Throughput: 0: 43042.3. Samples: 5167900. Policy #0 lag: (min: 52.0, avg: 69.8, max: 77.0) [2024-03-21 10:49:22,806][14687] Avg episode reward: [(0, '1.289')] [2024-03-21 10:49:24,014][15349] Worker 27 awakens! [2024-03-21 10:49:25,114][15405] Worker 28 awakens! [2024-03-21 10:49:25,917][14919] Updated weights for policy 0, policy_version 54934 (0.0012) [2024-03-21 10:49:27,805][14687] Fps is (10 sec: 29491.3, 60 sec: 39321.6, 300 sec: 24427.1). Total num frames: 1800175616. Throughput: 0: 47537.7. Samples: 5463400. Policy #0 lag: (min: 2.0, avg: 21.2, max: 42.0) [2024-03-21 10:49:27,809][14687] Avg episode reward: [(0, '1.019')] [2024-03-21 10:49:30,486][15476] Worker 31 awakens! [2024-03-21 10:49:30,970][14919] Updated weights for policy 0, policy_version 54944 (0.0034) [2024-03-21 10:49:32,805][14687] Fps is (10 sec: 58982.8, 60 sec: 44236.7, 300 sec: 25828.9). Total num frames: 1800536064. Throughput: 0: 45715.5. Samples: 5594100. Policy #0 lag: (min: 2.0, avg: 21.2, max: 42.0) [2024-03-21 10:49:32,806][14687] Avg episode reward: [(0, '0.855')] [2024-03-21 10:49:37,805][14687] Fps is (10 sec: 52428.2, 60 sec: 43690.7, 300 sec: 26027.2). Total num frames: 1800699904. Throughput: 0: 46642.1. 
Samples: 5903100. Policy #0 lag: (min: 2.0, avg: 21.2, max: 42.0) [2024-03-21 10:49:37,806][14687] Avg episode reward: [(0, '0.722')] [2024-03-21 10:49:38,112][14919] Updated weights for policy 0, policy_version 54954 (0.0010) [2024-03-21 10:49:42,805][14687] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 26942.6). Total num frames: 1800994816. Throughput: 0: 46813.3. Samples: 6189300. Policy #0 lag: (min: 2.0, avg: 21.2, max: 42.0) [2024-03-21 10:49:42,805][14687] Avg episode reward: [(0, '0.996')] [2024-03-21 10:49:43,544][14919] Updated weights for policy 0, policy_version 54964 (0.0026) [2024-03-21 10:49:47,805][14687] Fps is (10 sec: 65536.5, 60 sec: 47513.6, 300 sec: 28162.8). Total num frames: 1801355264. Throughput: 0: 47166.7. Samples: 6316900. Policy #0 lag: (min: 2.0, avg: 21.2, max: 42.0) [2024-03-21 10:49:47,806][14687] Avg episode reward: [(0, '0.753')] [2024-03-21 10:49:48,355][14919] Updated weights for policy 0, policy_version 54974 (0.0008) [2024-03-21 10:49:52,805][14687] Fps is (10 sec: 52428.6, 60 sec: 46967.5, 300 sec: 28284.0). Total num frames: 1801519104. Throughput: 0: 49024.4. Samples: 6657200. Policy #0 lag: (min: 2.0, avg: 21.2, max: 42.0) [2024-03-21 10:49:52,806][14687] Avg episode reward: [(0, '0.753')] [2024-03-21 10:49:55,622][15509] Worker 32 awakens! [2024-03-21 10:49:55,774][15510] Worker 33 awakens! [2024-03-21 10:49:57,012][14919] Updated weights for policy 0, policy_version 54984 (0.0020) [2024-03-21 10:49:57,805][14687] Fps is (10 sec: 39322.0, 60 sec: 49698.1, 300 sec: 28735.0). Total num frames: 1801748480. Throughput: 0: 50604.5. Samples: 6982300. Policy #0 lag: (min: 2.0, avg: 21.2, max: 42.0) [2024-03-21 10:49:57,806][14687] Avg episode reward: [(0, '0.730')] [2024-03-21 10:50:02,526][15511] Worker 34 awakens! [2024-03-21 10:50:02,805][14687] Fps is (10 sec: 45874.9, 60 sec: 46967.5, 300 sec: 29163.5). Total num frames: 1801977856. Throughput: 0: 50524.4. Samples: 7123100. 
Policy #0 lag: (min: 0.0, avg: 25.7, max: 53.0) [2024-03-21 10:50:02,806][14687] Avg episode reward: [(0, '1.107')] [2024-03-21 10:50:03,462][14919] Updated weights for policy 0, policy_version 54994 (0.0020) [2024-03-21 10:50:04,094][15544] Worker 36 awakens! [2024-03-21 10:50:05,050][15543] Worker 35 awakens! [2024-03-21 10:50:06,785][14898] Signal inference workers to stop experience collection... (100 times) [2024-03-21 10:50:06,839][14919] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-03-21 10:50:07,096][14898] Signal inference workers to resume experience collection... (100 times) [2024-03-21 10:50:07,096][14919] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-03-21 10:50:07,805][14687] Fps is (10 sec: 55705.4, 60 sec: 49698.1, 300 sec: 30050.7). Total num frames: 1802305536. Throughput: 0: 49184.6. Samples: 7381200. Policy #0 lag: (min: 0.0, avg: 25.7, max: 53.0) [2024-03-21 10:50:07,806][14687] Avg episode reward: [(0, '0.843')] [2024-03-21 10:50:08,022][15545] Worker 37 awakens! [2024-03-21 10:50:08,608][14919] Updated weights for policy 0, policy_version 55004 (0.0057) [2024-03-21 10:50:11,542][15590] Worker 38 awakens! [2024-03-21 10:50:12,350][15640] Worker 39 awakens! [2024-03-21 10:50:12,805][14687] Fps is (10 sec: 52428.9, 60 sec: 47513.6, 300 sec: 30271.4). Total num frames: 1802502144. Throughput: 0: 45742.2. Samples: 7521800. Policy #0 lag: (min: 0.0, avg: 25.7, max: 53.0) [2024-03-21 10:50:12,805][14687] Avg episode reward: [(0, '0.950')] [2024-03-21 10:50:15,939][14919] Updated weights for policy 0, policy_version 55014 (0.0010) [2024-03-21 10:50:17,805][14687] Fps is (10 sec: 49151.8, 60 sec: 48605.9, 300 sec: 30939.1). Total num frames: 1802797056. Throughput: 0: 50131.1. Samples: 7850000. Policy #0 lag: (min: 0.0, avg: 25.7, max: 53.0) [2024-03-21 10:50:17,806][14687] Avg episode reward: [(0, '1.364')] [2024-03-21 10:50:18,302][15641] Worker 40 awakens! 
[2024-03-21 10:50:20,537][14919] Updated weights for policy 0, policy_version 55024 (0.0027) [2024-03-21 10:50:22,778][15673] Worker 41 awakens! [2024-03-21 10:50:22,805][14687] Fps is (10 sec: 65536.2, 60 sec: 53521.1, 300 sec: 31874.3). Total num frames: 1803157504. Throughput: 0: 49571.2. Samples: 8133800. Policy #0 lag: (min: 0.0, avg: 25.7, max: 53.0) [2024-03-21 10:50:22,806][14687] Avg episode reward: [(0, '1.078')] [2024-03-21 10:50:23,694][15893] Worker 42 awakens! [2024-03-21 10:50:25,683][14919] Updated weights for policy 0, policy_version 55034 (0.0019) [2024-03-21 10:50:27,805][14687] Fps is (10 sec: 62258.6, 60 sec: 54067.1, 300 sec: 32331.1). Total num frames: 1803419648. Throughput: 0: 49726.5. Samples: 8427000. Policy #0 lag: (min: 0.0, avg: 25.7, max: 53.0) [2024-03-21 10:50:27,806][14687] Avg episode reward: [(0, '0.938')] [2024-03-21 10:50:27,817][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055036_1803419648.pth... [2024-03-21 10:50:27,972][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054814_1796145152.pth [2024-03-21 10:50:28,818][15737] Worker 44 awakens! [2024-03-21 10:50:28,865][16080] Worker 45 awakens! [2024-03-21 10:50:30,115][15736] Worker 43 awakens! [2024-03-21 10:50:32,805][14687] Fps is (10 sec: 45875.5, 60 sec: 51336.6, 300 sec: 32483.1). Total num frames: 1803616256. Throughput: 0: 50091.2. Samples: 8571000. Policy #0 lag: (min: 0.0, avg: 25.7, max: 53.0) [2024-03-21 10:50:32,805][14687] Avg episode reward: [(0, '0.980')] [2024-03-21 10:50:34,546][16208] Worker 47 awakens! [2024-03-21 10:50:35,648][14919] Updated weights for policy 0, policy_version 55044 (0.0036) [2024-03-21 10:50:37,765][16176] Worker 46 awakens! [2024-03-21 10:50:37,805][14687] Fps is (10 sec: 36045.1, 60 sec: 51336.6, 300 sec: 32489.1). Total num frames: 1803780096. Throughput: 0: 49373.3. Samples: 8879000. 
Policy #0 lag: (min: 2.0, avg: 31.3, max: 63.0) [2024-03-21 10:50:37,806][14687] Avg episode reward: [(0, '0.718')] [2024-03-21 10:50:38,546][16175] Worker 48 awakens! [2024-03-21 10:50:42,805][14687] Fps is (10 sec: 32767.4, 60 sec: 49151.8, 300 sec: 32494.9). Total num frames: 1803943936. Throughput: 0: 45942.0. Samples: 9049700. Policy #0 lag: (min: 2.0, avg: 31.3, max: 63.0) [2024-03-21 10:50:42,806][14687] Avg episode reward: [(0, '0.718')] [2024-03-21 10:50:43,705][14919] Updated weights for policy 0, policy_version 55054 (0.0023) [2024-03-21 10:50:46,278][16112] Worker 49 awakens! [2024-03-21 10:50:47,805][14687] Fps is (10 sec: 42598.4, 60 sec: 47513.6, 300 sec: 32901.8). Total num frames: 1804206080. Throughput: 0: 49824.5. Samples: 9365200. Policy #0 lag: (min: 2.0, avg: 31.3, max: 63.0) [2024-03-21 10:50:47,806][14687] Avg episode reward: [(0, '1.500')] [2024-03-21 10:50:49,752][14919] Updated weights for policy 0, policy_version 55064 (0.0017) [2024-03-21 10:50:52,805][14687] Fps is (10 sec: 49152.9, 60 sec: 48605.9, 300 sec: 33161.2). Total num frames: 1804435456. Throughput: 0: 50664.4. Samples: 9661100. Policy #0 lag: (min: 2.0, avg: 31.3, max: 63.0) [2024-03-21 10:50:52,806][14687] Avg episode reward: [(0, '1.073')] [2024-03-21 10:50:57,805][14687] Fps is (10 sec: 39321.6, 60 sec: 47513.5, 300 sec: 33153.5). Total num frames: 1804599296. Throughput: 0: 51042.2. Samples: 9818700. Policy #0 lag: (min: 2.0, avg: 31.3, max: 63.0) [2024-03-21 10:50:57,806][14687] Avg episode reward: [(0, '1.040')] [2024-03-21 10:50:58,874][14919] Updated weights for policy 0, policy_version 55074 (0.0010) [2024-03-21 10:51:02,569][14919] Updated weights for policy 0, policy_version 55084 (0.0036) [2024-03-21 10:51:02,805][14687] Fps is (10 sec: 55704.8, 60 sec: 50244.2, 300 sec: 34028.3). Total num frames: 1804992512. Throughput: 0: 50399.9. Samples: 10118000. 
Policy #0 lag: (min: 2.0, avg: 31.3, max: 63.0) [2024-03-21 10:51:02,806][14687] Avg episode reward: [(0, '1.195')] [2024-03-21 10:51:03,507][14898] Signal inference workers to stop experience collection... (150 times) [2024-03-21 10:51:03,566][14919] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-03-21 10:51:03,574][14898] Signal inference workers to resume experience collection... (150 times) [2024-03-21 10:51:03,614][14919] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-03-21 10:51:07,805][14687] Fps is (10 sec: 68812.7, 60 sec: 49698.1, 300 sec: 34499.2). Total num frames: 1805287424. Throughput: 0: 50620.0. Samples: 10411700. Policy #0 lag: (min: 2.0, avg: 31.3, max: 63.0) [2024-03-21 10:51:07,814][14687] Avg episode reward: [(0, '1.256')] [2024-03-21 10:51:08,846][14919] Updated weights for policy 0, policy_version 55094 (0.0022) [2024-03-21 10:51:12,805][14687] Fps is (10 sec: 49152.6, 60 sec: 49698.2, 300 sec: 34588.5). Total num frames: 1805484032. Throughput: 0: 47326.8. Samples: 10556700. Policy #0 lag: (min: 2.0, avg: 40.1, max: 72.0) [2024-03-21 10:51:12,814][14687] Avg episode reward: [(0, '1.189')] [2024-03-21 10:51:14,255][14919] Updated weights for policy 0, policy_version 55104 (0.0034) [2024-03-21 10:51:17,805][14687] Fps is (10 sec: 55705.9, 60 sec: 50790.4, 300 sec: 35270.3). Total num frames: 1805844480. Throughput: 0: 50637.8. Samples: 10849700. Policy #0 lag: (min: 2.0, avg: 40.1, max: 72.0) [2024-03-21 10:51:17,806][14687] Avg episode reward: [(0, '1.506')] [2024-03-21 10:51:18,864][14919] Updated weights for policy 0, policy_version 55114 (0.0015) [2024-03-21 10:51:22,805][14687] Fps is (10 sec: 62258.8, 60 sec: 49152.0, 300 sec: 35576.7). Total num frames: 1806106624. Throughput: 0: 50913.3. Samples: 11170100. 
Policy #0 lag: (min: 2.0, avg: 40.1, max: 72.0) [2024-03-21 10:51:22,814][14687] Avg episode reward: [(0, '1.506')] [2024-03-21 10:51:25,689][14919] Updated weights for policy 0, policy_version 55124 (0.0010) [2024-03-21 10:51:27,805][14687] Fps is (10 sec: 52428.8, 60 sec: 49152.1, 300 sec: 35872.4). Total num frames: 1806368768. Throughput: 0: 54120.2. Samples: 11485100. Policy #0 lag: (min: 2.0, avg: 40.1, max: 72.0) [2024-03-21 10:51:27,806][14687] Avg episode reward: [(0, '1.392')] [2024-03-21 10:51:32,805][14687] Fps is (10 sec: 32768.2, 60 sec: 46967.4, 300 sec: 35479.9). Total num frames: 1806434304. Throughput: 0: 50775.6. Samples: 11650100. Policy #0 lag: (min: 2.0, avg: 40.1, max: 72.0) [2024-03-21 10:51:32,806][14687] Avg episode reward: [(0, '1.685')] [2024-03-21 10:51:34,891][14919] Updated weights for policy 0, policy_version 55134 (0.0015) [2024-03-21 10:51:37,805][14687] Fps is (10 sec: 42598.5, 60 sec: 50244.3, 300 sec: 36100.4). Total num frames: 1806794752. Throughput: 0: 51217.8. Samples: 11965900. Policy #0 lag: (min: 2.0, avg: 40.1, max: 72.0) [2024-03-21 10:51:37,806][14687] Avg episode reward: [(0, '0.806')] [2024-03-21 10:51:42,805][14687] Fps is (10 sec: 45875.5, 60 sec: 49152.2, 300 sec: 36433.6). Total num frames: 1806893056. Throughput: 0: 51206.7. Samples: 12123000. Policy #0 lag: (min: 2.0, avg: 40.1, max: 72.0) [2024-03-21 10:51:42,806][14687] Avg episode reward: [(0, '0.718')] [2024-03-21 10:51:44,356][14919] Updated weights for policy 0, policy_version 55144 (0.0015) [2024-03-21 10:51:47,805][14687] Fps is (10 sec: 32767.6, 60 sec: 48605.8, 300 sec: 37211.1). Total num frames: 1807122432. Throughput: 0: 51011.1. Samples: 12413500. Policy #0 lag: (min: 1.0, avg: 32.8, max: 77.0) [2024-03-21 10:51:47,806][14687] Avg episode reward: [(0, '1.136')] [2024-03-21 10:51:52,343][14919] Updated weights for policy 0, policy_version 55154 (0.0010) [2024-03-21 10:51:52,805][14687] Fps is (10 sec: 42597.5, 60 sec: 48059.6, 300 sec: 37877.6). 
Total num frames: 1807319040. Throughput: 0: 51315.4. Samples: 12720900. Policy #0 lag: (min: 1.0, avg: 32.8, max: 77.0) [2024-03-21 10:51:52,806][14687] Avg episode reward: [(0, '0.516')] [2024-03-21 10:51:55,121][14898] Signal inference workers to stop experience collection... (200 times) [2024-03-21 10:51:55,205][14919] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-03-21 10:51:55,405][14898] Signal inference workers to resume experience collection... (200 times) [2024-03-21 10:51:55,405][14919] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-03-21 10:51:56,027][14919] Updated weights for policy 0, policy_version 55164 (0.0018) [2024-03-21 10:51:57,805][14687] Fps is (10 sec: 58982.2, 60 sec: 51882.6, 300 sec: 39210.5). Total num frames: 1807712256. Throughput: 0: 50982.1. Samples: 12850900. Policy #0 lag: (min: 1.0, avg: 32.8, max: 77.0) [2024-03-21 10:51:57,806][14687] Avg episode reward: [(0, '1.014')] [2024-03-21 10:52:01,132][14919] Updated weights for policy 0, policy_version 55174 (0.0025) [2024-03-21 10:52:02,805][14687] Fps is (10 sec: 78644.8, 60 sec: 51882.8, 300 sec: 40543.5). Total num frames: 1808105472. Throughput: 0: 50344.5. Samples: 13115200. Policy #0 lag: (min: 1.0, avg: 32.8, max: 77.0) [2024-03-21 10:52:02,805][14687] Avg episode reward: [(0, '1.175')] [2024-03-21 10:52:04,686][14919] Updated weights for policy 0, policy_version 55184 (0.0031) [2024-03-21 10:52:07,805][14687] Fps is (10 sec: 65536.3, 60 sec: 51336.5, 300 sec: 41432.1). Total num frames: 1808367616. Throughput: 0: 48975.5. Samples: 13374000. Policy #0 lag: (min: 1.0, avg: 32.8, max: 77.0) [2024-03-21 10:52:07,806][14687] Avg episode reward: [(0, '1.185')] [2024-03-21 10:52:12,805][14687] Fps is (10 sec: 45874.8, 60 sec: 51336.5, 300 sec: 42098.5). Total num frames: 1808564224. Throughput: 0: 45253.3. Samples: 13521500. 
Policy #0 lag: (min: 1.0, avg: 32.8, max: 77.0) [2024-03-21 10:52:12,806][14687] Avg episode reward: [(0, '1.222')] [2024-03-21 10:52:13,107][14919] Updated weights for policy 0, policy_version 55194 (0.0010) [2024-03-21 10:52:17,805][14687] Fps is (10 sec: 36045.1, 60 sec: 48059.7, 300 sec: 42653.9). Total num frames: 1808728064. Throughput: 0: 48788.9. Samples: 13845600. Policy #0 lag: (min: 1.0, avg: 32.8, max: 77.0) [2024-03-21 10:52:17,806][14687] Avg episode reward: [(0, '1.646')] [2024-03-21 10:52:22,805][14687] Fps is (10 sec: 26214.5, 60 sec: 45329.1, 300 sec: 42987.2). Total num frames: 1808826368. Throughput: 0: 48282.2. Samples: 14138600. Policy #0 lag: (min: 0.0, avg: 38.4, max: 97.0) [2024-03-21 10:52:22,806][14687] Avg episode reward: [(0, '1.425')] [2024-03-21 10:52:25,288][14919] Updated weights for policy 0, policy_version 55204 (0.0015) [2024-03-21 10:52:27,805][14687] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 1809088512. Throughput: 0: 48231.1. Samples: 14293400. Policy #0 lag: (min: 0.0, avg: 38.4, max: 97.0) [2024-03-21 10:52:27,806][14687] Avg episode reward: [(0, '0.982')] [2024-03-21 10:52:27,821][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055209_1809088512.pth... [2024-03-21 10:52:27,933][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000054865_1797816320.pth [2024-03-21 10:52:32,756][14919] Updated weights for policy 0, policy_version 55214 (0.0015) [2024-03-21 10:52:32,805][14687] Fps is (10 sec: 42598.2, 60 sec: 46967.4, 300 sec: 44431.2). Total num frames: 1809252352. Throughput: 0: 48495.6. Samples: 14595800. Policy #0 lag: (min: 0.0, avg: 38.4, max: 97.0) [2024-03-21 10:52:32,806][14687] Avg episode reward: [(0, '0.397')] [2024-03-21 10:52:36,363][14919] Updated weights for policy 0, policy_version 55224 (0.0037) [2024-03-21 10:52:37,805][14687] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 45430.9). 
Total num frames: 1809580032. Throughput: 0: 47086.8. Samples: 14839800. Policy #0 lag: (min: 0.0, avg: 38.4, max: 97.0) [2024-03-21 10:52:37,806][14687] Avg episode reward: [(0, '1.387')] [2024-03-21 10:52:42,805][14687] Fps is (10 sec: 62259.1, 60 sec: 49698.0, 300 sec: 46319.5). Total num frames: 1809874944. Throughput: 0: 47366.7. Samples: 14982400. Policy #0 lag: (min: 0.0, avg: 38.4, max: 97.0) [2024-03-21 10:52:42,806][14687] Avg episode reward: [(0, '1.076')] [2024-03-21 10:52:42,988][14919] Updated weights for policy 0, policy_version 55234 (0.0010) [2024-03-21 10:52:47,805][14687] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 46763.8). Total num frames: 1810104320. Throughput: 0: 47740.0. Samples: 15263500. Policy #0 lag: (min: 0.0, avg: 38.4, max: 97.0) [2024-03-21 10:52:47,806][14687] Avg episode reward: [(0, '1.302')] [2024-03-21 10:52:49,659][14898] Signal inference workers to stop experience collection... (250 times) [2024-03-21 10:52:49,708][14919] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-03-21 10:52:49,913][14898] Signal inference workers to resume experience collection... (250 times) [2024-03-21 10:52:49,913][14919] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-03-21 10:52:49,919][14919] Updated weights for policy 0, policy_version 55244 (0.0023) [2024-03-21 10:52:52,805][14687] Fps is (10 sec: 49152.4, 60 sec: 50790.6, 300 sec: 47208.1). Total num frames: 1810366464. Throughput: 0: 48511.2. Samples: 15557000. Policy #0 lag: (min: 0.0, avg: 38.4, max: 97.0) [2024-03-21 10:52:52,806][14687] Avg episode reward: [(0, '0.449')] [2024-03-21 10:52:54,796][14919] Updated weights for policy 0, policy_version 55254 (0.0017) [2024-03-21 10:52:57,805][14687] Fps is (10 sec: 45875.3, 60 sec: 47513.7, 300 sec: 46986.0). Total num frames: 1810563072. Throughput: 0: 51875.7. Samples: 15855900. 
Policy #0 lag: (min: 1.0, avg: 40.4, max: 82.0) [2024-03-21 10:52:57,806][14687] Avg episode reward: [(0, '1.455')] [2024-03-21 10:53:01,313][14919] Updated weights for policy 0, policy_version 55264 (0.0018) [2024-03-21 10:53:02,805][14687] Fps is (10 sec: 55705.3, 60 sec: 46967.4, 300 sec: 47541.4). Total num frames: 1810923520. Throughput: 0: 47246.6. Samples: 15971700. Policy #0 lag: (min: 1.0, avg: 40.4, max: 82.0) [2024-03-21 10:53:02,806][14687] Avg episode reward: [(0, '1.069')] [2024-03-21 10:53:07,805][14687] Fps is (10 sec: 45875.0, 60 sec: 44236.9, 300 sec: 47319.2). Total num frames: 1811021824. Throughput: 0: 47262.2. Samples: 16265400. Policy #0 lag: (min: 1.0, avg: 40.4, max: 82.0) [2024-03-21 10:53:07,806][14687] Avg episode reward: [(0, '0.852')] [2024-03-21 10:53:10,999][14919] Updated weights for policy 0, policy_version 55274 (0.0015) [2024-03-21 10:53:12,805][14687] Fps is (10 sec: 29491.3, 60 sec: 44236.8, 300 sec: 47208.1). Total num frames: 1811218432. Throughput: 0: 49980.0. Samples: 16542500. Policy #0 lag: (min: 1.0, avg: 40.4, max: 82.0) [2024-03-21 10:53:12,806][14687] Avg episode reward: [(0, '0.557')] [2024-03-21 10:53:17,805][14687] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 1811480576. Throughput: 0: 46813.4. Samples: 16702400. Policy #0 lag: (min: 1.0, avg: 40.4, max: 82.0) [2024-03-21 10:53:17,805][14687] Avg episode reward: [(0, '0.690')] [2024-03-21 10:53:18,206][14919] Updated weights for policy 0, policy_version 55284 (0.0015) [2024-03-21 10:53:22,805][14687] Fps is (10 sec: 52428.7, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 1811742720. Throughput: 0: 47248.9. Samples: 16966000. Policy #0 lag: (min: 1.0, avg: 40.4, max: 82.0) [2024-03-21 10:53:22,806][14687] Avg episode reward: [(0, '0.915')] [2024-03-21 10:53:25,089][14919] Updated weights for policy 0, policy_version 55294 (0.0009) [2024-03-21 10:53:27,805][14687] Fps is (10 sec: 45874.7, 60 sec: 47513.5, 300 sec: 47652.4). 
Total num frames: 1811939328. Throughput: 0: 47515.6. Samples: 17120600. Policy #0 lag: (min: 1.0, avg: 40.4, max: 82.0) [2024-03-21 10:53:27,806][14687] Avg episode reward: [(0, '1.265')] [2024-03-21 10:53:32,805][14687] Fps is (10 sec: 26214.3, 60 sec: 45875.2, 300 sec: 47208.2). Total num frames: 1812004864. Throughput: 0: 48035.5. Samples: 17425100. Policy #0 lag: (min: 0.0, avg: 38.1, max: 77.0) [2024-03-21 10:53:32,806][14687] Avg episode reward: [(0, '1.370')] [2024-03-21 10:53:35,716][14919] Updated weights for policy 0, policy_version 55304 (0.0016) [2024-03-21 10:53:37,805][14687] Fps is (10 sec: 32768.2, 60 sec: 44782.9, 300 sec: 47208.1). Total num frames: 1812267008. Throughput: 0: 46735.5. Samples: 17660100. Policy #0 lag: (min: 0.0, avg: 38.1, max: 77.0) [2024-03-21 10:53:37,806][14687] Avg episode reward: [(0, '0.637')] [2024-03-21 10:53:41,500][14919] Updated weights for policy 0, policy_version 55314 (0.0026) [2024-03-21 10:53:41,531][14898] Signal inference workers to stop experience collection... (300 times) [2024-03-21 10:53:41,600][14919] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-03-21 10:53:41,811][14898] Signal inference workers to resume experience collection... (300 times) [2024-03-21 10:53:41,811][14919] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-03-21 10:53:42,805][14687] Fps is (10 sec: 65536.2, 60 sec: 46421.4, 300 sec: 47985.7). Total num frames: 1812660224. Throughput: 0: 43353.3. Samples: 17806800. Policy #0 lag: (min: 0.0, avg: 38.1, max: 77.0) [2024-03-21 10:53:42,806][14687] Avg episode reward: [(0, '1.142')] [2024-03-21 10:53:44,645][14919] Updated weights for policy 0, policy_version 55324 (0.0014) [2024-03-21 10:53:47,805][14687] Fps is (10 sec: 81919.3, 60 sec: 49698.0, 300 sec: 48763.2). Total num frames: 1813086208. Throughput: 0: 46059.9. Samples: 18044400. 
Policy #0 lag: (min: 0.0, avg: 38.1, max: 77.0) [2024-03-21 10:53:47,806][14687] Avg episode reward: [(0, '1.245')] [2024-03-21 10:53:48,610][14919] Updated weights for policy 0, policy_version 55334 (0.0024) [2024-03-21 10:53:52,805][14687] Fps is (10 sec: 68812.4, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 1813348352. Throughput: 0: 45564.4. Samples: 18315800. Policy #0 lag: (min: 0.0, avg: 38.1, max: 77.0) [2024-03-21 10:53:52,806][14687] Avg episode reward: [(0, '1.245')] [2024-03-21 10:53:57,805][14687] Fps is (10 sec: 29491.3, 60 sec: 46967.4, 300 sec: 48207.8). Total num frames: 1813381120. Throughput: 0: 43088.8. Samples: 18481500. Policy #0 lag: (min: 0.0, avg: 38.1, max: 77.0) [2024-03-21 10:53:57,806][14687] Avg episode reward: [(0, '1.694')] [2024-03-21 10:54:00,653][14919] Updated weights for policy 0, policy_version 55344 (0.0015) [2024-03-21 10:54:02,805][14687] Fps is (10 sec: 29491.0, 60 sec: 45329.0, 300 sec: 48541.0). Total num frames: 1813643264. Throughput: 0: 46602.1. Samples: 18799500. Policy #0 lag: (min: 0.0, avg: 38.1, max: 77.0) [2024-03-21 10:54:02,806][14687] Avg episode reward: [(0, '1.225')] [2024-03-21 10:54:05,871][14919] Updated weights for policy 0, policy_version 55354 (0.0015) [2024-03-21 10:54:07,805][14687] Fps is (10 sec: 55706.1, 60 sec: 48605.9, 300 sec: 48430.0). Total num frames: 1813938176. Throughput: 0: 46962.2. Samples: 19079300. Policy #0 lag: (min: 1.0, avg: 32.2, max: 67.0) [2024-03-21 10:54:07,806][14687] Avg episode reward: [(0, '1.726')] [2024-03-21 10:54:11,487][14919] Updated weights for policy 0, policy_version 55364 (0.0019) [2024-03-21 10:54:12,805][14687] Fps is (10 sec: 55706.4, 60 sec: 49698.2, 300 sec: 48541.1). Total num frames: 1814200320. Throughput: 0: 50160.1. Samples: 19377800. 
Policy #0 lag: (min: 1.0, avg: 32.2, max: 67.0) [2024-03-21 10:54:12,806][14687] Avg episode reward: [(0, '1.464')] [2024-03-21 10:54:17,442][14919] Updated weights for policy 0, policy_version 55374 (0.0023) [2024-03-21 10:54:17,805][14687] Fps is (10 sec: 55705.3, 60 sec: 50244.2, 300 sec: 49318.6). Total num frames: 1814495232. Throughput: 0: 46795.5. Samples: 19530900. Policy #0 lag: (min: 1.0, avg: 32.2, max: 67.0) [2024-03-21 10:54:17,806][14687] Avg episode reward: [(0, '1.610')] [2024-03-21 10:54:22,805][14687] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49318.6). Total num frames: 1814724608. Throughput: 0: 47711.1. Samples: 19807100. Policy #0 lag: (min: 1.0, avg: 32.2, max: 67.0) [2024-03-21 10:54:22,806][14687] Avg episode reward: [(0, '0.727')] [2024-03-21 10:54:24,376][14898] Signal inference workers to stop experience collection... (350 times) [2024-03-21 10:54:24,394][14919] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-03-21 10:54:24,663][14898] Signal inference workers to resume experience collection... (350 times) [2024-03-21 10:54:24,663][14919] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-03-21 10:54:24,970][14919] Updated weights for policy 0, policy_version 55384 (0.0009) [2024-03-21 10:54:27,805][14687] Fps is (10 sec: 52429.4, 60 sec: 51336.6, 300 sec: 49096.5). Total num frames: 1815019520. Throughput: 0: 48002.3. Samples: 19966900. Policy #0 lag: (min: 1.0, avg: 32.2, max: 67.0) [2024-03-21 10:54:27,805][14687] Avg episode reward: [(0, '0.727')] [2024-03-21 10:54:27,819][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055390_1815019520.pth... 
[2024-03-21 10:54:27,932][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055036_1803419648.pth [2024-03-21 10:54:31,533][14919] Updated weights for policy 0, policy_version 55394 (0.0028) [2024-03-21 10:54:32,805][14687] Fps is (10 sec: 49152.4, 60 sec: 53521.2, 300 sec: 49207.6). Total num frames: 1815216128. Throughput: 0: 49569.1. Samples: 20275000. Policy #0 lag: (min: 1.0, avg: 32.2, max: 67.0) [2024-03-21 10:54:32,805][14687] Avg episode reward: [(0, '0.727')] [2024-03-21 10:54:37,805][14687] Fps is (10 sec: 36044.4, 60 sec: 51882.6, 300 sec: 48763.2). Total num frames: 1815379968. Throughput: 0: 50353.3. Samples: 20581700. Policy #0 lag: (min: 1.0, avg: 32.2, max: 67.0) [2024-03-21 10:54:37,806][14687] Avg episode reward: [(0, '0.727')] [2024-03-21 10:54:41,484][14919] Updated weights for policy 0, policy_version 55404 (0.0010) [2024-03-21 10:54:42,805][14687] Fps is (10 sec: 29491.1, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 1815511040. Throughput: 0: 50524.6. Samples: 20755100. Policy #0 lag: (min: 0.0, avg: 36.5, max: 83.0) [2024-03-21 10:54:42,806][14687] Avg episode reward: [(0, '0.727')] [2024-03-21 10:54:46,890][14919] Updated weights for policy 0, policy_version 55414 (0.0027) [2024-03-21 10:54:47,805][14687] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 48652.1). Total num frames: 1815871488. Throughput: 0: 49820.0. Samples: 21041400. Policy #0 lag: (min: 0.0, avg: 36.5, max: 83.0) [2024-03-21 10:54:47,806][14687] Avg episode reward: [(0, '0.861')] [2024-03-21 10:54:50,774][14919] Updated weights for policy 0, policy_version 55424 (0.0020) [2024-03-21 10:54:52,805][14687] Fps is (10 sec: 75366.0, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 1816264704. Throughput: 0: 49604.4. Samples: 21311500. 
Policy #0 lag: (min: 0.0, avg: 36.5, max: 83.0) [2024-03-21 10:54:52,806][14687] Avg episode reward: [(0, '0.861')] [2024-03-21 10:54:56,856][14919] Updated weights for policy 0, policy_version 55434 (0.0008) [2024-03-21 10:54:57,805][14687] Fps is (10 sec: 58983.2, 60 sec: 51336.6, 300 sec: 49096.5). Total num frames: 1816461312. Throughput: 0: 46208.9. Samples: 21457200. Policy #0 lag: (min: 0.0, avg: 36.5, max: 83.0) [2024-03-21 10:54:57,806][14687] Avg episode reward: [(0, '0.861')] [2024-03-21 10:55:01,961][14919] Updated weights for policy 0, policy_version 55444 (0.0013) [2024-03-21 10:55:02,805][14687] Fps is (10 sec: 58982.5, 60 sec: 53521.2, 300 sec: 49318.6). Total num frames: 1816854528. Throughput: 0: 49386.7. Samples: 21753300. Policy #0 lag: (min: 0.0, avg: 36.5, max: 83.0) [2024-03-21 10:55:02,806][14687] Avg episode reward: [(0, '0.491')] [2024-03-21 10:55:07,805][14687] Fps is (10 sec: 55705.2, 60 sec: 51336.5, 300 sec: 49207.5). Total num frames: 1817018368. Throughput: 0: 49586.6. Samples: 22038500. Policy #0 lag: (min: 0.0, avg: 36.5, max: 83.0) [2024-03-21 10:55:07,806][14687] Avg episode reward: [(0, '1.326')] [2024-03-21 10:55:12,805][14687] Fps is (10 sec: 22937.7, 60 sec: 48059.7, 300 sec: 48430.0). Total num frames: 1817083904. Throughput: 0: 49806.6. Samples: 22208200. Policy #0 lag: (min: 0.0, avg: 36.5, max: 83.0) [2024-03-21 10:55:12,805][14687] Avg episode reward: [(0, '0.730')] [2024-03-21 10:55:12,940][14919] Updated weights for policy 0, policy_version 55454 (0.0010) [2024-03-21 10:55:15,119][14898] Signal inference workers to stop experience collection... (400 times) [2024-03-21 10:55:15,169][14919] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-03-21 10:55:15,388][14898] Signal inference workers to resume experience collection... 
(400 times) [2024-03-21 10:55:15,388][14919] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-03-21 10:55:17,805][14687] Fps is (10 sec: 29491.2, 60 sec: 46967.5, 300 sec: 47985.7). Total num frames: 1817313280. Throughput: 0: 49726.5. Samples: 22512700. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 10:55:17,806][14687] Avg episode reward: [(0, '1.192')] [2024-03-21 10:55:20,999][14919] Updated weights for policy 0, policy_version 55464 (0.0028) [2024-03-21 10:55:22,805][14687] Fps is (10 sec: 52429.2, 60 sec: 48059.9, 300 sec: 48096.8). Total num frames: 1817608192. Throughput: 0: 49091.3. Samples: 22790800. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 10:55:22,805][14687] Avg episode reward: [(0, '1.470')] [2024-03-21 10:55:24,118][14919] Updated weights for policy 0, policy_version 55474 (0.0010) [2024-03-21 10:55:27,805][14687] Fps is (10 sec: 58982.4, 60 sec: 48059.6, 300 sec: 48430.0). Total num frames: 1817903104. Throughput: 0: 51544.4. Samples: 23074600. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 10:55:27,806][14687] Avg episode reward: [(0, '1.470')] [2024-03-21 10:55:31,106][14919] Updated weights for policy 0, policy_version 55484 (0.0019) [2024-03-21 10:55:32,805][14687] Fps is (10 sec: 55704.6, 60 sec: 49151.9, 300 sec: 48763.2). Total num frames: 1818165248. Throughput: 0: 48246.7. Samples: 23212500. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 10:55:32,806][14687] Avg episode reward: [(0, '0.785')] [2024-03-21 10:55:37,805][14687] Fps is (10 sec: 45874.8, 60 sec: 49698.1, 300 sec: 48874.3). Total num frames: 1818361856. Throughput: 0: 48726.5. Samples: 23504200. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 10:55:37,806][14687] Avg episode reward: [(0, '1.138')] [2024-03-21 10:55:39,049][14919] Updated weights for policy 0, policy_version 55494 (0.0018) [2024-03-21 10:55:42,805][14687] Fps is (10 sec: 32768.1, 60 sec: 49698.1, 300 sec: 48430.0). 
Total num frames: 1818492928. Throughput: 0: 49039.9. Samples: 23664000. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 10:55:42,806][14687] Avg episode reward: [(0, '1.018')] [2024-03-21 10:55:45,686][14919] Updated weights for policy 0, policy_version 55504 (0.0023) [2024-03-21 10:55:47,810][14687] Fps is (10 sec: 52405.3, 60 sec: 50240.5, 300 sec: 48984.6). Total num frames: 1818886144. Throughput: 0: 48983.9. Samples: 23957800. Policy #0 lag: (min: 0.0, avg: 45.3, max: 92.0) [2024-03-21 10:55:47,810][14687] Avg episode reward: [(0, '0.708')] [2024-03-21 10:55:50,279][14919] Updated weights for policy 0, policy_version 55514 (0.0014) [2024-03-21 10:55:52,805][14687] Fps is (10 sec: 72089.5, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 1819213824. Throughput: 0: 48700.0. Samples: 24230000. Policy #0 lag: (min: 1.0, avg: 49.4, max: 101.0) [2024-03-21 10:55:52,806][14687] Avg episode reward: [(0, '1.604')] [2024-03-21 10:55:57,805][14687] Fps is (10 sec: 49174.9, 60 sec: 48605.9, 300 sec: 48763.3). Total num frames: 1819377664. Throughput: 0: 48244.4. Samples: 24379200. Policy #0 lag: (min: 1.0, avg: 49.4, max: 101.0) [2024-03-21 10:55:57,805][14687] Avg episode reward: [(0, '1.472')] [2024-03-21 10:56:01,084][14919] Updated weights for policy 0, policy_version 55524 (0.0010) [2024-03-21 10:56:02,805][14687] Fps is (10 sec: 26214.6, 60 sec: 43690.7, 300 sec: 48096.8). Total num frames: 1819475968. Throughput: 0: 47595.7. Samples: 24654500. Policy #0 lag: (min: 1.0, avg: 49.4, max: 101.0) [2024-03-21 10:56:02,806][14687] Avg episode reward: [(0, '0.555')] [2024-03-21 10:56:06,807][14898] Signal inference workers to stop experience collection... (450 times) [2024-03-21 10:56:06,852][14919] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-03-21 10:56:06,876][14898] Signal inference workers to resume experience collection... 
(450 times) [2024-03-21 10:56:06,897][14919] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-03-21 10:56:07,805][14687] Fps is (10 sec: 26214.3, 60 sec: 43690.7, 300 sec: 47985.7). Total num frames: 1819639808. Throughput: 0: 47315.4. Samples: 24920000. Policy #0 lag: (min: 1.0, avg: 49.4, max: 101.0) [2024-03-21 10:56:07,806][14687] Avg episode reward: [(0, '1.472')] [2024-03-21 10:56:09,845][14919] Updated weights for policy 0, policy_version 55534 (0.0012) [2024-03-21 10:56:12,805][14687] Fps is (10 sec: 32767.6, 60 sec: 45329.0, 300 sec: 47319.2). Total num frames: 1819803648. Throughput: 0: 43951.1. Samples: 25052400. Policy #0 lag: (min: 1.0, avg: 49.4, max: 101.0) [2024-03-21 10:56:12,806][14687] Avg episode reward: [(0, '1.038')] [2024-03-21 10:56:17,472][14919] Updated weights for policy 0, policy_version 55544 (0.0023) [2024-03-21 10:56:17,805][14687] Fps is (10 sec: 45875.6, 60 sec: 46421.4, 300 sec: 47430.3). Total num frames: 1820098560. Throughput: 0: 47042.4. Samples: 25329400. Policy #0 lag: (min: 1.0, avg: 49.4, max: 101.0) [2024-03-21 10:56:17,805][14687] Avg episode reward: [(0, '1.533')] [2024-03-21 10:56:21,845][14919] Updated weights for policy 0, policy_version 55554 (0.0011) [2024-03-21 10:56:22,805][14687] Fps is (10 sec: 62259.4, 60 sec: 46967.3, 300 sec: 47652.4). Total num frames: 1820426240. Throughput: 0: 46655.7. Samples: 25603700. Policy #0 lag: (min: 1.0, avg: 49.4, max: 101.0) [2024-03-21 10:56:22,806][14687] Avg episode reward: [(0, '1.495')] [2024-03-21 10:56:27,136][14919] Updated weights for policy 0, policy_version 55564 (0.0011) [2024-03-21 10:56:27,805][14687] Fps is (10 sec: 65535.3, 60 sec: 47513.6, 300 sec: 48541.1). Total num frames: 1820753920. Throughput: 0: 46284.4. Samples: 25746800. 
Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-21 10:56:27,806][14687] Avg episode reward: [(0, '0.774')] [2024-03-21 10:56:28,382][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055567_1820819456.pth... [2024-03-21 10:56:28,531][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055209_1809088512.pth [2024-03-21 10:56:32,805][14687] Fps is (10 sec: 58982.6, 60 sec: 47513.6, 300 sec: 48207.8). Total num frames: 1821016064. Throughput: 0: 45986.9. Samples: 26027000. Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-21 10:56:32,806][14687] Avg episode reward: [(0, '1.353')] [2024-03-21 10:56:33,268][14919] Updated weights for policy 0, policy_version 55574 (0.0015) [2024-03-21 10:56:37,805][14687] Fps is (10 sec: 42598.6, 60 sec: 46967.6, 300 sec: 48430.0). Total num frames: 1821179904. Throughput: 0: 46833.4. Samples: 26337500. Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-21 10:56:37,806][14687] Avg episode reward: [(0, '0.909')] [2024-03-21 10:56:42,139][14919] Updated weights for policy 0, policy_version 55584 (0.0020) [2024-03-21 10:56:42,805][14687] Fps is (10 sec: 39321.6, 60 sec: 48605.9, 300 sec: 48430.0). Total num frames: 1821409280. Throughput: 0: 49144.4. Samples: 26590700. Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-21 10:56:42,806][14687] Avg episode reward: [(0, '0.942')] [2024-03-21 10:56:47,805][14687] Fps is (10 sec: 29491.3, 60 sec: 43147.9, 300 sec: 47985.7). Total num frames: 1821474816. Throughput: 0: 46211.1. Samples: 26734000. Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-21 10:56:47,806][14687] Avg episode reward: [(0, '1.368')] [2024-03-21 10:56:51,863][14919] Updated weights for policy 0, policy_version 55594 (0.0016) [2024-03-21 10:56:52,805][14687] Fps is (10 sec: 32768.0, 60 sec: 42052.3, 300 sec: 47541.4). Total num frames: 1821736960. Throughput: 0: 45973.3. Samples: 26988800. 
Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-21 10:56:52,806][14687] Avg episode reward: [(0, '1.020')] [2024-03-21 10:56:57,805][14687] Fps is (10 sec: 32767.6, 60 sec: 40413.8, 300 sec: 46430.6). Total num frames: 1821802496. Throughput: 0: 49951.1. Samples: 27300200. Policy #0 lag: (min: 0.0, avg: 43.0, max: 82.0) [2024-03-21 10:56:57,806][14687] Avg episode reward: [(0, '1.568')] [2024-03-21 10:57:00,671][14919] Updated weights for policy 0, policy_version 55604 (0.0016) [2024-03-21 10:57:02,194][14898] Signal inference workers to stop experience collection... (500 times) [2024-03-21 10:57:02,273][14919] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-03-21 10:57:02,475][14898] Signal inference workers to resume experience collection... (500 times) [2024-03-21 10:57:02,475][14919] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-03-21 10:57:02,805][14687] Fps is (10 sec: 49152.6, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 1822228480. Throughput: 0: 46613.4. Samples: 27427000. Policy #0 lag: (min: 3.0, avg: 40.6, max: 98.0) [2024-03-21 10:57:02,805][14687] Avg episode reward: [(0, '1.393')] [2024-03-21 10:57:03,765][14919] Updated weights for policy 0, policy_version 55614 (0.0007) [2024-03-21 10:57:07,805][14687] Fps is (10 sec: 75366.5, 60 sec: 48605.8, 300 sec: 47430.3). Total num frames: 1822556160. Throughput: 0: 46282.2. Samples: 27686400. Policy #0 lag: (min: 3.0, avg: 40.6, max: 98.0) [2024-03-21 10:57:07,807][14687] Avg episode reward: [(0, '1.393')] [2024-03-21 10:57:10,092][14919] Updated weights for policy 0, policy_version 55624 (0.0010) [2024-03-21 10:57:12,805][14687] Fps is (10 sec: 65535.0, 60 sec: 51336.6, 300 sec: 47985.7). Total num frames: 1822883840. Throughput: 0: 46091.1. Samples: 27820900. 
Policy #0 lag: (min: 3.0, avg: 40.6, max: 98.0) [2024-03-21 10:57:12,806][14687] Avg episode reward: [(0, '1.114')] [2024-03-21 10:57:15,139][14919] Updated weights for policy 0, policy_version 55634 (0.0015) [2024-03-21 10:57:17,805][14687] Fps is (10 sec: 55705.5, 60 sec: 50244.1, 300 sec: 48430.0). Total num frames: 1823113216. Throughput: 0: 45977.7. Samples: 28096000. Policy #0 lag: (min: 3.0, avg: 40.6, max: 98.0) [2024-03-21 10:57:17,806][14687] Avg episode reward: [(0, '0.934')] [2024-03-21 10:57:22,805][14687] Fps is (10 sec: 39321.3, 60 sec: 47513.5, 300 sec: 48096.7). Total num frames: 1823277056. Throughput: 0: 45975.4. Samples: 28406400. Policy #0 lag: (min: 3.0, avg: 40.6, max: 98.0) [2024-03-21 10:57:22,806][14687] Avg episode reward: [(0, '1.546')] [2024-03-21 10:57:23,414][14919] Updated weights for policy 0, policy_version 55644 (0.0016) [2024-03-21 10:57:27,805][14687] Fps is (10 sec: 39321.6, 60 sec: 45875.1, 300 sec: 48318.9). Total num frames: 1823506432. Throughput: 0: 46591.0. Samples: 28687300. Policy #0 lag: (min: 3.0, avg: 40.6, max: 98.0) [2024-03-21 10:57:27,806][14687] Avg episode reward: [(0, '1.385')] [2024-03-21 10:57:29,972][14919] Updated weights for policy 0, policy_version 55654 (0.0011) [2024-03-21 10:57:32,805][14687] Fps is (10 sec: 39322.1, 60 sec: 44236.8, 300 sec: 47763.5). Total num frames: 1823670272. Throughput: 0: 46804.4. Samples: 28840200. Policy #0 lag: (min: 3.0, avg: 40.6, max: 98.0) [2024-03-21 10:57:32,806][14687] Avg episode reward: [(0, '1.211')] [2024-03-21 10:57:37,805][14687] Fps is (10 sec: 32768.5, 60 sec: 44236.8, 300 sec: 47319.2). Total num frames: 1823834112. Throughput: 0: 47637.8. Samples: 29132500. Policy #0 lag: (min: 0.0, avg: 49.7, max: 102.0) [2024-03-21 10:57:37,806][14687] Avg episode reward: [(0, '0.629')] [2024-03-21 10:57:41,417][14919] Updated weights for policy 0, policy_version 55664 (0.0018) [2024-03-21 10:57:42,805][14687] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 47430.3). 
Total num frames: 1824096256. Throughput: 0: 44060.1. Samples: 29282900. Policy #0 lag: (min: 0.0, avg: 49.7, max: 102.0) [2024-03-21 10:57:42,806][14687] Avg episode reward: [(0, '1.405')] [2024-03-21 10:57:47,805][14687] Fps is (10 sec: 45874.5, 60 sec: 46967.4, 300 sec: 47208.1). Total num frames: 1824292864. Throughput: 0: 47606.4. Samples: 29569300. Policy #0 lag: (min: 0.0, avg: 49.7, max: 102.0) [2024-03-21 10:57:47,807][14687] Avg episode reward: [(0, '1.405')] [2024-03-21 10:57:47,924][14919] Updated weights for policy 0, policy_version 55674 (0.0014) [2024-03-21 10:57:52,805][14687] Fps is (10 sec: 52429.0, 60 sec: 48059.8, 300 sec: 47652.4). Total num frames: 1824620544. Throughput: 0: 48393.5. Samples: 29864100. Policy #0 lag: (min: 0.0, avg: 49.7, max: 102.0) [2024-03-21 10:57:52,805][14687] Avg episode reward: [(0, '1.472')] [2024-03-21 10:57:52,841][14919] Updated weights for policy 0, policy_version 55684 (0.0019) [2024-03-21 10:57:53,207][14898] Signal inference workers to stop experience collection... (550 times) [2024-03-21 10:57:53,207][14898] Signal inference workers to resume experience collection... (550 times) [2024-03-21 10:57:53,377][14919] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-03-21 10:57:53,377][14919] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-03-21 10:57:57,805][14687] Fps is (10 sec: 52430.0, 60 sec: 50244.4, 300 sec: 47097.1). Total num frames: 1824817152. Throughput: 0: 51484.6. Samples: 30137700. Policy #0 lag: (min: 0.0, avg: 49.7, max: 102.0) [2024-03-21 10:57:57,805][14687] Avg episode reward: [(0, '1.252')] [2024-03-21 10:57:59,777][14919] Updated weights for policy 0, policy_version 55694 (0.0014) [2024-03-21 10:58:02,805][14687] Fps is (10 sec: 45875.0, 60 sec: 47513.5, 300 sec: 47652.4). Total num frames: 1825079296. Throughput: 0: 48286.8. Samples: 30268900. 
Policy #0 lag: (min: 0.0, avg: 49.7, max: 102.0) [2024-03-21 10:58:02,806][14687] Avg episode reward: [(0, '1.343')] [2024-03-21 10:58:07,168][14919] Updated weights for policy 0, policy_version 55704 (0.0012) [2024-03-21 10:58:07,805][14687] Fps is (10 sec: 49151.5, 60 sec: 45875.3, 300 sec: 47763.5). Total num frames: 1825308672. Throughput: 0: 47929.0. Samples: 30563200. Policy #0 lag: (min: 0.0, avg: 49.7, max: 102.0) [2024-03-21 10:58:07,806][14687] Avg episode reward: [(0, '0.621')] [2024-03-21 10:58:11,165][14919] Updated weights for policy 0, policy_version 55714 (0.0015) [2024-03-21 10:58:12,805][14687] Fps is (10 sec: 58982.3, 60 sec: 46421.3, 300 sec: 48096.8). Total num frames: 1825669120. Throughput: 0: 44757.9. Samples: 30701400. Policy #0 lag: (min: 4.0, avg: 54.7, max: 113.0) [2024-03-21 10:58:12,806][14687] Avg episode reward: [(0, '1.304')] [2024-03-21 10:58:16,966][14919] Updated weights for policy 0, policy_version 55724 (0.0019) [2024-03-21 10:58:17,805][14687] Fps is (10 sec: 65536.2, 60 sec: 47513.7, 300 sec: 48207.8). Total num frames: 1825964032. Throughput: 0: 47735.6. Samples: 30988300. Policy #0 lag: (min: 4.0, avg: 54.7, max: 113.0) [2024-03-21 10:58:17,806][14687] Avg episode reward: [(0, '1.223')] [2024-03-21 10:58:22,805][14687] Fps is (10 sec: 58982.0, 60 sec: 49698.2, 300 sec: 48541.1). Total num frames: 1826258944. Throughput: 0: 47453.2. Samples: 31267900. Policy #0 lag: (min: 4.0, avg: 54.7, max: 113.0) [2024-03-21 10:58:22,806][14687] Avg episode reward: [(0, '1.376')] [2024-03-21 10:58:23,046][14919] Updated weights for policy 0, policy_version 55734 (0.0014) [2024-03-21 10:58:27,805][14687] Fps is (10 sec: 52428.1, 60 sec: 49698.1, 300 sec: 49096.4). Total num frames: 1826488320. Throughput: 0: 47462.1. Samples: 31418700. 
Policy #0 lag: (min: 4.0, avg: 54.7, max: 113.0) [2024-03-21 10:58:27,806][14687] Avg episode reward: [(0, '1.376')] [2024-03-21 10:58:27,827][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055740_1826488320.pth... [2024-03-21 10:58:27,940][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055390_1815019520.pth [2024-03-21 10:58:31,596][14919] Updated weights for policy 0, policy_version 55744 (0.0011) [2024-03-21 10:58:32,805][14687] Fps is (10 sec: 42598.9, 60 sec: 50244.3, 300 sec: 48874.3). Total num frames: 1826684928. Throughput: 0: 47833.5. Samples: 31721800. Policy #0 lag: (min: 4.0, avg: 54.7, max: 113.0) [2024-03-21 10:58:32,805][14687] Avg episode reward: [(0, '1.387')] [2024-03-21 10:58:37,805][14687] Fps is (10 sec: 32768.3, 60 sec: 49698.1, 300 sec: 47985.7). Total num frames: 1826816000. Throughput: 0: 48026.6. Samples: 32025300. Policy #0 lag: (min: 4.0, avg: 54.7, max: 113.0) [2024-03-21 10:58:37,806][14687] Avg episode reward: [(0, '0.690')] [2024-03-21 10:58:39,275][14919] Updated weights for policy 0, policy_version 55754 (0.0014) [2024-03-21 10:58:42,805][14687] Fps is (10 sec: 45875.5, 60 sec: 50790.5, 300 sec: 47652.5). Total num frames: 1827143680. Throughput: 0: 48097.8. Samples: 32302100. Policy #0 lag: (min: 4.0, avg: 54.7, max: 113.0) [2024-03-21 10:58:42,805][14687] Avg episode reward: [(0, '0.725')] [2024-03-21 10:58:43,525][14898] Signal inference workers to stop experience collection... (600 times) [2024-03-21 10:58:43,539][14919] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-03-21 10:58:43,743][14898] Signal inference workers to resume experience collection... 
(600 times) [2024-03-21 10:58:43,744][14919] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-03-21 10:58:44,338][14919] Updated weights for policy 0, policy_version 55764 (0.0012) [2024-03-21 10:58:47,805][14687] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 47319.2). Total num frames: 1827307520. Throughput: 0: 48417.8. Samples: 32447700. Policy #0 lag: (min: 0.0, avg: 37.1, max: 73.0) [2024-03-21 10:58:47,805][14687] Avg episode reward: [(0, '0.971')] [2024-03-21 10:58:52,805][14687] Fps is (10 sec: 42598.1, 60 sec: 49152.0, 300 sec: 48096.8). Total num frames: 1827569664. Throughput: 0: 48471.1. Samples: 32744400. Policy #0 lag: (min: 0.0, avg: 37.1, max: 73.0) [2024-03-21 10:58:52,806][14687] Avg episode reward: [(0, '1.365')] [2024-03-21 10:58:53,065][14919] Updated weights for policy 0, policy_version 55774 (0.0011) [2024-03-21 10:58:57,805][14687] Fps is (10 sec: 42597.2, 60 sec: 48605.6, 300 sec: 47763.5). Total num frames: 1827733504. Throughput: 0: 48546.4. Samples: 32886000. Policy #0 lag: (min: 0.0, avg: 37.1, max: 73.0) [2024-03-21 10:58:57,807][14687] Avg episode reward: [(0, '1.644')] [2024-03-21 10:59:00,056][14919] Updated weights for policy 0, policy_version 55784 (0.0019) [2024-03-21 10:59:02,805][14687] Fps is (10 sec: 45875.3, 60 sec: 49152.1, 300 sec: 47763.5). Total num frames: 1828028416. Throughput: 0: 48580.0. Samples: 33174400. Policy #0 lag: (min: 0.0, avg: 37.1, max: 73.0) [2024-03-21 10:59:02,805][14687] Avg episode reward: [(0, '1.250')] [2024-03-21 10:59:07,805][14687] Fps is (10 sec: 45875.6, 60 sec: 48059.6, 300 sec: 47430.3). Total num frames: 1828192256. Throughput: 0: 49166.6. Samples: 33480400. 
Policy #0 lag: (min: 0.0, avg: 37.1, max: 73.0) [2024-03-21 10:59:07,806][14687] Avg episode reward: [(0, '1.250')] [2024-03-21 10:59:08,477][14919] Updated weights for policy 0, policy_version 55794 (0.0018) [2024-03-21 10:59:11,912][14919] Updated weights for policy 0, policy_version 55804 (0.0015) [2024-03-21 10:59:12,805][14687] Fps is (10 sec: 62258.5, 60 sec: 49698.1, 300 sec: 47985.7). Total num frames: 1828651008. Throughput: 0: 51566.7. Samples: 33739200. Policy #0 lag: (min: 0.0, avg: 37.1, max: 73.0) [2024-03-21 10:59:12,806][14687] Avg episode reward: [(0, '1.223')] [2024-03-21 10:59:17,805][14687] Fps is (10 sec: 62260.2, 60 sec: 47513.6, 300 sec: 47763.5). Total num frames: 1828814848. Throughput: 0: 47997.8. Samples: 33881700. Policy #0 lag: (min: 0.0, avg: 37.1, max: 73.0) [2024-03-21 10:59:17,806][14687] Avg episode reward: [(0, '1.234')] [2024-03-21 10:59:20,455][14919] Updated weights for policy 0, policy_version 55814 (0.0014) [2024-03-21 10:59:22,805][14687] Fps is (10 sec: 45875.4, 60 sec: 47513.7, 300 sec: 47763.5). Total num frames: 1829109760. Throughput: 0: 47391.1. Samples: 34157900. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 10:59:22,806][14687] Avg episode reward: [(0, '0.837')] [2024-03-21 10:59:27,213][14919] Updated weights for policy 0, policy_version 55824 (0.0028) [2024-03-21 10:59:27,805][14687] Fps is (10 sec: 45874.7, 60 sec: 46421.3, 300 sec: 47652.4). Total num frames: 1829273600. Throughput: 0: 48293.1. Samples: 34475300. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 10:59:27,806][14687] Avg episode reward: [(0, '0.837')] [2024-03-21 10:59:31,055][14919] Updated weights for policy 0, policy_version 55834 (0.0013) [2024-03-21 10:59:32,805][14687] Fps is (10 sec: 52428.8, 60 sec: 49152.0, 300 sec: 48318.9). Total num frames: 1829634048. Throughput: 0: 47826.6. Samples: 34599900. 
Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 10:59:32,806][14687] Avg episode reward: [(0, '1.570')] [2024-03-21 10:59:37,434][14898] Signal inference workers to stop experience collection... (650 times) [2024-03-21 10:59:37,491][14919] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-03-21 10:59:37,654][14898] Signal inference workers to resume experience collection... (650 times) [2024-03-21 10:59:37,654][14919] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-03-21 10:59:37,805][14687] Fps is (10 sec: 52428.7, 60 sec: 49698.1, 300 sec: 48430.0). Total num frames: 1829797888. Throughput: 0: 47773.2. Samples: 34894200. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 10:59:37,806][14687] Avg episode reward: [(0, '0.692')] [2024-03-21 10:59:39,669][14919] Updated weights for policy 0, policy_version 55844 (0.0011) [2024-03-21 10:59:42,805][14687] Fps is (10 sec: 39321.8, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 1830027264. Throughput: 0: 47698.1. Samples: 35032400. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 10:59:42,806][14687] Avg episode reward: [(0, '0.627')] [2024-03-21 10:59:47,805][14687] Fps is (10 sec: 39322.2, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 1830191104. Throughput: 0: 48255.5. Samples: 35345900. Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 10:59:47,806][14687] Avg episode reward: [(0, '0.590')] [2024-03-21 10:59:48,215][14919] Updated weights for policy 0, policy_version 55854 (0.0015) [2024-03-21 10:59:52,805][14687] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 47541.4). Total num frames: 1830486016. Throughput: 0: 48277.9. Samples: 35652900. 
Policy #0 lag: (min: 1.0, avg: 43.1, max: 86.0) [2024-03-21 10:59:52,806][14687] Avg episode reward: [(0, '1.031')] [2024-03-21 10:59:54,223][14919] Updated weights for policy 0, policy_version 55864 (0.0020) [2024-03-21 10:59:57,805][14687] Fps is (10 sec: 58982.1, 60 sec: 50790.6, 300 sec: 47208.1). Total num frames: 1830780928. Throughput: 0: 45653.4. Samples: 35793600. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 10:59:57,806][14687] Avg episode reward: [(0, '0.723')] [2024-03-21 11:00:01,342][14919] Updated weights for policy 0, policy_version 55874 (0.0009) [2024-03-21 11:00:02,805][14687] Fps is (10 sec: 52429.0, 60 sec: 49698.1, 300 sec: 47430.3). Total num frames: 1831010304. Throughput: 0: 49408.9. Samples: 36105100. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 11:00:02,806][14687] Avg episode reward: [(0, '0.723')] [2024-03-21 11:00:06,859][14919] Updated weights for policy 0, policy_version 55884 (0.0032) [2024-03-21 11:00:07,805][14687] Fps is (10 sec: 45875.3, 60 sec: 50790.5, 300 sec: 47985.7). Total num frames: 1831239680. Throughput: 0: 48957.8. Samples: 36361000. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 11:00:07,806][14687] Avg episode reward: [(0, '0.965')] [2024-03-21 11:00:12,805][14687] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 47985.7). Total num frames: 1831469056. Throughput: 0: 45049.0. Samples: 36502500. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 11:00:12,805][14687] Avg episode reward: [(0, '1.645')] [2024-03-21 11:00:13,644][14919] Updated weights for policy 0, policy_version 55894 (0.0021) [2024-03-21 11:00:17,805][14687] Fps is (10 sec: 52428.7, 60 sec: 49152.0, 300 sec: 47985.7). Total num frames: 1831763968. Throughput: 0: 48648.9. Samples: 36789100. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 11:00:17,806][14687] Avg episode reward: [(0, '1.401')] [2024-03-21 11:00:22,549][14919] Updated weights for policy 0, policy_version 55904 (0.0011) [2024-03-21 11:00:22,805][14687] Fps is (10 sec: 39321.2, 60 sec: 45875.2, 300 sec: 47319.2). Total num frames: 1831862272. Throughput: 0: 49157.8. Samples: 37106300. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 11:00:22,806][14687] Avg episode reward: [(0, '1.401')] [2024-03-21 11:00:26,234][14919] Updated weights for policy 0, policy_version 55914 (0.0012) [2024-03-21 11:00:27,805][14687] Fps is (10 sec: 45875.0, 60 sec: 49152.0, 300 sec: 47652.4). Total num frames: 1832222720. Throughput: 0: 52251.0. Samples: 37383700. Policy #0 lag: (min: 0.0, avg: 36.6, max: 81.0) [2024-03-21 11:00:27,806][14687] Avg episode reward: [(0, '1.306')] [2024-03-21 11:00:27,820][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055915_1832222720.pth... [2024-03-21 11:00:27,932][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055567_1820819456.pth [2024-03-21 11:00:32,805][14687] Fps is (10 sec: 55705.4, 60 sec: 46421.3, 300 sec: 47652.5). Total num frames: 1832419328. Throughput: 0: 48688.8. Samples: 37536900. Policy #0 lag: (min: 0.0, avg: 30.9, max: 78.0) [2024-03-21 11:00:32,806][14687] Avg episode reward: [(0, '1.363')] [2024-03-21 11:00:33,605][14919] Updated weights for policy 0, policy_version 55924 (0.0011) [2024-03-21 11:00:37,805][14687] Fps is (10 sec: 42598.4, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 1832648704. Throughput: 0: 48135.5. Samples: 37819000. Policy #0 lag: (min: 0.0, avg: 30.9, max: 78.0) [2024-03-21 11:00:37,806][14687] Avg episode reward: [(0, '0.712')] [2024-03-21 11:00:39,289][14898] Signal inference workers to stop experience collection... 
(700 times) [2024-03-21 11:00:39,359][14919] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-03-21 11:00:39,360][14898] Signal inference workers to resume experience collection... (700 times) [2024-03-21 11:00:39,398][14919] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-03-21 11:00:41,317][14919] Updated weights for policy 0, policy_version 55934 (0.0018) [2024-03-21 11:00:42,805][14687] Fps is (10 sec: 49153.0, 60 sec: 48059.8, 300 sec: 47542.1). Total num frames: 1832910848. Throughput: 0: 51471.3. Samples: 38109800. Policy #0 lag: (min: 0.0, avg: 30.9, max: 78.0) [2024-03-21 11:00:42,805][14687] Avg episode reward: [(0, '1.180')] [2024-03-21 11:00:47,805][14687] Fps is (10 sec: 49152.4, 60 sec: 49152.0, 300 sec: 47208.1). Total num frames: 1833140224. Throughput: 0: 48066.7. Samples: 38268100. Policy #0 lag: (min: 0.0, avg: 30.9, max: 78.0) [2024-03-21 11:00:47,805][14687] Avg episode reward: [(0, '1.417')] [2024-03-21 11:00:47,911][14919] Updated weights for policy 0, policy_version 55944 (0.0017) [2024-03-21 11:00:52,805][14687] Fps is (10 sec: 49150.7, 60 sec: 48605.8, 300 sec: 47541.3). Total num frames: 1833402368. Throughput: 0: 48906.5. Samples: 38561800. Policy #0 lag: (min: 0.0, avg: 30.9, max: 78.0) [2024-03-21 11:00:52,806][14687] Avg episode reward: [(0, '1.383')] [2024-03-21 11:00:55,127][14919] Updated weights for policy 0, policy_version 55954 (0.0012) [2024-03-21 11:00:57,805][14687] Fps is (10 sec: 52428.9, 60 sec: 48059.8, 300 sec: 48096.8). Total num frames: 1833664512. Throughput: 0: 52117.8. Samples: 38847800. Policy #0 lag: (min: 0.0, avg: 30.9, max: 78.0) [2024-03-21 11:00:57,806][14687] Avg episode reward: [(0, '1.383')] [2024-03-21 11:01:02,805][14687] Fps is (10 sec: 36045.0, 60 sec: 45875.1, 300 sec: 47874.6). Total num frames: 1833762816. Throughput: 0: 48971.0. Samples: 38992800. 
Policy #0 lag: (min: 0.0, avg: 30.9, max: 78.0) [2024-03-21 11:01:02,806][14687] Avg episode reward: [(0, '1.383')] [2024-03-21 11:01:03,558][14919] Updated weights for policy 0, policy_version 55964 (0.0011) [2024-03-21 11:01:07,805][14687] Fps is (10 sec: 45875.1, 60 sec: 48059.8, 300 sec: 48541.1). Total num frames: 1834123264. Throughput: 0: 47160.1. Samples: 39228500. Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:07,806][14687] Avg episode reward: [(0, '0.970')] [2024-03-21 11:01:10,233][14919] Updated weights for policy 0, policy_version 55974 (0.0017) [2024-03-21 11:01:12,805][14687] Fps is (10 sec: 52429.8, 60 sec: 46967.5, 300 sec: 48096.8). Total num frames: 1834287104. Throughput: 0: 44380.2. Samples: 39380800. Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:12,805][14687] Avg episode reward: [(0, '1.474')] [2024-03-21 11:01:17,805][14687] Fps is (10 sec: 29491.1, 60 sec: 44236.8, 300 sec: 47430.3). Total num frames: 1834418176. Throughput: 0: 47540.1. Samples: 39676200. Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:17,806][14687] Avg episode reward: [(0, '1.123')] [2024-03-21 11:01:19,465][14919] Updated weights for policy 0, policy_version 55984 (0.0012) [2024-03-21 11:01:22,805][14687] Fps is (10 sec: 39321.9, 60 sec: 46967.6, 300 sec: 47208.2). Total num frames: 1834680320. Throughput: 0: 47620.2. Samples: 39961900. Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:22,805][14687] Avg episode reward: [(0, '1.657')] [2024-03-21 11:01:24,185][14919] Updated weights for policy 0, policy_version 55994 (0.0018) [2024-03-21 11:01:27,805][14687] Fps is (10 sec: 58981.9, 60 sec: 46421.3, 300 sec: 47430.3). Total num frames: 1835008000. Throughput: 0: 47135.4. Samples: 40230900. 
Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:27,806][14687] Avg episode reward: [(0, '0.796')] [2024-03-21 11:01:28,909][14919] Updated weights for policy 0, policy_version 56004 (0.0011) [2024-03-21 11:01:29,526][14898] Signal inference workers to stop experience collection... (750 times) [2024-03-21 11:01:29,532][14898] Signal inference workers to resume experience collection... (750 times) [2024-03-21 11:01:29,579][14919] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-03-21 11:01:29,580][14919] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-03-21 11:01:32,549][14919] Updated weights for policy 0, policy_version 56014 (0.0010) [2024-03-21 11:01:32,805][14687] Fps is (10 sec: 81918.9, 60 sec: 51336.6, 300 sec: 48541.1). Total num frames: 1835499520. Throughput: 0: 46433.3. Samples: 40357600. Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:32,806][14687] Avg episode reward: [(0, '1.491')] [2024-03-21 11:01:37,805][14687] Fps is (10 sec: 49152.2, 60 sec: 47513.6, 300 sec: 47763.5). Total num frames: 1835499520. Throughput: 0: 46220.1. Samples: 40641700. Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:37,806][14687] Avg episode reward: [(0, '1.259')] [2024-03-21 11:01:42,805][14687] Fps is (10 sec: 19660.9, 60 sec: 46421.3, 300 sec: 48207.8). Total num frames: 1835696128. Throughput: 0: 46495.6. Samples: 40940100. Policy #0 lag: (min: 2.0, avg: 36.6, max: 74.0) [2024-03-21 11:01:42,805][14687] Avg episode reward: [(0, '1.070')] [2024-03-21 11:01:44,499][14919] Updated weights for policy 0, policy_version 56024 (0.0009) [2024-03-21 11:01:47,805][14687] Fps is (10 sec: 45875.7, 60 sec: 46967.5, 300 sec: 48207.8). Total num frames: 1835958272. Throughput: 0: 46540.2. Samples: 41087100. 
Policy #0 lag: (min: 0.0, avg: 53.6, max: 110.0) [2024-03-21 11:01:47,805][14687] Avg episode reward: [(0, '0.815')] [2024-03-21 11:01:49,152][14919] Updated weights for policy 0, policy_version 56034 (0.0019) [2024-03-21 11:01:52,805][14687] Fps is (10 sec: 62257.9, 60 sec: 48605.9, 300 sec: 49207.5). Total num frames: 1836318720. Throughput: 0: 46839.8. Samples: 41336300. Policy #0 lag: (min: 0.0, avg: 53.6, max: 110.0) [2024-03-21 11:01:52,806][14687] Avg episode reward: [(0, '1.315')] [2024-03-21 11:01:57,805][14687] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 48096.8). Total num frames: 1836417024. Throughput: 0: 46606.6. Samples: 41478100. Policy #0 lag: (min: 0.0, avg: 53.6, max: 110.0) [2024-03-21 11:01:57,806][14687] Avg episode reward: [(0, '1.038')] [2024-03-21 11:02:00,929][14919] Updated weights for policy 0, policy_version 56044 (0.0016) [2024-03-21 11:02:02,805][14687] Fps is (10 sec: 29491.5, 60 sec: 47513.7, 300 sec: 47652.5). Total num frames: 1836613632. Throughput: 0: 46506.6. Samples: 41769000. Policy #0 lag: (min: 0.0, avg: 53.6, max: 110.0) [2024-03-21 11:02:02,806][14687] Avg episode reward: [(0, '1.447')] [2024-03-21 11:02:04,153][14919] Updated weights for policy 0, policy_version 56054 (0.0014) [2024-03-21 11:02:07,805][14687] Fps is (10 sec: 49151.8, 60 sec: 46421.3, 300 sec: 47541.4). Total num frames: 1836908544. Throughput: 0: 45939.9. Samples: 42029200. Policy #0 lag: (min: 0.0, avg: 53.6, max: 110.0) [2024-03-21 11:02:07,806][14687] Avg episode reward: [(0, '1.441')] [2024-03-21 11:02:12,805][14687] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 47208.2). Total num frames: 1837039616. Throughput: 0: 46080.1. Samples: 42304500. 
Policy #0 lag: (min: 0.0, avg: 53.6, max: 110.0) [2024-03-21 11:02:12,806][14687] Avg episode reward: [(0, '1.441')] [2024-03-21 11:02:13,794][14919] Updated weights for policy 0, policy_version 56064 (0.0010) [2024-03-21 11:02:17,805][14687] Fps is (10 sec: 26214.4, 60 sec: 45875.2, 300 sec: 47097.1). Total num frames: 1837170688. Throughput: 0: 46182.2. Samples: 42435800. Policy #0 lag: (min: 0.0, avg: 53.6, max: 110.0) [2024-03-21 11:02:17,805][14687] Avg episode reward: [(0, '0.869')] [2024-03-21 11:02:22,226][14919] Updated weights for policy 0, policy_version 56074 (0.0012) [2024-03-21 11:02:22,661][14898] Signal inference workers to stop experience collection... (800 times) [2024-03-21 11:02:22,731][14898] Signal inference workers to resume experience collection... (800 times) [2024-03-21 11:02:22,735][14919] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-03-21 11:02:22,788][14919] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-03-21 11:02:22,805][14687] Fps is (10 sec: 42597.8, 60 sec: 46421.1, 300 sec: 47319.2). Total num frames: 1837465600. Throughput: 0: 44971.0. Samples: 42665400. Policy #0 lag: (min: 0.0, avg: 49.3, max: 103.0) [2024-03-21 11:02:22,806][14687] Avg episode reward: [(0, '1.386')] [2024-03-21 11:02:27,805][14687] Fps is (10 sec: 49151.6, 60 sec: 44236.8, 300 sec: 47430.3). Total num frames: 1837662208. Throughput: 0: 44533.2. Samples: 42944100. Policy #0 lag: (min: 0.0, avg: 49.3, max: 103.0) [2024-03-21 11:02:27,806][14687] Avg episode reward: [(0, '1.462')] [2024-03-21 11:02:27,818][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056081_1837662208.pth... 
[2024-03-21 11:02:27,973][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055740_1826488320.pth [2024-03-21 11:02:30,368][14919] Updated weights for policy 0, policy_version 56084 (0.0016) [2024-03-21 11:02:32,805][14687] Fps is (10 sec: 42598.9, 60 sec: 39867.7, 300 sec: 47652.4). Total num frames: 1837891584. Throughput: 0: 44384.4. Samples: 43084400. Policy #0 lag: (min: 0.0, avg: 49.3, max: 103.0) [2024-03-21 11:02:32,805][14687] Avg episode reward: [(0, '1.389')] [2024-03-21 11:02:36,997][14919] Updated weights for policy 0, policy_version 56094 (0.0010) [2024-03-21 11:02:37,805][14687] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 47541.4). Total num frames: 1838120960. Throughput: 0: 44973.5. Samples: 43360100. Policy #0 lag: (min: 0.0, avg: 49.3, max: 103.0) [2024-03-21 11:02:37,806][14687] Avg episode reward: [(0, '1.370')] [2024-03-21 11:02:41,833][14919] Updated weights for policy 0, policy_version 56104 (0.0016) [2024-03-21 11:02:42,805][14687] Fps is (10 sec: 52428.7, 60 sec: 45329.0, 300 sec: 47874.6). Total num frames: 1838415872. Throughput: 0: 45255.5. Samples: 43514600. Policy #0 lag: (min: 0.0, avg: 49.3, max: 103.0) [2024-03-21 11:02:42,806][14687] Avg episode reward: [(0, '1.113')] [2024-03-21 11:02:46,892][14919] Updated weights for policy 0, policy_version 56114 (0.0016) [2024-03-21 11:02:47,805][14687] Fps is (10 sec: 65535.6, 60 sec: 46967.3, 300 sec: 47985.7). Total num frames: 1838776320. Throughput: 0: 44957.7. Samples: 43792100. Policy #0 lag: (min: 0.0, avg: 49.3, max: 103.0) [2024-03-21 11:02:47,806][14687] Avg episode reward: [(0, '1.365')] [2024-03-21 11:02:52,805][14687] Fps is (10 sec: 62259.0, 60 sec: 45329.2, 300 sec: 48207.8). Total num frames: 1839038464. Throughput: 0: 45277.7. Samples: 44066700. 
Policy #0 lag: (min: 0.0, avg: 49.3, max: 103.0) [2024-03-21 11:02:52,806][14687] Avg episode reward: [(0, '0.475')] [2024-03-21 11:02:55,278][14919] Updated weights for policy 0, policy_version 56124 (0.0017) [2024-03-21 11:02:57,805][14687] Fps is (10 sec: 45875.3, 60 sec: 46967.4, 300 sec: 47985.7). Total num frames: 1839235072. Throughput: 0: 45764.3. Samples: 44363900. Policy #0 lag: (min: 0.0, avg: 43.3, max: 84.0) [2024-03-21 11:02:57,806][14687] Avg episode reward: [(0, '1.428')] [2024-03-21 11:03:02,805][14687] Fps is (10 sec: 29491.5, 60 sec: 45329.2, 300 sec: 47541.4). Total num frames: 1839333376. Throughput: 0: 46184.5. Samples: 44514100. Policy #0 lag: (min: 0.0, avg: 43.3, max: 84.0) [2024-03-21 11:03:02,805][14687] Avg episode reward: [(0, '0.969')] [2024-03-21 11:03:03,714][14919] Updated weights for policy 0, policy_version 56134 (0.0011) [2024-03-21 11:03:06,738][14898] Signal inference workers to stop experience collection... (850 times) [2024-03-21 11:03:06,812][14898] Signal inference workers to resume experience collection... (850 times) [2024-03-21 11:03:06,891][14919] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-03-21 11:03:06,962][14919] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-03-21 11:03:07,168][14919] Updated weights for policy 0, policy_version 56144 (0.0011) [2024-03-21 11:03:07,805][14687] Fps is (10 sec: 52429.0, 60 sec: 47513.6, 300 sec: 47763.5). Total num frames: 1839759360. Throughput: 0: 46869.0. Samples: 44774500. Policy #0 lag: (min: 0.0, avg: 43.3, max: 84.0) [2024-03-21 11:03:07,806][14687] Avg episode reward: [(0, '1.543')] [2024-03-21 11:03:12,805][14687] Fps is (10 sec: 62258.9, 60 sec: 48605.9, 300 sec: 47430.3). Total num frames: 1839955968. Throughput: 0: 43826.7. Samples: 44916300. 
Policy #0 lag: (min: 0.0, avg: 43.3, max: 84.0) [2024-03-21 11:03:12,805][14687] Avg episode reward: [(0, '0.935')] [2024-03-21 11:03:14,973][14919] Updated weights for policy 0, policy_version 56154 (0.0013) [2024-03-21 11:03:17,805][14687] Fps is (10 sec: 36044.9, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 1840119808. Throughput: 0: 47631.1. Samples: 45227800. Policy #0 lag: (min: 0.0, avg: 43.3, max: 84.0) [2024-03-21 11:03:17,806][14687] Avg episode reward: [(0, '0.935')] [2024-03-21 11:03:22,690][14919] Updated weights for policy 0, policy_version 56164 (0.0016) [2024-03-21 11:03:22,805][14687] Fps is (10 sec: 42597.8, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 1840381952. Throughput: 0: 48633.2. Samples: 45548600. Policy #0 lag: (min: 0.0, avg: 43.3, max: 84.0) [2024-03-21 11:03:22,806][14687] Avg episode reward: [(0, '0.935')] [2024-03-21 11:03:27,805][14687] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 47208.1). Total num frames: 1840611328. Throughput: 0: 48224.5. Samples: 45684700. Policy #0 lag: (min: 0.0, avg: 43.3, max: 84.0) [2024-03-21 11:03:27,806][14687] Avg episode reward: [(0, '0.881')] [2024-03-21 11:03:29,206][14919] Updated weights for policy 0, policy_version 56174 (0.0011) [2024-03-21 11:03:32,805][14687] Fps is (10 sec: 55706.4, 60 sec: 50790.4, 300 sec: 47874.6). Total num frames: 1840939008. Throughput: 0: 48313.5. Samples: 45966200. Policy #0 lag: (min: 1.0, avg: 32.6, max: 88.0) [2024-03-21 11:03:32,806][14687] Avg episode reward: [(0, '0.950')] [2024-03-21 11:03:37,387][14919] Updated weights for policy 0, policy_version 56184 (0.0010) [2024-03-21 11:03:37,805][14687] Fps is (10 sec: 42598.3, 60 sec: 48605.9, 300 sec: 47097.0). Total num frames: 1841037312. Throughput: 0: 49537.8. Samples: 46295900. Policy #0 lag: (min: 1.0, avg: 32.6, max: 88.0) [2024-03-21 11:03:37,806][14687] Avg episode reward: [(0, '0.637')] [2024-03-21 11:03:42,805][14687] Fps is (10 sec: 39321.8, 60 sec: 48605.9, 300 sec: 47541.4). 
Total num frames: 1841332224. Throughput: 0: 49544.6. Samples: 46593400. Policy #0 lag: (min: 1.0, avg: 32.6, max: 88.0) [2024-03-21 11:03:42,806][14687] Avg episode reward: [(0, '1.450')] [2024-03-21 11:03:44,083][14919] Updated weights for policy 0, policy_version 56194 (0.0010) [2024-03-21 11:03:47,805][14687] Fps is (10 sec: 45874.7, 60 sec: 45329.1, 300 sec: 47208.1). Total num frames: 1841496064. Throughput: 0: 49835.3. Samples: 46756700. Policy #0 lag: (min: 1.0, avg: 32.6, max: 88.0) [2024-03-21 11:03:47,806][14687] Avg episode reward: [(0, '1.700')] [2024-03-21 11:03:49,659][14919] Updated weights for policy 0, policy_version 56204 (0.0014) [2024-03-21 11:03:52,805][14687] Fps is (10 sec: 55705.1, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 1841889280. Throughput: 0: 50386.7. Samples: 47041900. Policy #0 lag: (min: 1.0, avg: 32.6, max: 88.0) [2024-03-21 11:03:52,806][14687] Avg episode reward: [(0, '0.975')] [2024-03-21 11:03:53,875][14919] Updated weights for policy 0, policy_version 56214 (0.0014) [2024-03-21 11:03:57,252][14898] Signal inference workers to stop experience collection... (900 times) [2024-03-21 11:03:57,321][14898] Signal inference workers to resume experience collection... (900 times) [2024-03-21 11:03:57,355][14919] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-03-21 11:03:57,396][14919] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-03-21 11:03:57,805][14687] Fps is (10 sec: 78643.3, 60 sec: 50790.4, 300 sec: 48318.9). Total num frames: 1842282496. Throughput: 0: 50439.9. Samples: 47186100. Policy #0 lag: (min: 1.0, avg: 32.6, max: 88.0) [2024-03-21 11:03:57,806][14687] Avg episode reward: [(0, '0.975')] [2024-03-21 11:03:59,503][14919] Updated weights for policy 0, policy_version 56225 (0.0008) [2024-03-21 11:04:02,805][14687] Fps is (10 sec: 58982.4, 60 sec: 52428.7, 300 sec: 48430.0). Total num frames: 1842479104. Throughput: 0: 49855.5. Samples: 47471300. 
Policy #0 lag: (min: 1.0, avg: 32.6, max: 88.0) [2024-03-21 11:04:02,806][14687] Avg episode reward: [(0, '1.616')] [2024-03-21 11:04:06,678][14919] Updated weights for policy 0, policy_version 56235 (0.0016) [2024-03-21 11:04:07,805][14687] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 47874.6). Total num frames: 1842774016. Throughput: 0: 49231.2. Samples: 47764000. Policy #0 lag: (min: 0.0, avg: 46.7, max: 95.0) [2024-03-21 11:04:07,806][14687] Avg episode reward: [(0, '1.608')] [2024-03-21 11:04:12,805][14687] Fps is (10 sec: 45875.1, 60 sec: 49698.1, 300 sec: 47874.6). Total num frames: 1842937856. Throughput: 0: 52888.8. Samples: 48064700. Policy #0 lag: (min: 0.0, avg: 46.7, max: 95.0) [2024-03-21 11:04:12,806][14687] Avg episode reward: [(0, '1.206')] [2024-03-21 11:04:14,417][14919] Updated weights for policy 0, policy_version 56245 (0.0009) [2024-03-21 11:04:17,805][14687] Fps is (10 sec: 45875.2, 60 sec: 51882.6, 300 sec: 47874.6). Total num frames: 1843232768. Throughput: 0: 49753.3. Samples: 48205100. Policy #0 lag: (min: 0.0, avg: 46.7, max: 95.0) [2024-03-21 11:04:17,805][14687] Avg episode reward: [(0, '0.894')] [2024-03-21 11:04:22,493][14919] Updated weights for policy 0, policy_version 56255 (0.0023) [2024-03-21 11:04:22,805][14687] Fps is (10 sec: 45875.2, 60 sec: 50244.3, 300 sec: 47874.6). Total num frames: 1843396608. Throughput: 0: 49202.2. Samples: 48510000. Policy #0 lag: (min: 0.0, avg: 46.7, max: 95.0) [2024-03-21 11:04:22,806][14687] Avg episode reward: [(0, '0.894')] [2024-03-21 11:04:27,805][14687] Fps is (10 sec: 29491.4, 60 sec: 48605.9, 300 sec: 47097.1). Total num frames: 1843527680. Throughput: 0: 45460.0. Samples: 48639100. Policy #0 lag: (min: 0.0, avg: 46.7, max: 95.0) [2024-03-21 11:04:27,805][14687] Avg episode reward: [(0, '1.578')] [2024-03-21 11:04:27,817][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056260_1843527680.pth... 
[2024-03-21 11:04:27,952][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000055915_1832222720.pth [2024-03-21 11:04:32,055][14919] Updated weights for policy 0, policy_version 56265 (0.0012) [2024-03-21 11:04:32,805][14687] Fps is (10 sec: 29491.0, 60 sec: 45875.1, 300 sec: 47097.1). Total num frames: 1843691520. Throughput: 0: 49048.9. Samples: 48963900. Policy #0 lag: (min: 0.0, avg: 46.7, max: 95.0) [2024-03-21 11:04:32,806][14687] Avg episode reward: [(0, '1.209')] [2024-03-21 11:04:37,805][14687] Fps is (10 sec: 32767.8, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 1843855360. Throughput: 0: 49226.7. Samples: 49257100. Policy #0 lag: (min: 0.0, avg: 46.7, max: 95.0) [2024-03-21 11:04:37,806][14687] Avg episode reward: [(0, '0.980')] [2024-03-21 11:04:39,810][14919] Updated weights for policy 0, policy_version 56275 (0.0021) [2024-03-21 11:04:42,805][14687] Fps is (10 sec: 49152.8, 60 sec: 47513.6, 300 sec: 47430.3). Total num frames: 1844183040. Throughput: 0: 52091.3. Samples: 49530200. Policy #0 lag: (min: 2.0, avg: 39.6, max: 88.0) [2024-03-21 11:04:42,805][14687] Avg episode reward: [(0, '0.816')] [2024-03-21 11:04:44,807][14919] Updated weights for policy 0, policy_version 56285 (0.0013) [2024-03-21 11:04:47,805][14687] Fps is (10 sec: 68813.9, 60 sec: 50790.6, 300 sec: 47652.5). Total num frames: 1844543488. Throughput: 0: 48435.7. Samples: 49650900. Policy #0 lag: (min: 2.0, avg: 39.6, max: 88.0) [2024-03-21 11:04:47,805][14687] Avg episode reward: [(0, '1.021')] [2024-03-21 11:04:49,967][14919] Updated weights for policy 0, policy_version 56295 (0.0016) [2024-03-21 11:04:52,348][14898] Signal inference workers to stop experience collection... (950 times) [2024-03-21 11:04:52,427][14919] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-03-21 11:04:52,690][14898] Signal inference workers to resume experience collection... 
(950 times) [2024-03-21 11:04:52,690][14919] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-03-21 11:04:52,805][14687] Fps is (10 sec: 68813.2, 60 sec: 49698.2, 300 sec: 47763.5). Total num frames: 1844871168. Throughput: 0: 47933.5. Samples: 49921000. Policy #0 lag: (min: 2.0, avg: 39.6, max: 88.0) [2024-03-21 11:04:52,805][14687] Avg episode reward: [(0, '1.514')] [2024-03-21 11:04:53,931][14919] Updated weights for policy 0, policy_version 56305 (0.0021) [2024-03-21 11:04:57,805][14687] Fps is (10 sec: 58981.5, 60 sec: 47513.7, 300 sec: 47874.6). Total num frames: 1845133312. Throughput: 0: 47355.6. Samples: 50195700. Policy #0 lag: (min: 2.0, avg: 39.6, max: 88.0) [2024-03-21 11:04:57,806][14687] Avg episode reward: [(0, '1.915')] [2024-03-21 11:05:02,176][14919] Updated weights for policy 0, policy_version 56315 (0.0011) [2024-03-21 11:05:02,805][14687] Fps is (10 sec: 45874.3, 60 sec: 47513.5, 300 sec: 47763.5). Total num frames: 1845329920. Throughput: 0: 47708.8. Samples: 50352000. Policy #0 lag: (min: 2.0, avg: 39.6, max: 88.0) [2024-03-21 11:05:02,806][14687] Avg episode reward: [(0, '1.177')] [2024-03-21 11:05:07,805][14687] Fps is (10 sec: 49152.1, 60 sec: 47513.6, 300 sec: 47985.7). Total num frames: 1845624832. Throughput: 0: 46853.4. Samples: 50618400. Policy #0 lag: (min: 2.0, avg: 39.6, max: 88.0) [2024-03-21 11:05:07,806][14687] Avg episode reward: [(0, '0.996')] [2024-03-21 11:05:08,377][14919] Updated weights for policy 0, policy_version 56325 (0.0012) [2024-03-21 11:05:12,805][14687] Fps is (10 sec: 55706.2, 60 sec: 49152.0, 300 sec: 47874.6). Total num frames: 1845886976. Throughput: 0: 50426.6. Samples: 50908300. Policy #0 lag: (min: 2.0, avg: 39.6, max: 88.0) [2024-03-21 11:05:12,806][14687] Avg episode reward: [(0, '1.222')] [2024-03-21 11:05:16,258][14919] Updated weights for policy 0, policy_version 56335 (0.0015) [2024-03-21 11:05:17,805][14687] Fps is (10 sec: 39321.0, 60 sec: 46421.3, 300 sec: 47985.7). 
Total num frames: 1846018048. Throughput: 0: 46613.3. Samples: 51061500. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:17,806][14687] Avg episode reward: [(0, '0.755')] [2024-03-21 11:05:22,805][14687] Fps is (10 sec: 26214.5, 60 sec: 45875.3, 300 sec: 47208.2). Total num frames: 1846149120. Throughput: 0: 46440.1. Samples: 51346900. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:22,805][14687] Avg episode reward: [(0, '1.085')] [2024-03-21 11:05:25,990][14919] Updated weights for policy 0, policy_version 56345 (0.0013) [2024-03-21 11:05:27,805][14687] Fps is (10 sec: 32768.4, 60 sec: 46967.4, 300 sec: 47208.2). Total num frames: 1846345728. Throughput: 0: 45902.2. Samples: 51595800. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:27,806][14687] Avg episode reward: [(0, '1.282')] [2024-03-21 11:05:32,805][14687] Fps is (10 sec: 36044.5, 60 sec: 46967.5, 300 sec: 46986.0). Total num frames: 1846509568. Throughput: 0: 45968.7. Samples: 51719500. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:32,806][14687] Avg episode reward: [(0, '1.284')] [2024-03-21 11:05:34,308][14919] Updated weights for policy 0, policy_version 56355 (0.0011) [2024-03-21 11:05:37,805][14687] Fps is (10 sec: 42598.5, 60 sec: 48605.9, 300 sec: 46986.0). Total num frames: 1846771712. Throughput: 0: 45399.9. Samples: 51964000. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:37,805][14687] Avg episode reward: [(0, '1.221')] [2024-03-21 11:05:41,381][14919] Updated weights for policy 0, policy_version 56365 (0.0010) [2024-03-21 11:05:42,805][14687] Fps is (10 sec: 55705.8, 60 sec: 48059.7, 300 sec: 47208.1). Total num frames: 1847066624. Throughput: 0: 45251.1. Samples: 52232000. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:42,806][14687] Avg episode reward: [(0, '0.985')] [2024-03-21 11:05:47,805][14687] Fps is (10 sec: 36044.3, 60 sec: 43144.4, 300 sec: 46541.7). 
Total num frames: 1847132160. Throughput: 0: 44975.5. Samples: 52375900. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:47,806][14687] Avg episode reward: [(0, '1.297')] [2024-03-21 11:05:51,728][14919] Updated weights for policy 0, policy_version 56375 (0.0016) [2024-03-21 11:05:52,805][14687] Fps is (10 sec: 26214.4, 60 sec: 40959.9, 300 sec: 46319.5). Total num frames: 1847328768. Throughput: 0: 45366.6. Samples: 52659900. Policy #0 lag: (min: 0.0, avg: 38.3, max: 75.0) [2024-03-21 11:05:52,806][14687] Avg episode reward: [(0, '1.438')] [2024-03-21 11:05:57,764][14919] Updated weights for policy 0, policy_version 56385 (0.0019) [2024-03-21 11:05:57,805][14687] Fps is (10 sec: 49152.0, 60 sec: 41506.1, 300 sec: 46986.0). Total num frames: 1847623680. Throughput: 0: 41744.3. Samples: 52786800. Policy #0 lag: (min: 0.0, avg: 35.5, max: 100.0) [2024-03-21 11:05:57,806][14687] Avg episode reward: [(0, '1.403')] [2024-03-21 11:05:59,620][14898] Signal inference workers to stop experience collection... (1000 times) [2024-03-21 11:05:59,696][14919] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-03-21 11:05:59,907][14898] Signal inference workers to resume experience collection... (1000 times) [2024-03-21 11:05:59,907][14919] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-03-21 11:06:01,454][14919] Updated weights for policy 0, policy_version 56395 (0.0018) [2024-03-21 11:06:02,805][14687] Fps is (10 sec: 65535.6, 60 sec: 44236.8, 300 sec: 46986.0). Total num frames: 1847984128. Throughput: 0: 43666.7. Samples: 53026500. Policy #0 lag: (min: 0.0, avg: 35.5, max: 100.0) [2024-03-21 11:06:02,806][14687] Avg episode reward: [(0, '0.864')] [2024-03-21 11:06:07,805][14687] Fps is (10 sec: 62259.7, 60 sec: 43690.6, 300 sec: 47319.2). Total num frames: 1848246272. Throughput: 0: 43617.7. Samples: 53309700. 
Policy #0 lag: (min: 0.0, avg: 35.5, max: 100.0) [2024-03-21 11:06:07,806][14687] Avg episode reward: [(0, '0.864')] [2024-03-21 11:06:07,901][14919] Updated weights for policy 0, policy_version 56405 (0.0011) [2024-03-21 11:06:12,805][14687] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 47763.5). Total num frames: 1848508416. Throughput: 0: 44215.6. Samples: 53585500. Policy #0 lag: (min: 0.0, avg: 35.5, max: 100.0) [2024-03-21 11:06:12,806][14687] Avg episode reward: [(0, '1.381')] [2024-03-21 11:06:16,245][14919] Updated weights for policy 0, policy_version 56415 (0.0012) [2024-03-21 11:06:17,805][14687] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 47430.3). Total num frames: 1848672256. Throughput: 0: 45035.6. Samples: 53746100. Policy #0 lag: (min: 0.0, avg: 35.5, max: 100.0) [2024-03-21 11:06:17,806][14687] Avg episode reward: [(0, '1.122')] [2024-03-21 11:06:22,203][14919] Updated weights for policy 0, policy_version 56425 (0.0014) [2024-03-21 11:06:22,805][14687] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 47319.2). Total num frames: 1848967168. Throughput: 0: 45648.9. Samples: 54018200. Policy #0 lag: (min: 0.0, avg: 35.5, max: 100.0) [2024-03-21 11:06:22,805][14687] Avg episode reward: [(0, '1.525')] [2024-03-21 11:06:26,471][14919] Updated weights for policy 0, policy_version 56435 (0.0013) [2024-03-21 11:06:27,805][14687] Fps is (10 sec: 58982.7, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 1849262080. Throughput: 0: 45655.6. Samples: 54286500. Policy #0 lag: (min: 0.0, avg: 35.5, max: 100.0) [2024-03-21 11:06:27,806][14687] Avg episode reward: [(0, '1.640')] [2024-03-21 11:06:27,817][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056435_1849262080.pth... [2024-03-21 11:06:27,974][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056081_1837662208.pth [2024-03-21 11:06:32,805][14687] Fps is (10 sec: 42597.7, 60 sec: 48059.7, 300 sec: 47097.1). 
Total num frames: 1849393152. Throughput: 0: 45955.6. Samples: 54443900. Policy #0 lag: (min: 0.0, avg: 44.0, max: 81.0) [2024-03-21 11:06:32,806][14687] Avg episode reward: [(0, '1.640')] [2024-03-21 11:06:37,676][14919] Updated weights for policy 0, policy_version 56445 (0.0015) [2024-03-21 11:06:37,805][14687] Fps is (10 sec: 32768.0, 60 sec: 46967.5, 300 sec: 47097.1). Total num frames: 1849589760. Throughput: 0: 46417.8. Samples: 54748700. Policy #0 lag: (min: 0.0, avg: 44.0, max: 81.0) [2024-03-21 11:06:37,805][14687] Avg episode reward: [(0, '1.114')] [2024-03-21 11:06:42,805][14687] Fps is (10 sec: 39322.0, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 1849786368. Throughput: 0: 46782.3. Samples: 54892000. Policy #0 lag: (min: 0.0, avg: 44.0, max: 81.0) [2024-03-21 11:06:42,806][14687] Avg episode reward: [(0, '1.500')] [2024-03-21 11:06:46,963][14919] Updated weights for policy 0, policy_version 56455 (0.0019) [2024-03-21 11:06:47,805][14687] Fps is (10 sec: 36044.5, 60 sec: 46967.5, 300 sec: 46208.5). Total num frames: 1849950208. Throughput: 0: 47002.2. Samples: 55141600. Policy #0 lag: (min: 0.0, avg: 44.0, max: 81.0) [2024-03-21 11:06:47,806][14687] Avg episode reward: [(0, '1.585')] [2024-03-21 11:06:51,733][14919] Updated weights for policy 0, policy_version 56465 (0.0018) [2024-03-21 11:06:52,805][14687] Fps is (10 sec: 55704.7, 60 sec: 50244.1, 300 sec: 47208.1). Total num frames: 1850343424. Throughput: 0: 45682.1. Samples: 55365400. Policy #0 lag: (min: 0.0, avg: 44.0, max: 81.0) [2024-03-21 11:06:52,806][14687] Avg episode reward: [(0, '0.917')] [2024-03-21 11:06:52,855][14898] Signal inference workers to stop experience collection... (1050 times) [2024-03-21 11:06:52,897][14919] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-03-21 11:06:53,082][14898] Signal inference workers to resume experience collection... 
(1050 times) [2024-03-21 11:06:53,082][14919] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-03-21 11:06:57,805][14687] Fps is (10 sec: 55705.7, 60 sec: 48059.8, 300 sec: 47097.1). Total num frames: 1850507264. Throughput: 0: 42868.8. Samples: 55514600. Policy #0 lag: (min: 0.0, avg: 44.0, max: 81.0) [2024-03-21 11:06:57,806][14687] Avg episode reward: [(0, '1.301')] [2024-03-21 11:07:02,805][14687] Fps is (10 sec: 19661.2, 60 sec: 42598.5, 300 sec: 46208.4). Total num frames: 1850540032. Throughput: 0: 46877.9. Samples: 55855600. Policy #0 lag: (min: 0.0, avg: 44.0, max: 81.0) [2024-03-21 11:07:02,805][14687] Avg episode reward: [(0, '1.301')] [2024-03-21 11:07:04,092][14919] Updated weights for policy 0, policy_version 56475 (0.0014) [2024-03-21 11:07:07,805][14687] Fps is (10 sec: 16384.1, 60 sec: 40413.9, 300 sec: 46208.4). Total num frames: 1850671104. Throughput: 0: 47431.1. Samples: 56152600. Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 11:07:07,814][14687] Avg episode reward: [(0, '0.902')] [2024-03-21 11:07:11,905][14919] Updated weights for policy 0, policy_version 56485 (0.0019) [2024-03-21 11:07:12,805][14687] Fps is (10 sec: 45875.3, 60 sec: 41506.2, 300 sec: 46874.9). Total num frames: 1850998784. Throughput: 0: 44573.4. Samples: 56292300. Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 11:07:12,805][14687] Avg episode reward: [(0, '1.521')] [2024-03-21 11:07:14,816][14919] Updated weights for policy 0, policy_version 56495 (0.0011) [2024-03-21 11:07:17,805][14687] Fps is (10 sec: 85196.3, 60 sec: 47513.6, 300 sec: 47652.5). Total num frames: 1851523072. Throughput: 0: 46517.8. Samples: 56537200. 
Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 11:07:17,806][14687] Avg episode reward: [(0, '1.605')] [2024-03-21 11:07:18,239][14919] Updated weights for policy 0, policy_version 56505 (0.0011) [2024-03-21 11:07:21,818][14919] Updated weights for policy 0, policy_version 56515 (0.0010) [2024-03-21 11:07:22,805][14687] Fps is (10 sec: 88472.8, 60 sec: 48605.8, 300 sec: 48207.8). Total num frames: 1851883520. Throughput: 0: 45684.4. Samples: 56804500. Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 11:07:22,814][14687] Avg episode reward: [(0, '1.605')] [2024-03-21 11:07:27,805][14687] Fps is (10 sec: 62259.0, 60 sec: 48059.6, 300 sec: 48318.9). Total num frames: 1852145664. Throughput: 0: 45937.7. Samples: 56959200. Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 11:07:27,806][14687] Avg episode reward: [(0, '1.605')] [2024-03-21 11:07:31,631][14919] Updated weights for policy 0, policy_version 56525 (0.0011) [2024-03-21 11:07:32,805][14687] Fps is (10 sec: 39321.6, 60 sec: 48059.8, 300 sec: 47985.7). Total num frames: 1852276736. Throughput: 0: 47582.3. Samples: 57282800. Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 11:07:32,806][14687] Avg episode reward: [(0, '1.282')] [2024-03-21 11:07:37,805][14687] Fps is (10 sec: 36044.8, 60 sec: 48605.8, 300 sec: 47763.5). Total num frames: 1852506112. Throughput: 0: 49357.9. Samples: 57586500. Policy #0 lag: (min: 1.0, avg: 34.2, max: 73.0) [2024-03-21 11:07:37,806][14687] Avg episode reward: [(0, '1.282')] [2024-03-21 11:07:37,976][14919] Updated weights for policy 0, policy_version 56535 (0.0019) [2024-03-21 11:07:38,794][14898] Signal inference workers to stop experience collection... (1100 times) [2024-03-21 11:07:38,855][14919] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-03-21 11:07:39,035][14898] Signal inference workers to resume experience collection... 
(1100 times) [2024-03-21 11:07:39,036][14919] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-03-21 11:07:42,790][14919] Updated weights for policy 0, policy_version 56545 (0.0011) [2024-03-21 11:07:42,805][14687] Fps is (10 sec: 58982.3, 60 sec: 51336.5, 300 sec: 47763.5). Total num frames: 1852866560. Throughput: 0: 51548.9. Samples: 57834300. Policy #0 lag: (min: 1.0, avg: 44.5, max: 83.0) [2024-03-21 11:07:42,805][14687] Avg episode reward: [(0, '1.197')] [2024-03-21 11:07:47,805][14687] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 47319.2). Total num frames: 1852997632. Throughput: 0: 47333.2. Samples: 57985600. Policy #0 lag: (min: 1.0, avg: 44.5, max: 83.0) [2024-03-21 11:07:47,806][14687] Avg episode reward: [(0, '1.377')] [2024-03-21 11:07:52,805][14687] Fps is (10 sec: 26214.4, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 1853128704. Throughput: 0: 47808.8. Samples: 58304000. Policy #0 lag: (min: 1.0, avg: 44.5, max: 83.0) [2024-03-21 11:07:52,806][14687] Avg episode reward: [(0, '0.746')] [2024-03-21 11:07:53,388][14919] Updated weights for policy 0, policy_version 56555 (0.0011) [2024-03-21 11:07:57,805][14687] Fps is (10 sec: 32768.2, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1853325312. Throughput: 0: 51597.7. Samples: 58614200. Policy #0 lag: (min: 1.0, avg: 44.5, max: 83.0) [2024-03-21 11:07:57,806][14687] Avg episode reward: [(0, '1.561')] [2024-03-21 11:08:02,805][14687] Fps is (10 sec: 29491.4, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 1853423616. Throughput: 0: 49266.7. Samples: 58754200. Policy #0 lag: (min: 1.0, avg: 44.5, max: 83.0) [2024-03-21 11:08:02,805][14687] Avg episode reward: [(0, '1.438')] [2024-03-21 11:08:04,931][14919] Updated weights for policy 0, policy_version 56565 (0.0010) [2024-03-21 11:08:07,805][14687] Fps is (10 sec: 36044.4, 60 sec: 50244.1, 300 sec: 46541.6). Total num frames: 1853685760. Throughput: 0: 50102.1. Samples: 59059100. 
Policy #0 lag: (min: 1.0, avg: 44.5, max: 83.0) [2024-03-21 11:08:07,806][14687] Avg episode reward: [(0, '0.997')] [2024-03-21 11:08:11,297][14919] Updated weights for policy 0, policy_version 56575 (0.0010) [2024-03-21 11:08:12,805][14687] Fps is (10 sec: 55704.6, 60 sec: 49697.9, 300 sec: 46986.0). Total num frames: 1853980672. Throughput: 0: 50233.2. Samples: 59219700. Policy #0 lag: (min: 1.0, avg: 44.5, max: 83.0) [2024-03-21 11:08:12,806][14687] Avg episode reward: [(0, '0.639')] [2024-03-21 11:08:14,612][14919] Updated weights for policy 0, policy_version 56585 (0.0018) [2024-03-21 11:08:17,805][14687] Fps is (10 sec: 75366.9, 60 sec: 48605.8, 300 sec: 47652.5). Total num frames: 1854439424. Throughput: 0: 48724.4. Samples: 59475400. Policy #0 lag: (min: 5.0, avg: 40.7, max: 97.0) [2024-03-21 11:08:17,806][14687] Avg episode reward: [(0, '1.398')] [2024-03-21 11:08:18,352][14919] Updated weights for policy 0, policy_version 56595 (0.0016) [2024-03-21 11:08:22,805][14687] Fps is (10 sec: 78643.7, 60 sec: 48059.7, 300 sec: 47985.7). Total num frames: 1854767104. Throughput: 0: 48260.0. Samples: 59758200. Policy #0 lag: (min: 5.0, avg: 40.7, max: 97.0) [2024-03-21 11:08:22,806][14687] Avg episode reward: [(0, '0.793')] [2024-03-21 11:08:23,453][14919] Updated weights for policy 0, policy_version 56605 (0.0015) [2024-03-21 11:08:27,426][14898] Signal inference workers to stop experience collection... (1150 times) [2024-03-21 11:08:27,495][14898] Signal inference workers to resume experience collection... (1150 times) [2024-03-21 11:08:27,507][14919] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-03-21 11:08:27,562][14919] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-03-21 11:08:27,805][14687] Fps is (10 sec: 65535.9, 60 sec: 49152.0, 300 sec: 47985.7). Total num frames: 1855094784. Throughput: 0: 45837.7. Samples: 59897000. 
Policy #0 lag: (min: 5.0, avg: 40.7, max: 97.0) [2024-03-21 11:08:27,806][14687] Avg episode reward: [(0, '1.672')] [2024-03-21 11:08:27,821][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056614_1855127552.pth... [2024-03-21 11:08:27,967][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056260_1843527680.pth [2024-03-21 11:08:28,305][14919] Updated weights for policy 0, policy_version 56615 (0.0010) [2024-03-21 11:08:32,805][14687] Fps is (10 sec: 62259.2, 60 sec: 51882.6, 300 sec: 48652.1). Total num frames: 1855389696. Throughput: 0: 49228.9. Samples: 60200900. Policy #0 lag: (min: 5.0, avg: 40.7, max: 97.0) [2024-03-21 11:08:32,806][14687] Avg episode reward: [(0, '1.341')] [2024-03-21 11:08:36,487][14919] Updated weights for policy 0, policy_version 56625 (0.0011) [2024-03-21 11:08:37,805][14687] Fps is (10 sec: 42599.0, 60 sec: 50244.3, 300 sec: 48096.8). Total num frames: 1855520768. Throughput: 0: 48877.9. Samples: 60503500. Policy #0 lag: (min: 5.0, avg: 40.7, max: 97.0) [2024-03-21 11:08:37,805][14687] Avg episode reward: [(0, '1.208')] [2024-03-21 11:08:42,805][14687] Fps is (10 sec: 26214.3, 60 sec: 46421.3, 300 sec: 47985.7). Total num frames: 1855651840. Throughput: 0: 48726.6. Samples: 60806900. Policy #0 lag: (min: 5.0, avg: 40.7, max: 97.0) [2024-03-21 11:08:42,806][14687] Avg episode reward: [(0, '0.939')] [2024-03-21 11:08:47,805][14687] Fps is (10 sec: 22937.3, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1855750144. Throughput: 0: 48891.0. Samples: 60954300. Policy #0 lag: (min: 5.0, avg: 40.7, max: 97.0) [2024-03-21 11:08:47,806][14687] Avg episode reward: [(0, '1.478')] [2024-03-21 11:08:48,369][14919] Updated weights for policy 0, policy_version 56635 (0.0010) [2024-03-21 11:08:52,805][14687] Fps is (10 sec: 39322.0, 60 sec: 48605.9, 300 sec: 46652.8). Total num frames: 1856045056. Throughput: 0: 48220.1. Samples: 61229000. 
Policy #0 lag: (min: 2.0, avg: 37.9, max: 84.0) [2024-03-21 11:08:52,806][14687] Avg episode reward: [(0, '0.623')] [2024-03-21 11:08:53,446][14919] Updated weights for policy 0, policy_version 56645 (0.0011) [2024-03-21 11:08:57,805][14687] Fps is (10 sec: 58983.3, 60 sec: 50244.4, 300 sec: 46986.0). Total num frames: 1856339968. Throughput: 0: 50571.4. Samples: 61495400. Policy #0 lag: (min: 2.0, avg: 37.9, max: 84.0) [2024-03-21 11:08:57,805][14687] Avg episode reward: [(0, '1.332')] [2024-03-21 11:09:02,805][14687] Fps is (10 sec: 36045.0, 60 sec: 49698.1, 300 sec: 46208.4). Total num frames: 1856405504. Throughput: 0: 48235.7. Samples: 61646000. Policy #0 lag: (min: 2.0, avg: 37.9, max: 84.0) [2024-03-21 11:09:02,806][14687] Avg episode reward: [(0, '1.665')] [2024-03-21 11:09:04,991][14919] Updated weights for policy 0, policy_version 56655 (0.0010) [2024-03-21 11:09:07,805][14687] Fps is (10 sec: 19660.7, 60 sec: 47513.8, 300 sec: 46097.4). Total num frames: 1856536576. Throughput: 0: 48435.7. Samples: 61937800. Policy #0 lag: (min: 2.0, avg: 37.9, max: 84.0) [2024-03-21 11:09:07,805][14687] Avg episode reward: [(0, '0.880')] [2024-03-21 11:09:11,058][14919] Updated weights for policy 0, policy_version 56665 (0.0023) [2024-03-21 11:09:12,805][14687] Fps is (10 sec: 55705.2, 60 sec: 49698.2, 300 sec: 46541.7). Total num frames: 1856962560. Throughput: 0: 48022.3. Samples: 62058000. Policy #0 lag: (min: 2.0, avg: 37.9, max: 84.0) [2024-03-21 11:09:12,806][14687] Avg episode reward: [(0, '1.099')] [2024-03-21 11:09:15,263][14919] Updated weights for policy 0, policy_version 56675 (0.0022) [2024-03-21 11:09:17,805][14687] Fps is (10 sec: 65535.9, 60 sec: 45875.3, 300 sec: 46763.8). Total num frames: 1857191936. Throughput: 0: 46657.9. Samples: 62300500. 
Policy #0 lag: (min: 2.0, avg: 37.9, max: 84.0) [2024-03-21 11:09:17,806][14687] Avg episode reward: [(0, '1.718')] [2024-03-21 11:09:21,880][14919] Updated weights for policy 0, policy_version 56685 (0.0019) [2024-03-21 11:09:22,805][14687] Fps is (10 sec: 55706.2, 60 sec: 45875.3, 300 sec: 47430.3). Total num frames: 1857519616. Throughput: 0: 46431.2. Samples: 62592900. Policy #0 lag: (min: 2.0, avg: 37.9, max: 84.0) [2024-03-21 11:09:22,806][14687] Avg episode reward: [(0, '1.718')] [2024-03-21 11:09:25,768][14898] Signal inference workers to stop experience collection... (1200 times) [2024-03-21 11:09:25,839][14919] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-03-21 11:09:26,032][14898] Signal inference workers to resume experience collection... (1200 times) [2024-03-21 11:09:26,033][14919] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-03-21 11:09:27,805][14687] Fps is (10 sec: 49152.0, 60 sec: 43144.6, 300 sec: 47430.3). Total num frames: 1857683456. Throughput: 0: 46269.0. Samples: 62889000. Policy #0 lag: (min: 0.0, avg: 29.9, max: 60.0) [2024-03-21 11:09:27,806][14687] Avg episode reward: [(0, '1.598')] [2024-03-21 11:09:28,889][14919] Updated weights for policy 0, policy_version 56695 (0.0018) [2024-03-21 11:09:32,805][14687] Fps is (10 sec: 45875.1, 60 sec: 43144.7, 300 sec: 47874.6). Total num frames: 1857978368. Throughput: 0: 45831.3. Samples: 63016700. Policy #0 lag: (min: 0.0, avg: 29.9, max: 60.0) [2024-03-21 11:09:32,805][14687] Avg episode reward: [(0, '0.951')] [2024-03-21 11:09:37,423][14919] Updated weights for policy 0, policy_version 56705 (0.0010) [2024-03-21 11:09:37,805][14687] Fps is (10 sec: 45874.5, 60 sec: 43690.5, 300 sec: 47319.2). Total num frames: 1858142208. Throughput: 0: 46308.8. Samples: 63312900. 
Policy #0 lag: (min: 0.0, avg: 29.9, max: 60.0) [2024-03-21 11:09:37,806][14687] Avg episode reward: [(0, '1.569')] [2024-03-21 11:09:41,184][14919] Updated weights for policy 0, policy_version 56715 (0.0017) [2024-03-21 11:09:42,805][14687] Fps is (10 sec: 55705.9, 60 sec: 48059.9, 300 sec: 47430.3). Total num frames: 1858535424. Throughput: 0: 46202.3. Samples: 63574500. Policy #0 lag: (min: 0.0, avg: 29.9, max: 60.0) [2024-03-21 11:09:42,805][14687] Avg episode reward: [(0, '1.332')] [2024-03-21 11:09:47,041][14919] Updated weights for policy 0, policy_version 56725 (0.0012) [2024-03-21 11:09:47,805][14687] Fps is (10 sec: 65537.1, 60 sec: 50790.5, 300 sec: 47208.1). Total num frames: 1858797568. Throughput: 0: 45773.3. Samples: 63705800. Policy #0 lag: (min: 0.0, avg: 29.9, max: 60.0) [2024-03-21 11:09:47,806][14687] Avg episode reward: [(0, '0.724')] [2024-03-21 11:09:52,805][14687] Fps is (10 sec: 42597.4, 60 sec: 48605.8, 300 sec: 46874.9). Total num frames: 1858961408. Throughput: 0: 45113.2. Samples: 63967900. Policy #0 lag: (min: 0.0, avg: 29.9, max: 60.0) [2024-03-21 11:09:52,806][14687] Avg episode reward: [(0, '1.188')] [2024-03-21 11:09:57,805][14687] Fps is (10 sec: 26214.0, 60 sec: 45328.9, 300 sec: 46541.7). Total num frames: 1859059712. Throughput: 0: 45451.0. Samples: 64103300. Policy #0 lag: (min: 0.0, avg: 29.9, max: 60.0) [2024-03-21 11:09:57,806][14687] Avg episode reward: [(0, '1.124')] [2024-03-21 11:09:59,688][14919] Updated weights for policy 0, policy_version 56735 (0.0011) [2024-03-21 11:10:02,805][14687] Fps is (10 sec: 39321.9, 60 sec: 49151.9, 300 sec: 46541.7). Total num frames: 1859354624. Throughput: 0: 46455.5. Samples: 64391000. Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:02,806][14687] Avg episode reward: [(0, '1.284')] [2024-03-21 11:10:04,414][14919] Updated weights for policy 0, policy_version 56745 (0.0016) [2024-03-21 11:10:07,805][14687] Fps is (10 sec: 36045.2, 60 sec: 48059.7, 300 sec: 45875.2). 
Total num frames: 1859420160. Throughput: 0: 46073.2. Samples: 64666200. Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:07,806][14687] Avg episode reward: [(0, '1.107')] [2024-03-21 11:10:12,805][14687] Fps is (10 sec: 32767.9, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 1859682304. Throughput: 0: 45531.0. Samples: 64937900. Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:12,806][14687] Avg episode reward: [(0, '1.424')] [2024-03-21 11:10:14,086][14919] Updated weights for policy 0, policy_version 56755 (0.0015) [2024-03-21 11:10:17,805][14687] Fps is (10 sec: 55705.1, 60 sec: 46421.2, 300 sec: 46874.9). Total num frames: 1859977216. Throughput: 0: 45793.2. Samples: 65077400. Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:17,815][14687] Avg episode reward: [(0, '0.652')] [2024-03-21 11:10:21,025][14919] Updated weights for policy 0, policy_version 56765 (0.0015) [2024-03-21 11:10:22,805][14687] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 46652.7). Total num frames: 1860108288. Throughput: 0: 45480.0. Samples: 65359500. Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:22,806][14687] Avg episode reward: [(0, '0.778')] [2024-03-21 11:10:23,674][14898] Signal inference workers to stop experience collection... (1250 times) [2024-03-21 11:10:23,746][14919] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-03-21 11:10:23,961][14898] Signal inference workers to resume experience collection... (1250 times) [2024-03-21 11:10:23,961][14919] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-03-21 11:10:27,811][14687] Fps is (10 sec: 36023.6, 60 sec: 44232.4, 300 sec: 46874.0). Total num frames: 1860337664. Throughput: 0: 45569.4. Samples: 65625400. 
Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:27,812][14687] Avg episode reward: [(0, '1.289')] [2024-03-21 11:10:28,156][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056775_1860403200.pth... [2024-03-21 11:10:28,157][14919] Updated weights for policy 0, policy_version 56775 (0.0014) [2024-03-21 11:10:28,287][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056435_1849262080.pth [2024-03-21 11:10:32,805][14687] Fps is (10 sec: 58982.8, 60 sec: 45329.0, 300 sec: 47208.1). Total num frames: 1860698112. Throughput: 0: 45097.7. Samples: 65735200. Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:32,806][14687] Avg episode reward: [(0, '1.139')] [2024-03-21 11:10:33,647][14919] Updated weights for policy 0, policy_version 56785 (0.0036) [2024-03-21 11:10:37,805][14687] Fps is (10 sec: 45902.5, 60 sec: 44236.9, 300 sec: 46541.7). Total num frames: 1860796416. Throughput: 0: 46146.8. Samples: 66044500. Policy #0 lag: (min: 2.0, avg: 32.4, max: 71.0) [2024-03-21 11:10:37,806][14687] Avg episode reward: [(0, '0.695')] [2024-03-21 11:10:42,805][14687] Fps is (10 sec: 29491.5, 60 sec: 40960.0, 300 sec: 46986.0). Total num frames: 1860993024. Throughput: 0: 46262.4. Samples: 66185100. Policy #0 lag: (min: 0.0, avg: 46.4, max: 108.0) [2024-03-21 11:10:42,806][14687] Avg episode reward: [(0, '1.328')] [2024-03-21 11:10:43,700][14919] Updated weights for policy 0, policy_version 56795 (0.0014) [2024-03-21 11:10:47,805][14687] Fps is (10 sec: 52429.0, 60 sec: 42052.3, 300 sec: 47430.3). Total num frames: 1861320704. Throughput: 0: 46209.0. Samples: 66470400. 
Policy #0 lag: (min: 0.0, avg: 46.4, max: 108.0) [2024-03-21 11:10:47,806][14687] Avg episode reward: [(0, '1.488')] [2024-03-21 11:10:48,369][14919] Updated weights for policy 0, policy_version 56805 (0.0024) [2024-03-21 11:10:52,287][14919] Updated weights for policy 0, policy_version 56815 (0.0019) [2024-03-21 11:10:52,805][14687] Fps is (10 sec: 72089.8, 60 sec: 45875.4, 300 sec: 47763.6). Total num frames: 1861713920. Throughput: 0: 45889.0. Samples: 66731200. Policy #0 lag: (min: 0.0, avg: 46.4, max: 108.0) [2024-03-21 11:10:52,805][14687] Avg episode reward: [(0, '1.389')] [2024-03-21 11:10:57,805][14687] Fps is (10 sec: 65536.2, 60 sec: 48606.0, 300 sec: 47430.3). Total num frames: 1861976064. Throughput: 0: 46295.7. Samples: 67021200. Policy #0 lag: (min: 0.0, avg: 46.4, max: 108.0) [2024-03-21 11:10:57,806][14687] Avg episode reward: [(0, '1.147')] [2024-03-21 11:11:00,017][14919] Updated weights for policy 0, policy_version 56825 (0.0015) [2024-03-21 11:11:02,805][14687] Fps is (10 sec: 42598.0, 60 sec: 46421.4, 300 sec: 47097.1). Total num frames: 1862139904. Throughput: 0: 46331.2. Samples: 67162300. Policy #0 lag: (min: 0.0, avg: 46.4, max: 108.0) [2024-03-21 11:11:02,805][14687] Avg episode reward: [(0, '1.147')] [2024-03-21 11:11:07,525][14919] Updated weights for policy 0, policy_version 56835 (0.0012) [2024-03-21 11:11:07,805][14687] Fps is (10 sec: 39321.5, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 1862369280. Throughput: 0: 45929.0. Samples: 67426300. Policy #0 lag: (min: 0.0, avg: 46.4, max: 108.0) [2024-03-21 11:11:07,806][14687] Avg episode reward: [(0, '0.777')] [2024-03-21 11:11:12,732][14919] Updated weights for policy 0, policy_version 56845 (0.0016) [2024-03-21 11:11:12,805][14687] Fps is (10 sec: 55705.6, 60 sec: 50244.4, 300 sec: 47541.4). Total num frames: 1862696960. Throughput: 0: 45770.5. Samples: 67684800. 
Policy #0 lag: (min: 0.0, avg: 46.4, max: 108.0) [2024-03-21 11:11:12,806][14687] Avg episode reward: [(0, '1.109')] [2024-03-21 11:11:15,829][14898] Signal inference workers to stop experience collection... (1300 times) [2024-03-21 11:11:15,889][14919] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-03-21 11:11:15,904][14898] Signal inference workers to resume experience collection... (1300 times) [2024-03-21 11:11:15,925][14919] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-03-21 11:11:17,805][14687] Fps is (10 sec: 52428.6, 60 sec: 48605.9, 300 sec: 47208.1). Total num frames: 1862893568. Throughput: 0: 46382.2. Samples: 67822400. Policy #0 lag: (min: 0.0, avg: 51.8, max: 107.0) [2024-03-21 11:11:17,806][14687] Avg episode reward: [(0, '1.699')] [2024-03-21 11:11:22,805][14687] Fps is (10 sec: 26214.4, 60 sec: 47513.7, 300 sec: 46430.6). Total num frames: 1862959104. Throughput: 0: 45855.6. Samples: 68108000. Policy #0 lag: (min: 0.0, avg: 51.8, max: 107.0) [2024-03-21 11:11:22,806][14687] Avg episode reward: [(0, '1.063')] [2024-03-21 11:11:23,632][14919] Updated weights for policy 0, policy_version 56855 (0.0016) [2024-03-21 11:11:27,805][14687] Fps is (10 sec: 26214.3, 60 sec: 46972.1, 300 sec: 46652.8). Total num frames: 1863155712. Throughput: 0: 49564.3. Samples: 68415500. Policy #0 lag: (min: 0.0, avg: 51.8, max: 107.0) [2024-03-21 11:11:27,806][14687] Avg episode reward: [(0, '1.014')] [2024-03-21 11:11:32,805][14687] Fps is (10 sec: 22937.4, 60 sec: 41506.1, 300 sec: 46097.3). Total num frames: 1863188480. Throughput: 0: 46688.8. Samples: 68571400. Policy #0 lag: (min: 0.0, avg: 51.8, max: 107.0) [2024-03-21 11:11:32,806][14687] Avg episode reward: [(0, '0.946')] [2024-03-21 11:11:35,253][14919] Updated weights for policy 0, policy_version 56865 (0.0011) [2024-03-21 11:11:37,805][14687] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 1863548928. Throughput: 0: 46904.3. 
Samples: 68841900. Policy #0 lag: (min: 0.0, avg: 51.8, max: 107.0) [2024-03-21 11:11:37,807][14687] Avg episode reward: [(0, '0.749')] [2024-03-21 11:11:42,105][14919] Updated weights for policy 0, policy_version 56875 (0.0012) [2024-03-21 11:11:42,805][14687] Fps is (10 sec: 55705.3, 60 sec: 45875.1, 300 sec: 46763.8). Total num frames: 1863745536. Throughput: 0: 43986.5. Samples: 69000600. Policy #0 lag: (min: 0.0, avg: 51.8, max: 107.0) [2024-03-21 11:11:42,806][14687] Avg episode reward: [(0, '0.950')] [2024-03-21 11:11:46,790][14919] Updated weights for policy 0, policy_version 56885 (0.0012) [2024-03-21 11:11:47,805][14687] Fps is (10 sec: 55705.6, 60 sec: 46421.3, 300 sec: 46652.8). Total num frames: 1864105984. Throughput: 0: 47551.0. Samples: 69302100. Policy #0 lag: (min: 0.0, avg: 51.8, max: 107.0) [2024-03-21 11:11:47,806][14687] Avg episode reward: [(0, '0.950')] [2024-03-21 11:11:50,229][14919] Updated weights for policy 0, policy_version 56895 (0.0014) [2024-03-21 11:11:52,805][14687] Fps is (10 sec: 81920.9, 60 sec: 47513.5, 300 sec: 47652.5). Total num frames: 1864564736. Throughput: 0: 46955.5. Samples: 69539300. Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:11:52,806][14687] Avg episode reward: [(0, '0.950')] [2024-03-21 11:11:53,585][14919] Updated weights for policy 0, policy_version 56905 (0.0018) [2024-03-21 11:11:57,805][14687] Fps is (10 sec: 85196.3, 60 sec: 49698.0, 300 sec: 48874.3). Total num frames: 1864957952. Throughput: 0: 43857.7. Samples: 69658400. Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:11:57,806][14687] Avg episode reward: [(0, '1.169')] [2024-03-21 11:11:58,187][14919] Updated weights for policy 0, policy_version 56915 (0.0015) [2024-03-21 11:12:00,418][14898] Signal inference workers to stop experience collection... 
(1350 times) [2024-03-21 11:12:00,487][14919] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-03-21 11:12:00,739][14898] Signal inference workers to resume experience collection... (1350 times) [2024-03-21 11:12:00,740][14919] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-03-21 11:12:02,805][14687] Fps is (10 sec: 62259.3, 60 sec: 50790.4, 300 sec: 49207.5). Total num frames: 1865187328. Throughput: 0: 47444.5. Samples: 69957400. Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:12:02,806][14687] Avg episode reward: [(0, '1.457')] [2024-03-21 11:12:07,805][14687] Fps is (10 sec: 29491.4, 60 sec: 48059.7, 300 sec: 48318.9). Total num frames: 1865252864. Throughput: 0: 47591.1. Samples: 70249600. Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:12:07,806][14687] Avg episode reward: [(0, '1.457')] [2024-03-21 11:12:10,080][14919] Updated weights for policy 0, policy_version 56925 (0.0010) [2024-03-21 11:12:12,805][14687] Fps is (10 sec: 32768.2, 60 sec: 46967.5, 300 sec: 47430.3). Total num frames: 1865515008. Throughput: 0: 43753.4. Samples: 70384400. Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:12:12,805][14687] Avg episode reward: [(0, '1.526')] [2024-03-21 11:12:14,073][14919] Updated weights for policy 0, policy_version 56935 (0.0011) [2024-03-21 11:12:17,805][14687] Fps is (10 sec: 45875.5, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 1865711616. Throughput: 0: 46475.6. Samples: 70662800. Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:12:17,806][14687] Avg episode reward: [(0, '1.293')] [2024-03-21 11:12:22,805][14687] Fps is (10 sec: 36044.4, 60 sec: 48605.8, 300 sec: 46541.7). Total num frames: 1865875456. Throughput: 0: 47520.0. Samples: 70980300. 
Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:12:22,806][14687] Avg episode reward: [(0, '0.982')] [2024-03-21 11:12:27,805][14687] Fps is (10 sec: 16383.8, 60 sec: 45329.0, 300 sec: 46097.3). Total num frames: 1865875456. Throughput: 0: 47266.7. Samples: 71127600. Policy #0 lag: (min: 2.0, avg: 43.3, max: 85.0) [2024-03-21 11:12:27,806][14687] Avg episode reward: [(0, '1.331')] [2024-03-21 11:12:27,979][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056943_1865908224.pth... [2024-03-21 11:12:28,095][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056614_1855127552.pth [2024-03-21 11:12:29,515][14919] Updated weights for policy 0, policy_version 56945 (0.0008) [2024-03-21 11:12:32,805][14687] Fps is (10 sec: 22937.4, 60 sec: 48605.8, 300 sec: 46097.3). Total num frames: 1866104832. Throughput: 0: 47391.0. Samples: 71434700. Policy #0 lag: (min: 0.0, avg: 29.9, max: 82.0) [2024-03-21 11:12:32,806][14687] Avg episode reward: [(0, '0.957')] [2024-03-21 11:12:37,668][14919] Updated weights for policy 0, policy_version 56955 (0.0018) [2024-03-21 11:12:37,805][14687] Fps is (10 sec: 42598.4, 60 sec: 45875.1, 300 sec: 45542.0). Total num frames: 1866301440. Throughput: 0: 48673.2. Samples: 71729600. Policy #0 lag: (min: 0.0, avg: 29.9, max: 82.0) [2024-03-21 11:12:37,806][14687] Avg episode reward: [(0, '0.957')] [2024-03-21 11:12:41,591][14919] Updated weights for policy 0, policy_version 56965 (0.0015) [2024-03-21 11:12:42,805][14687] Fps is (10 sec: 58983.3, 60 sec: 49152.1, 300 sec: 46430.6). Total num frames: 1866694656. Throughput: 0: 52017.9. Samples: 71999200. Policy #0 lag: (min: 0.0, avg: 29.9, max: 82.0) [2024-03-21 11:12:42,806][14687] Avg episode reward: [(0, '0.487')] [2024-03-21 11:12:46,591][14919] Updated weights for policy 0, policy_version 56975 (0.0014) [2024-03-21 11:12:47,805][14687] Fps is (10 sec: 72089.9, 60 sec: 48605.8, 300 sec: 47097.1). 
Total num frames: 1867022336. Throughput: 0: 48719.9. Samples: 72149800. Policy #0 lag: (min: 0.0, avg: 29.9, max: 82.0) [2024-03-21 11:12:47,806][14687] Avg episode reward: [(0, '1.202')] [2024-03-21 11:12:50,118][14919] Updated weights for policy 0, policy_version 56985 (0.0023) [2024-03-21 11:12:52,293][14898] Signal inference workers to stop experience collection... (1400 times) [2024-03-21 11:12:52,371][14919] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-03-21 11:12:52,522][14898] Signal inference workers to resume experience collection... (1400 times) [2024-03-21 11:12:52,522][14919] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-03-21 11:12:52,805][14687] Fps is (10 sec: 81920.9, 60 sec: 49152.1, 300 sec: 48096.8). Total num frames: 1867513856. Throughput: 0: 47293.5. Samples: 72377800. Policy #0 lag: (min: 0.0, avg: 29.9, max: 82.0) [2024-03-21 11:12:52,805][14687] Avg episode reward: [(0, '1.612')] [2024-03-21 11:12:53,413][14919] Updated weights for policy 0, policy_version 56995 (0.0027) [2024-03-21 11:12:57,805][14687] Fps is (10 sec: 72090.4, 60 sec: 46421.4, 300 sec: 48541.1). Total num frames: 1867743232. Throughput: 0: 50902.2. Samples: 72675000. Policy #0 lag: (min: 0.0, avg: 29.9, max: 82.0) [2024-03-21 11:12:57,805][14687] Avg episode reward: [(0, '1.612')] [2024-03-21 11:13:02,805][14687] Fps is (10 sec: 36044.5, 60 sec: 44782.9, 300 sec: 48096.8). Total num frames: 1867874304. Throughput: 0: 48415.5. Samples: 72841500. Policy #0 lag: (min: 0.0, avg: 29.9, max: 82.0) [2024-03-21 11:13:02,806][14687] Avg episode reward: [(0, '1.612')] [2024-03-21 11:13:03,576][14919] Updated weights for policy 0, policy_version 57005 (0.0018) [2024-03-21 11:13:07,805][14687] Fps is (10 sec: 32767.7, 60 sec: 46967.5, 300 sec: 47763.5). Total num frames: 1868070912. Throughput: 0: 47913.3. Samples: 73136400. 
Policy #0 lag: (min: 1.0, avg: 41.5, max: 83.0) [2024-03-21 11:13:07,806][14687] Avg episode reward: [(0, '0.988')] [2024-03-21 11:13:11,717][14919] Updated weights for policy 0, policy_version 57015 (0.0010) [2024-03-21 11:13:12,805][14687] Fps is (10 sec: 39321.9, 60 sec: 45875.2, 300 sec: 46874.9). Total num frames: 1868267520. Throughput: 0: 51095.8. Samples: 73426900. Policy #0 lag: (min: 1.0, avg: 41.5, max: 83.0) [2024-03-21 11:13:12,805][14687] Avg episode reward: [(0, '1.132')] [2024-03-21 11:13:17,805][14687] Fps is (10 sec: 36044.9, 60 sec: 45329.0, 300 sec: 46319.5). Total num frames: 1868431360. Throughput: 0: 47551.2. Samples: 73574500. Policy #0 lag: (min: 1.0, avg: 41.5, max: 83.0) [2024-03-21 11:13:17,806][14687] Avg episode reward: [(0, '1.397')] [2024-03-21 11:13:20,452][14919] Updated weights for policy 0, policy_version 57025 (0.0015) [2024-03-21 11:13:22,805][14687] Fps is (10 sec: 52428.0, 60 sec: 48605.8, 300 sec: 46430.6). Total num frames: 1868791808. Throughput: 0: 47031.1. Samples: 73846000. Policy #0 lag: (min: 1.0, avg: 41.5, max: 83.0) [2024-03-21 11:13:22,806][14687] Avg episode reward: [(0, '1.397')] [2024-03-21 11:13:24,031][14919] Updated weights for policy 0, policy_version 57035 (0.0019) [2024-03-21 11:13:27,805][14687] Fps is (10 sec: 49152.8, 60 sec: 50790.6, 300 sec: 45875.2). Total num frames: 1868922880. Throughput: 0: 47080.1. Samples: 74117800. Policy #0 lag: (min: 1.0, avg: 41.5, max: 83.0) [2024-03-21 11:13:27,805][14687] Avg episode reward: [(0, '1.767')] [2024-03-21 11:13:32,805][14687] Fps is (10 sec: 36044.7, 60 sec: 50790.4, 300 sec: 46208.4). Total num frames: 1869152256. Throughput: 0: 46211.1. Samples: 74229300. Policy #0 lag: (min: 1.0, avg: 41.5, max: 83.0) [2024-03-21 11:13:32,806][14687] Avg episode reward: [(0, '1.677')] [2024-03-21 11:13:35,626][14919] Updated weights for policy 0, policy_version 57045 (0.0016) [2024-03-21 11:13:37,805][14687] Fps is (10 sec: 45874.6, 60 sec: 51336.6, 300 sec: 46541.7). 
Total num frames: 1869381632. Throughput: 0: 47851.0. Samples: 74531100. Policy #0 lag: (min: 1.0, avg: 41.5, max: 83.0) [2024-03-21 11:13:37,806][14687] Avg episode reward: [(0, '1.000')] [2024-03-21 11:13:42,805][14687] Fps is (10 sec: 39322.5, 60 sec: 47513.7, 300 sec: 46763.9). Total num frames: 1869545472. Throughput: 0: 47813.4. Samples: 74826600. Policy #0 lag: (min: 0.0, avg: 40.2, max: 93.0) [2024-03-21 11:13:42,805][14687] Avg episode reward: [(0, '1.596')] [2024-03-21 11:13:43,044][14919] Updated weights for policy 0, policy_version 57055 (0.0017) [2024-03-21 11:13:47,805][14687] Fps is (10 sec: 49152.4, 60 sec: 47513.7, 300 sec: 46874.9). Total num frames: 1869873152. Throughput: 0: 46991.2. Samples: 74956100. Policy #0 lag: (min: 0.0, avg: 40.2, max: 93.0) [2024-03-21 11:13:47,805][14687] Avg episode reward: [(0, '0.608')] [2024-03-21 11:13:48,158][14919] Updated weights for policy 0, policy_version 57065 (0.0025) [2024-03-21 11:13:49,698][14898] Signal inference workers to stop experience collection... (1450 times) [2024-03-21 11:13:49,699][14898] Signal inference workers to resume experience collection... (1450 times) [2024-03-21 11:13:49,763][14919] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-03-21 11:13:49,763][14919] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-03-21 11:13:52,805][14687] Fps is (10 sec: 52427.7, 60 sec: 42598.3, 300 sec: 46541.6). Total num frames: 1870069760. Throughput: 0: 46997.7. Samples: 75251300. Policy #0 lag: (min: 0.0, avg: 40.2, max: 93.0) [2024-03-21 11:13:52,806][14687] Avg episode reward: [(0, '1.382')] [2024-03-21 11:13:57,805][14687] Fps is (10 sec: 32767.3, 60 sec: 40959.9, 300 sec: 46763.8). Total num frames: 1870200832. Throughput: 0: 47562.0. Samples: 75567200. 
Policy #0 lag: (min: 0.0, avg: 40.2, max: 93.0) [2024-03-21 11:13:57,806][14687] Avg episode reward: [(0, '0.919')] [2024-03-21 11:13:58,173][14919] Updated weights for policy 0, policy_version 57075 (0.0010) [2024-03-21 11:14:02,805][14687] Fps is (10 sec: 45875.2, 60 sec: 44236.7, 300 sec: 47430.3). Total num frames: 1870528512. Throughput: 0: 47557.7. Samples: 75714600. Policy #0 lag: (min: 0.0, avg: 40.2, max: 93.0) [2024-03-21 11:14:02,806][14687] Avg episode reward: [(0, '0.919')] [2024-03-21 11:14:02,923][14919] Updated weights for policy 0, policy_version 57085 (0.0028) [2024-03-21 11:14:06,436][14919] Updated weights for policy 0, policy_version 57095 (0.0012) [2024-03-21 11:14:07,805][14687] Fps is (10 sec: 75367.3, 60 sec: 48059.8, 300 sec: 47430.3). Total num frames: 1870954496. Throughput: 0: 46748.9. Samples: 75949700. Policy #0 lag: (min: 0.0, avg: 40.2, max: 93.0) [2024-03-21 11:14:07,806][14687] Avg episode reward: [(0, '1.304')] [2024-03-21 11:14:10,680][14919] Updated weights for policy 0, policy_version 57105 (0.0016) [2024-03-21 11:14:12,805][14687] Fps is (10 sec: 81920.3, 60 sec: 51336.4, 300 sec: 47985.7). Total num frames: 1871347712. Throughput: 0: 46510.9. Samples: 76210800. Policy #0 lag: (min: 0.0, avg: 40.2, max: 93.0) [2024-03-21 11:14:12,806][14687] Avg episode reward: [(0, '1.253')] [2024-03-21 11:14:17,805][14687] Fps is (10 sec: 42598.4, 60 sec: 49152.0, 300 sec: 46986.0). Total num frames: 1871380480. Throughput: 0: 47606.7. Samples: 76371600. Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:17,806][14687] Avg episode reward: [(0, '1.103')] [2024-03-21 11:14:22,805][14687] Fps is (10 sec: 16384.0, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 1871511552. Throughput: 0: 47186.6. Samples: 76654500. 
Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:22,806][14687] Avg episode reward: [(0, '1.608')] [2024-03-21 11:14:24,483][14919] Updated weights for policy 0, policy_version 57115 (0.0012) [2024-03-21 11:14:27,805][14687] Fps is (10 sec: 42598.2, 60 sec: 48059.6, 300 sec: 46874.9). Total num frames: 1871806464. Throughput: 0: 46637.6. Samples: 76925300. Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:27,806][14687] Avg episode reward: [(0, '0.528')] [2024-03-21 11:14:28,108][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057125_1871872000.pth... [2024-03-21 11:14:28,111][14919] Updated weights for policy 0, policy_version 57125 (0.0019) [2024-03-21 11:14:28,186][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056775_1860403200.pth [2024-03-21 11:14:32,805][14687] Fps is (10 sec: 55705.7, 60 sec: 48605.9, 300 sec: 47208.2). Total num frames: 1872068608. Throughput: 0: 46795.5. Samples: 77061900. Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:32,815][14687] Avg episode reward: [(0, '1.654')] [2024-03-21 11:14:37,805][14687] Fps is (10 sec: 36045.0, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 1872166912. Throughput: 0: 45880.1. Samples: 77315900. Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:37,814][14687] Avg episode reward: [(0, '1.388')] [2024-03-21 11:14:38,368][14919] Updated weights for policy 0, policy_version 57135 (0.0013) [2024-03-21 11:14:38,549][14898] Signal inference workers to stop experience collection... (1500 times) [2024-03-21 11:14:38,599][14919] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-03-21 11:14:38,624][14898] Signal inference workers to resume experience collection... 
(1500 times) [2024-03-21 11:14:38,636][14919] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-03-21 11:14:42,810][14687] Fps is (10 sec: 39302.6, 60 sec: 48601.8, 300 sec: 46318.7). Total num frames: 1872461824. Throughput: 0: 41429.0. Samples: 77431700. Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:42,811][14687] Avg episode reward: [(0, '1.023')] [2024-03-21 11:14:47,044][14919] Updated weights for policy 0, policy_version 57145 (0.0012) [2024-03-21 11:14:47,805][14687] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 46097.4). Total num frames: 1872560128. Throughput: 0: 43431.2. Samples: 77669000. Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:47,806][14687] Avg episode reward: [(0, '1.219')] [2024-03-21 11:14:51,589][14919] Updated weights for policy 0, policy_version 57155 (0.0017) [2024-03-21 11:14:52,805][14687] Fps is (10 sec: 42619.1, 60 sec: 46967.5, 300 sec: 46874.9). Total num frames: 1872887808. Throughput: 0: 44046.7. Samples: 77931800. Policy #0 lag: (min: 0.0, avg: 37.8, max: 76.0) [2024-03-21 11:14:52,806][14687] Avg episode reward: [(0, '1.500')] [2024-03-21 11:14:57,805][14687] Fps is (10 sec: 52428.8, 60 sec: 48059.8, 300 sec: 46541.7). Total num frames: 1873084416. Throughput: 0: 41788.9. Samples: 78091300. Policy #0 lag: (min: 0.0, avg: 28.8, max: 76.0) [2024-03-21 11:14:57,806][14687] Avg episode reward: [(0, '0.970')] [2024-03-21 11:15:00,706][14919] Updated weights for policy 0, policy_version 57165 (0.0011) [2024-03-21 11:15:02,805][14687] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 46986.0). Total num frames: 1873281024. Throughput: 0: 44671.1. Samples: 78381800. Policy #0 lag: (min: 0.0, avg: 28.8, max: 76.0) [2024-03-21 11:15:02,806][14687] Avg episode reward: [(0, '0.970')] [2024-03-21 11:15:07,805][14687] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 46763.8). Total num frames: 1873477632. Throughput: 0: 44591.1. Samples: 78661100. 
Policy #0 lag: (min: 0.0, avg: 28.8, max: 76.0) [2024-03-21 11:15:07,806][14687] Avg episode reward: [(0, '1.831')] [2024-03-21 11:15:08,899][14919] Updated weights for policy 0, policy_version 57175 (0.0020) [2024-03-21 11:15:12,805][14687] Fps is (10 sec: 42598.6, 60 sec: 39321.6, 300 sec: 46541.7). Total num frames: 1873707008. Throughput: 0: 41586.7. Samples: 78796700. Policy #0 lag: (min: 0.0, avg: 28.8, max: 76.0) [2024-03-21 11:15:12,806][14687] Avg episode reward: [(0, '1.635')] [2024-03-21 11:15:15,749][14919] Updated weights for policy 0, policy_version 57185 (0.0014) [2024-03-21 11:15:17,805][14687] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 46763.8). Total num frames: 1873903616. Throughput: 0: 44408.9. Samples: 79060300. Policy #0 lag: (min: 0.0, avg: 28.8, max: 76.0) [2024-03-21 11:15:17,806][14687] Avg episode reward: [(0, '1.245')] [2024-03-21 11:15:22,698][14919] Updated weights for policy 0, policy_version 57195 (0.0018) [2024-03-21 11:15:22,805][14687] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 46875.8). Total num frames: 1874165760. Throughput: 0: 44995.5. Samples: 79340700. Policy #0 lag: (min: 0.0, avg: 28.8, max: 76.0) [2024-03-21 11:15:22,806][14687] Avg episode reward: [(0, '1.513')] [2024-03-21 11:15:26,124][14919] Updated weights for policy 0, policy_version 57205 (0.0014) [2024-03-21 11:15:26,714][14898] Signal inference workers to stop experience collection... (1550 times) [2024-03-21 11:15:26,788][14919] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-03-21 11:15:26,981][14898] Signal inference workers to resume experience collection... (1550 times) [2024-03-21 11:15:26,982][14919] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-03-21 11:15:27,805][14687] Fps is (10 sec: 75366.1, 60 sec: 47513.6, 300 sec: 47319.2). Total num frames: 1874657280. Throughput: 0: 44813.7. Samples: 79448100. 
Policy #0 lag: (min: 0.0, avg: 28.8, max: 76.0) [2024-03-21 11:15:27,806][14687] Avg episode reward: [(0, '0.640')] [2024-03-21 11:15:32,458][14919] Updated weights for policy 0, policy_version 57215 (0.0018) [2024-03-21 11:15:32,805][14687] Fps is (10 sec: 68813.1, 60 sec: 46421.3, 300 sec: 47652.4). Total num frames: 1874853888. Throughput: 0: 45915.5. Samples: 79735200. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:15:32,805][14687] Avg episode reward: [(0, '0.640')] [2024-03-21 11:15:37,805][14687] Fps is (10 sec: 32767.8, 60 sec: 46967.4, 300 sec: 47430.3). Total num frames: 1874984960. Throughput: 0: 46204.4. Samples: 80011000. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:15:37,806][14687] Avg episode reward: [(0, '1.484')] [2024-03-21 11:15:42,805][14687] Fps is (10 sec: 22937.7, 60 sec: 43694.2, 300 sec: 46652.7). Total num frames: 1875083264. Throughput: 0: 45591.1. Samples: 80142900. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:15:42,806][14687] Avg episode reward: [(0, '1.508')] [2024-03-21 11:15:44,520][14919] Updated weights for policy 0, policy_version 57225 (0.0011) [2024-03-21 11:15:47,805][14687] Fps is (10 sec: 36044.9, 60 sec: 46421.3, 300 sec: 46208.4). Total num frames: 1875345408. Throughput: 0: 44751.1. Samples: 80395600. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:15:47,806][14687] Avg episode reward: [(0, '1.597')] [2024-03-21 11:15:49,374][14919] Updated weights for policy 0, policy_version 57235 (0.0011) [2024-03-21 11:15:52,805][14687] Fps is (10 sec: 55705.4, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1875640320. Throughput: 0: 44435.6. Samples: 80660700. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:15:52,806][14687] Avg episode reward: [(0, '1.083')] [2024-03-21 11:15:55,400][14919] Updated weights for policy 0, policy_version 57245 (0.0014) [2024-03-21 11:15:57,805][14687] Fps is (10 sec: 62259.4, 60 sec: 48059.7, 300 sec: 46874.9). 
Total num frames: 1875968000. Throughput: 0: 44573.3. Samples: 80802500. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:15:57,806][14687] Avg episode reward: [(0, '0.863')] [2024-03-21 11:16:01,729][14919] Updated weights for policy 0, policy_version 57255 (0.0019) [2024-03-21 11:16:02,805][14687] Fps is (10 sec: 52428.6, 60 sec: 48059.8, 300 sec: 46763.8). Total num frames: 1876164608. Throughput: 0: 44884.4. Samples: 81080100. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:16:02,806][14687] Avg episode reward: [(0, '1.346')] [2024-03-21 11:16:07,805][14687] Fps is (10 sec: 22937.6, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1876197376. Throughput: 0: 45406.7. Samples: 81384000. Policy #0 lag: (min: 0.0, avg: 39.2, max: 75.0) [2024-03-21 11:16:07,806][14687] Avg episode reward: [(0, '1.568')] [2024-03-21 11:16:12,805][14687] Fps is (10 sec: 13107.2, 60 sec: 43144.5, 300 sec: 45430.9). Total num frames: 1876295680. Throughput: 0: 50213.3. Samples: 81707700. Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-21 11:16:12,806][14687] Avg episode reward: [(0, '1.568')] [2024-03-21 11:16:17,748][14919] Updated weights for policy 0, policy_version 57265 (0.0014) [2024-03-21 11:16:17,805][14687] Fps is (10 sec: 26213.3, 60 sec: 42598.1, 300 sec: 45764.0). Total num frames: 1876459520. Throughput: 0: 47388.5. Samples: 81867700. Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-21 11:16:17,807][14687] Avg episode reward: [(0, '1.568')] [2024-03-21 11:16:21,088][14919] Updated weights for policy 0, policy_version 57275 (0.0030) [2024-03-21 11:16:22,805][14687] Fps is (10 sec: 62259.6, 60 sec: 45875.3, 300 sec: 46652.8). Total num frames: 1876918272. Throughput: 0: 47162.3. Samples: 82133300. 
Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-21 11:16:22,805][14687] Avg episode reward: [(0, '1.318')] [2024-03-21 11:16:24,959][14919] Updated weights for policy 0, policy_version 57285 (0.0023) [2024-03-21 11:16:27,193][14898] Signal inference workers to stop experience collection... (1600 times) [2024-03-21 11:16:27,246][14919] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-03-21 11:16:27,268][14898] Signal inference workers to resume experience collection... (1600 times) [2024-03-21 11:16:27,296][14919] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-03-21 11:16:27,805][14687] Fps is (10 sec: 88476.7, 60 sec: 44782.9, 300 sec: 47985.7). Total num frames: 1877344256. Throughput: 0: 47426.5. Samples: 82277100. Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-21 11:16:27,806][14687] Avg episode reward: [(0, '1.318')] [2024-03-21 11:16:27,887][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057293_1877377024.pth... [2024-03-21 11:16:28,012][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000056943_1865908224.pth [2024-03-21 11:16:29,362][14919] Updated weights for policy 0, policy_version 57295 (0.0014) [2024-03-21 11:16:32,805][14687] Fps is (10 sec: 72089.8, 60 sec: 46421.4, 300 sec: 47763.5). Total num frames: 1877639168. Throughput: 0: 48237.9. Samples: 82566300. Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-21 11:16:32,805][14687] Avg episode reward: [(0, '1.288')] [2024-03-21 11:16:33,821][14919] Updated weights for policy 0, policy_version 57305 (0.0014) [2024-03-21 11:16:37,805][14687] Fps is (10 sec: 72090.9, 60 sec: 51336.7, 300 sec: 48541.1). Total num frames: 1878065152. Throughput: 0: 47853.4. Samples: 82814100. 
Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-21 11:16:37,805][14687] Avg episode reward: [(0, '1.100')] [2024-03-21 11:16:42,805][14687] Fps is (10 sec: 42598.1, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 1878065152. Throughput: 0: 48033.4. Samples: 82964000. Policy #0 lag: (min: 0.0, avg: 34.9, max: 83.0) [2024-03-21 11:16:42,806][14687] Avg episode reward: [(0, '0.897')] [2024-03-21 11:16:42,914][14919] Updated weights for policy 0, policy_version 57315 (0.0012) [2024-03-21 11:16:47,805][14687] Fps is (10 sec: 29490.8, 60 sec: 50244.3, 300 sec: 46763.8). Total num frames: 1878360064. Throughput: 0: 48326.7. Samples: 83254800. Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:16:47,806][14687] Avg episode reward: [(0, '0.821')] [2024-03-21 11:16:52,318][14919] Updated weights for policy 0, policy_version 57325 (0.0015) [2024-03-21 11:16:52,805][14687] Fps is (10 sec: 39321.5, 60 sec: 46967.4, 300 sec: 45764.1). Total num frames: 1878458368. Throughput: 0: 48497.8. Samples: 83566400. Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:16:52,806][14687] Avg episode reward: [(0, '1.296')] [2024-03-21 11:16:57,805][14687] Fps is (10 sec: 29491.4, 60 sec: 44783.0, 300 sec: 45653.0). Total num frames: 1878654976. Throughput: 0: 44628.9. Samples: 83716000. Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:16:57,806][14687] Avg episode reward: [(0, '1.117')] [2024-03-21 11:17:00,479][14919] Updated weights for policy 0, policy_version 57335 (0.0012) [2024-03-21 11:17:02,805][14687] Fps is (10 sec: 36045.0, 60 sec: 44236.8, 300 sec: 45986.3). Total num frames: 1878818816. Throughput: 0: 47604.9. Samples: 84009900. Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:17:02,806][14687] Avg episode reward: [(0, '1.570')] [2024-03-21 11:17:07,805][14687] Fps is (10 sec: 36044.9, 60 sec: 46967.5, 300 sec: 45764.1). Total num frames: 1879015424. Throughput: 0: 48584.5. Samples: 84319600. 
Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:17:07,806][14687] Avg episode reward: [(0, '1.570')] [2024-03-21 11:17:08,132][14919] Updated weights for policy 0, policy_version 57345 (0.0019) [2024-03-21 11:17:12,805][14687] Fps is (10 sec: 55706.1, 60 sec: 51336.7, 300 sec: 46319.5). Total num frames: 1879375872. Throughput: 0: 49060.2. Samples: 84484800. Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:17:12,805][14687] Avg episode reward: [(0, '1.570')] [2024-03-21 11:17:13,969][14919] Updated weights for policy 0, policy_version 57355 (0.0016) [2024-03-21 11:17:17,805][14687] Fps is (10 sec: 58981.5, 60 sec: 52429.1, 300 sec: 46541.7). Total num frames: 1879605248. Throughput: 0: 48819.8. Samples: 84763200. Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:17:17,806][14687] Avg episode reward: [(0, '0.916')] [2024-03-21 11:17:19,965][14919] Updated weights for policy 0, policy_version 57365 (0.0014) [2024-03-21 11:17:21,132][14898] Signal inference workers to stop experience collection... (1650 times) [2024-03-21 11:17:21,201][14898] Signal inference workers to resume experience collection... (1650 times) [2024-03-21 11:17:21,219][14919] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-03-21 11:17:21,279][14919] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-03-21 11:17:22,805][14687] Fps is (10 sec: 58981.4, 60 sec: 50790.3, 300 sec: 47763.5). Total num frames: 1879965696. Throughput: 0: 49397.6. Samples: 85037000. Policy #0 lag: (min: 0.0, avg: 47.5, max: 93.0) [2024-03-21 11:17:22,806][14687] Avg episode reward: [(0, '0.608')] [2024-03-21 11:17:24,989][14919] Updated weights for policy 0, policy_version 57375 (0.0012) [2024-03-21 11:17:27,805][14687] Fps is (10 sec: 62258.2, 60 sec: 48059.6, 300 sec: 47874.6). Total num frames: 1880227840. Throughput: 0: 49337.5. Samples: 85184200. 
Policy #0 lag: (min: 2.0, avg: 45.9, max: 91.0) [2024-03-21 11:17:27,806][14687] Avg episode reward: [(0, '1.111')] [2024-03-21 11:17:29,960][14919] Updated weights for policy 0, policy_version 57385 (0.0021) [2024-03-21 11:17:32,805][14687] Fps is (10 sec: 58982.6, 60 sec: 48605.8, 300 sec: 48318.9). Total num frames: 1880555520. Throughput: 0: 49164.4. Samples: 85467200. Policy #0 lag: (min: 2.0, avg: 45.9, max: 91.0) [2024-03-21 11:17:32,806][14687] Avg episode reward: [(0, '0.992')] [2024-03-21 11:17:35,467][14919] Updated weights for policy 0, policy_version 57395 (0.0012) [2024-03-21 11:17:37,805][14687] Fps is (10 sec: 62260.4, 60 sec: 46421.2, 300 sec: 47985.7). Total num frames: 1880850432. Throughput: 0: 48722.2. Samples: 85758900. Policy #0 lag: (min: 2.0, avg: 45.9, max: 91.0) [2024-03-21 11:17:37,806][14687] Avg episode reward: [(0, '0.992')] [2024-03-21 11:17:42,805][14687] Fps is (10 sec: 42598.9, 60 sec: 48605.9, 300 sec: 47319.2). Total num frames: 1880981504. Throughput: 0: 51657.8. Samples: 86040600. Policy #0 lag: (min: 2.0, avg: 45.9, max: 91.0) [2024-03-21 11:17:42,805][14687] Avg episode reward: [(0, '1.348')] [2024-03-21 11:17:43,810][14919] Updated weights for policy 0, policy_version 57405 (0.0014) [2024-03-21 11:17:47,805][14687] Fps is (10 sec: 22937.8, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1881079808. Throughput: 0: 48402.2. Samples: 86188000. Policy #0 lag: (min: 2.0, avg: 45.9, max: 91.0) [2024-03-21 11:17:47,805][14687] Avg episode reward: [(0, '1.627')] [2024-03-21 11:17:52,805][14687] Fps is (10 sec: 22937.5, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1881210880. Throughput: 0: 47717.7. Samples: 86466900. Policy #0 lag: (min: 2.0, avg: 45.9, max: 91.0) [2024-03-21 11:17:52,806][14687] Avg episode reward: [(0, '1.299')] [2024-03-21 11:17:55,039][14919] Updated weights for policy 0, policy_version 57415 (0.0022) [2024-03-21 11:17:57,805][14687] Fps is (10 sec: 55705.6, 60 sec: 49698.1, 300 sec: 46652.7). 
Total num frames: 1881636864. Throughput: 0: 46995.5. Samples: 86599600. Policy #0 lag: (min: 2.0, avg: 45.9, max: 91.0) [2024-03-21 11:17:57,805][14687] Avg episode reward: [(0, '1.413')] [2024-03-21 11:17:58,238][14919] Updated weights for policy 0, policy_version 57425 (0.0025) [2024-03-21 11:18:02,805][14687] Fps is (10 sec: 68813.0, 60 sec: 51336.6, 300 sec: 46874.9). Total num frames: 1881899008. Throughput: 0: 45935.7. Samples: 86830300. Policy #0 lag: (min: 3.0, avg: 37.0, max: 91.0) [2024-03-21 11:18:02,805][14687] Avg episode reward: [(0, '1.587')] [2024-03-21 11:18:05,352][14919] Updated weights for policy 0, policy_version 57435 (0.0018) [2024-03-21 11:18:07,805][14687] Fps is (10 sec: 45874.9, 60 sec: 51336.5, 300 sec: 46874.9). Total num frames: 1882095616. Throughput: 0: 46188.9. Samples: 87115500. Policy #0 lag: (min: 3.0, avg: 37.0, max: 91.0) [2024-03-21 11:18:07,806][14687] Avg episode reward: [(0, '1.587')] [2024-03-21 11:18:12,805][14687] Fps is (10 sec: 39321.3, 60 sec: 48605.8, 300 sec: 46986.0). Total num frames: 1882292224. Throughput: 0: 45729.1. Samples: 87242000. Policy #0 lag: (min: 3.0, avg: 37.0, max: 91.0) [2024-03-21 11:18:12,806][14687] Avg episode reward: [(0, '0.568')] [2024-03-21 11:18:15,160][14898] Signal inference workers to stop experience collection... (1700 times) [2024-03-21 11:18:15,204][14919] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-03-21 11:18:15,492][14898] Signal inference workers to resume experience collection... (1700 times) [2024-03-21 11:18:15,493][14919] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-03-21 11:18:15,802][14919] Updated weights for policy 0, policy_version 57445 (0.0015) [2024-03-21 11:18:17,805][14687] Fps is (10 sec: 29491.2, 60 sec: 46421.4, 300 sec: 46097.4). Total num frames: 1882390528. Throughput: 0: 45728.9. Samples: 87525000. 
Policy #0 lag: (min: 3.0, avg: 37.0, max: 91.0) [2024-03-21 11:18:17,806][14687] Avg episode reward: [(0, '1.187')] [2024-03-21 11:18:22,805][14687] Fps is (10 sec: 32768.0, 60 sec: 44236.8, 300 sec: 46430.6). Total num frames: 1882619904. Throughput: 0: 45508.9. Samples: 87806800. Policy #0 lag: (min: 3.0, avg: 37.0, max: 91.0) [2024-03-21 11:18:22,806][14687] Avg episode reward: [(0, '1.477')] [2024-03-21 11:18:24,330][14919] Updated weights for policy 0, policy_version 57455 (0.0021) [2024-03-21 11:18:27,805][14687] Fps is (10 sec: 36045.0, 60 sec: 42052.5, 300 sec: 46097.4). Total num frames: 1882750976. Throughput: 0: 42353.3. Samples: 87946500. Policy #0 lag: (min: 3.0, avg: 37.0, max: 91.0) [2024-03-21 11:18:27,805][14687] Avg episode reward: [(0, '1.284')] [2024-03-21 11:18:27,819][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057457_1882750976.pth... [2024-03-21 11:18:27,998][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057125_1871872000.pth [2024-03-21 11:18:32,153][14919] Updated weights for policy 0, policy_version 57465 (0.0011) [2024-03-21 11:18:32,805][14687] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 46208.4). Total num frames: 1883013120. Throughput: 0: 45162.2. Samples: 88220300. Policy #0 lag: (min: 3.0, avg: 37.0, max: 91.0) [2024-03-21 11:18:32,806][14687] Avg episode reward: [(0, '1.105')] [2024-03-21 11:18:37,805][14687] Fps is (10 sec: 52429.0, 60 sec: 40413.9, 300 sec: 46541.7). Total num frames: 1883275264. Throughput: 0: 44904.5. Samples: 88487600. Policy #0 lag: (min: 2.0, avg: 29.0, max: 68.0) [2024-03-21 11:18:37,805][14687] Avg episode reward: [(0, '1.223')] [2024-03-21 11:18:38,198][14919] Updated weights for policy 0, policy_version 57475 (0.0011) [2024-03-21 11:18:41,497][14919] Updated weights for policy 0, policy_version 57485 (0.0037) [2024-03-21 11:18:42,805][14687] Fps is (10 sec: 78643.8, 60 sec: 46967.4, 300 sec: 47208.1). 
Total num frames: 1883799552. Throughput: 0: 47224.5. Samples: 88724700. Policy #0 lag: (min: 2.0, avg: 29.0, max: 68.0) [2024-03-21 11:18:42,806][14687] Avg episode reward: [(0, '0.624')] [2024-03-21 11:18:47,805][14687] Fps is (10 sec: 65535.2, 60 sec: 47513.5, 300 sec: 46986.0). Total num frames: 1883930624. Throughput: 0: 45268.7. Samples: 88867400. Policy #0 lag: (min: 2.0, avg: 29.0, max: 68.0) [2024-03-21 11:18:47,806][14687] Avg episode reward: [(0, '1.491')] [2024-03-21 11:18:48,545][14919] Updated weights for policy 0, policy_version 57495 (0.0015) [2024-03-21 11:18:52,805][14687] Fps is (10 sec: 32768.0, 60 sec: 48605.9, 300 sec: 47208.2). Total num frames: 1884127232. Throughput: 0: 45035.7. Samples: 89142100. Policy #0 lag: (min: 2.0, avg: 29.0, max: 68.0) [2024-03-21 11:18:52,806][14687] Avg episode reward: [(0, '1.120')] [2024-03-21 11:18:57,430][14919] Updated weights for policy 0, policy_version 57505 (0.0011) [2024-03-21 11:18:57,805][14687] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 46874.9). Total num frames: 1884356608. Throughput: 0: 45484.4. Samples: 89288800. Policy #0 lag: (min: 2.0, avg: 29.0, max: 68.0) [2024-03-21 11:18:57,806][14687] Avg episode reward: [(0, '0.895')] [2024-03-21 11:19:02,805][14687] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 46097.4). Total num frames: 1884553216. Throughput: 0: 44926.7. Samples: 89546700. Policy #0 lag: (min: 2.0, avg: 29.0, max: 68.0) [2024-03-21 11:19:02,806][14687] Avg episode reward: [(0, '1.607')] [2024-03-21 11:19:04,617][14919] Updated weights for policy 0, policy_version 57515 (0.0013) [2024-03-21 11:19:07,347][14898] Signal inference workers to stop experience collection... (1750 times) [2024-03-21 11:19:07,425][14919] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-03-21 11:19:07,575][14898] Signal inference workers to resume experience collection... 
(1750 times) [2024-03-21 11:19:07,576][14919] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-03-21 11:19:07,805][14687] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 45653.0). Total num frames: 1884815360. Throughput: 0: 43853.3. Samples: 89780200. Policy #0 lag: (min: 2.0, avg: 29.0, max: 68.0) [2024-03-21 11:19:07,806][14687] Avg episode reward: [(0, '1.026')] [2024-03-21 11:19:12,805][14687] Fps is (10 sec: 39321.7, 60 sec: 44236.9, 300 sec: 45986.3). Total num frames: 1884946432. Throughput: 0: 43435.6. Samples: 89901100. Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:12,805][14687] Avg episode reward: [(0, '0.835')] [2024-03-21 11:19:13,492][14919] Updated weights for policy 0, policy_version 57525 (0.0015) [2024-03-21 11:19:17,805][14687] Fps is (10 sec: 39322.0, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 1885208576. Throughput: 0: 42395.6. Samples: 90128100. Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:17,805][14687] Avg episode reward: [(0, '1.253')] [2024-03-21 11:19:20,043][14919] Updated weights for policy 0, policy_version 57535 (0.0021) [2024-03-21 11:19:22,805][14687] Fps is (10 sec: 49151.9, 60 sec: 46967.5, 300 sec: 46208.5). Total num frames: 1885437952. Throughput: 0: 42528.9. Samples: 90401400. Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:22,806][14687] Avg episode reward: [(0, '1.345')] [2024-03-21 11:19:27,805][14687] Fps is (10 sec: 26214.4, 60 sec: 45329.1, 300 sec: 45430.9). Total num frames: 1885470720. Throughput: 0: 40737.7. Samples: 90557900. Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:27,806][14687] Avg episode reward: [(0, '1.430')] [2024-03-21 11:19:30,717][14919] Updated weights for policy 0, policy_version 57545 (0.0017) [2024-03-21 11:19:32,805][14687] Fps is (10 sec: 32768.2, 60 sec: 45875.3, 300 sec: 46097.4). Total num frames: 1885765632. Throughput: 0: 44053.5. Samples: 90849800. 
Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:32,805][14687] Avg episode reward: [(0, '1.430')] [2024-03-21 11:19:36,290][14919] Updated weights for policy 0, policy_version 57555 (0.0020) [2024-03-21 11:19:37,805][14687] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 45876.0). Total num frames: 1885995008. Throughput: 0: 44144.4. Samples: 91128600. Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:37,805][14687] Avg episode reward: [(0, '1.304')] [2024-03-21 11:19:42,805][14687] Fps is (10 sec: 32768.1, 60 sec: 38229.4, 300 sec: 45875.2). Total num frames: 1886093312. Throughput: 0: 46993.6. Samples: 91403500. Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:42,805][14687] Avg episode reward: [(0, '1.154')] [2024-03-21 11:19:46,588][14919] Updated weights for policy 0, policy_version 57565 (0.0012) [2024-03-21 11:19:47,805][14687] Fps is (10 sec: 36044.6, 60 sec: 40413.9, 300 sec: 45653.0). Total num frames: 1886355456. Throughput: 0: 44073.3. Samples: 91530000. Policy #0 lag: (min: 0.0, avg: 42.6, max: 109.0) [2024-03-21 11:19:47,806][14687] Avg episode reward: [(0, '0.914')] [2024-03-21 11:19:51,284][14919] Updated weights for policy 0, policy_version 57575 (0.0016) [2024-03-21 11:19:52,805][14687] Fps is (10 sec: 55704.5, 60 sec: 42052.2, 300 sec: 45986.3). Total num frames: 1886650368. Throughput: 0: 44077.8. Samples: 91763700. Policy #0 lag: (min: 0.0, avg: 32.1, max: 110.0) [2024-03-21 11:19:52,807][14687] Avg episode reward: [(0, '0.675')] [2024-03-21 11:19:57,052][14919] Updated weights for policy 0, policy_version 57585 (0.0009) [2024-03-21 11:19:57,805][14687] Fps is (10 sec: 58982.7, 60 sec: 43144.6, 300 sec: 46319.5). Total num frames: 1886945280. Throughput: 0: 47646.6. Samples: 92045200. Policy #0 lag: (min: 0.0, avg: 32.1, max: 110.0) [2024-03-21 11:19:57,806][14687] Avg episode reward: [(0, '1.777')] [2024-03-21 11:20:02,728][14898] Signal inference workers to stop experience collection... 
(1800 times) [2024-03-21 11:20:02,793][14919] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-03-21 11:20:02,805][14687] Fps is (10 sec: 58982.4, 60 sec: 44782.9, 300 sec: 46652.7). Total num frames: 1887240192. Throughput: 0: 45606.6. Samples: 92180400. Policy #0 lag: (min: 0.0, avg: 32.1, max: 110.0) [2024-03-21 11:20:02,806][14687] Avg episode reward: [(0, '1.152')] [2024-03-21 11:20:02,992][14898] Signal inference workers to resume experience collection... (1800 times) [2024-03-21 11:20:02,992][14919] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-03-21 11:20:02,995][14919] Updated weights for policy 0, policy_version 57595 (0.0028) [2024-03-21 11:20:07,805][14687] Fps is (10 sec: 58982.2, 60 sec: 45329.1, 300 sec: 46874.9). Total num frames: 1887535104. Throughput: 0: 45368.8. Samples: 92443000. Policy #0 lag: (min: 0.0, avg: 32.1, max: 110.0) [2024-03-21 11:20:07,806][14687] Avg episode reward: [(0, '1.674')] [2024-03-21 11:20:11,119][14919] Updated weights for policy 0, policy_version 57605 (0.0012) [2024-03-21 11:20:12,805][14687] Fps is (10 sec: 42599.0, 60 sec: 45329.1, 300 sec: 46652.8). Total num frames: 1887666176. Throughput: 0: 48424.5. Samples: 92737000. Policy #0 lag: (min: 0.0, avg: 32.1, max: 110.0) [2024-03-21 11:20:12,806][14687] Avg episode reward: [(0, '1.674')] [2024-03-21 11:20:17,805][14687] Fps is (10 sec: 22937.6, 60 sec: 42598.4, 300 sec: 46097.4). Total num frames: 1887764480. Throughput: 0: 45164.3. Samples: 92882200. Policy #0 lag: (min: 0.0, avg: 32.1, max: 110.0) [2024-03-21 11:20:17,806][14687] Avg episode reward: [(0, '1.330')] [2024-03-21 11:20:21,894][14919] Updated weights for policy 0, policy_version 57615 (0.0017) [2024-03-21 11:20:22,805][14687] Fps is (10 sec: 32767.9, 60 sec: 42598.4, 300 sec: 45208.7). Total num frames: 1887993856. Throughput: 0: 45391.1. Samples: 93171200. 
Policy #0 lag: (min: 0.0, avg: 32.1, max: 110.0) [2024-03-21 11:20:22,806][14687] Avg episode reward: [(0, '1.268')] [2024-03-21 11:20:27,805][14687] Fps is (10 sec: 42597.9, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 1888190464. Throughput: 0: 42255.3. Samples: 93305000. Policy #0 lag: (min: 0.0, avg: 32.6, max: 81.0) [2024-03-21 11:20:27,806][14687] Avg episode reward: [(0, '1.115')] [2024-03-21 11:20:27,871][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057624_1888223232.pth... [2024-03-21 11:20:27,984][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057293_1877377024.pth [2024-03-21 11:20:29,556][14919] Updated weights for policy 0, policy_version 57625 (0.0015) [2024-03-21 11:20:32,740][14919] Updated weights for policy 0, policy_version 57635 (0.0023) [2024-03-21 11:20:32,805][14687] Fps is (10 sec: 58982.1, 60 sec: 46967.4, 300 sec: 46097.4). Total num frames: 1888583680. Throughput: 0: 45280.0. Samples: 93567600. Policy #0 lag: (min: 0.0, avg: 32.6, max: 81.0) [2024-03-21 11:20:32,806][14687] Avg episode reward: [(0, '1.427')] [2024-03-21 11:20:37,805][14687] Fps is (10 sec: 55706.3, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1888747520. Throughput: 0: 46217.8. Samples: 93843500. Policy #0 lag: (min: 0.0, avg: 32.6, max: 81.0) [2024-03-21 11:20:37,806][14687] Avg episode reward: [(0, '1.427')] [2024-03-21 11:20:41,388][14919] Updated weights for policy 0, policy_version 57645 (0.0011) [2024-03-21 11:20:42,805][14687] Fps is (10 sec: 36045.2, 60 sec: 47513.6, 300 sec: 46097.4). Total num frames: 1888944128. Throughput: 0: 46431.2. Samples: 94134600. Policy #0 lag: (min: 0.0, avg: 32.6, max: 81.0) [2024-03-21 11:20:42,805][14687] Avg episode reward: [(0, '1.224')] [2024-03-21 11:20:46,758][14919] Updated weights for policy 0, policy_version 57655 (0.0016) [2024-03-21 11:20:47,805][14687] Fps is (10 sec: 52428.4, 60 sec: 48605.8, 300 sec: 46208.4). 
Total num frames: 1889271808. Throughput: 0: 46531.1. Samples: 94274300. Policy #0 lag: (min: 0.0, avg: 32.6, max: 81.0) [2024-03-21 11:20:47,806][14687] Avg episode reward: [(0, '1.259')] [2024-03-21 11:20:52,805][14687] Fps is (10 sec: 45874.5, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1889402880. Throughput: 0: 47080.0. Samples: 94561600. Policy #0 lag: (min: 0.0, avg: 32.6, max: 81.0) [2024-03-21 11:20:52,806][14687] Avg episode reward: [(0, '0.752')] [2024-03-21 11:20:57,805][14687] Fps is (10 sec: 26214.7, 60 sec: 43144.5, 300 sec: 45319.8). Total num frames: 1889533952. Throughput: 0: 43544.4. Samples: 94696500. Policy #0 lag: (min: 0.0, avg: 32.6, max: 81.0) [2024-03-21 11:20:57,806][14687] Avg episode reward: [(0, '0.776')] [2024-03-21 11:20:58,298][14919] Updated weights for policy 0, policy_version 57665 (0.0011) [2024-03-21 11:21:01,718][14898] Signal inference workers to stop experience collection... (1850 times) [2024-03-21 11:21:01,724][14898] Signal inference workers to resume experience collection... (1850 times) [2024-03-21 11:21:01,779][14919] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-03-21 11:21:01,779][14919] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-03-21 11:21:02,805][14687] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 46208.4). Total num frames: 1889828864. Throughput: 0: 47066.7. Samples: 95000200. Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:02,806][14687] Avg episode reward: [(0, '0.872')] [2024-03-21 11:21:03,206][14919] Updated weights for policy 0, policy_version 57675 (0.0010) [2024-03-21 11:21:07,651][14919] Updated weights for policy 0, policy_version 57685 (0.0011) [2024-03-21 11:21:07,805][14687] Fps is (10 sec: 68812.7, 60 sec: 44782.9, 300 sec: 47208.1). Total num frames: 1890222080. Throughput: 0: 46042.2. Samples: 95243100. 
Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:07,806][14687] Avg episode reward: [(0, '0.761')] [2024-03-21 11:21:12,805][14687] Fps is (10 sec: 58983.6, 60 sec: 45875.3, 300 sec: 47319.3). Total num frames: 1890418688. Throughput: 0: 49556.0. Samples: 95535000. Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:12,805][14687] Avg episode reward: [(0, '0.716')] [2024-03-21 11:21:13,880][14919] Updated weights for policy 0, policy_version 57695 (0.0010) [2024-03-21 11:21:17,805][14687] Fps is (10 sec: 45874.7, 60 sec: 48605.8, 300 sec: 46652.7). Total num frames: 1890680832. Throughput: 0: 46962.1. Samples: 95680900. Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:17,806][14687] Avg episode reward: [(0, '0.916')] [2024-03-21 11:21:21,734][14919] Updated weights for policy 0, policy_version 57705 (0.0016) [2024-03-21 11:21:22,805][14687] Fps is (10 sec: 49151.3, 60 sec: 48605.9, 300 sec: 45986.3). Total num frames: 1890910208. Throughput: 0: 47689.0. Samples: 95989500. Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:22,805][14687] Avg episode reward: [(0, '0.916')] [2024-03-21 11:21:27,805][14687] Fps is (10 sec: 39322.6, 60 sec: 48060.0, 300 sec: 45542.0). Total num frames: 1891074048. Throughput: 0: 47640.0. Samples: 96278400. Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:27,805][14687] Avg episode reward: [(0, '0.798')] [2024-03-21 11:21:28,782][14919] Updated weights for policy 0, policy_version 57715 (0.0010) [2024-03-21 11:21:32,805][14687] Fps is (10 sec: 42598.4, 60 sec: 45875.3, 300 sec: 44986.6). Total num frames: 1891336192. Throughput: 0: 47460.2. Samples: 96410000. Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:32,805][14687] Avg episode reward: [(0, '0.735')] [2024-03-21 11:21:35,956][14919] Updated weights for policy 0, policy_version 57725 (0.0016) [2024-03-21 11:21:37,805][14687] Fps is (10 sec: 55704.8, 60 sec: 48059.7, 300 sec: 45986.3). 
Total num frames: 1891631104. Throughput: 0: 46995.6. Samples: 96676400. Policy #0 lag: (min: 0.0, avg: 28.6, max: 63.0) [2024-03-21 11:21:37,806][14687] Avg episode reward: [(0, '0.649')] [2024-03-21 11:21:42,504][14919] Updated weights for policy 0, policy_version 57735 (0.0013) [2024-03-21 11:21:42,805][14687] Fps is (10 sec: 52428.7, 60 sec: 48605.8, 300 sec: 45764.1). Total num frames: 1891860480. Throughput: 0: 47122.3. Samples: 96817000. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 11:21:42,805][14687] Avg episode reward: [(0, '0.916')] [2024-03-21 11:21:47,805][14687] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 1891991552. Throughput: 0: 46857.7. Samples: 97108800. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 11:21:47,806][14687] Avg episode reward: [(0, '1.669')] [2024-03-21 11:21:52,805][14687] Fps is (10 sec: 26214.1, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 1892122624. Throughput: 0: 47966.6. Samples: 97401600. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 11:21:52,806][14687] Avg episode reward: [(0, '1.669')] [2024-03-21 11:21:53,330][14919] Updated weights for policy 0, policy_version 57745 (0.0009) [2024-03-21 11:21:56,464][14898] Signal inference workers to stop experience collection... (1900 times) [2024-03-21 11:21:56,512][14919] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-03-21 11:21:56,592][14898] Signal inference workers to resume experience collection... (1900 times) [2024-03-21 11:21:56,593][14919] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-03-21 11:21:57,544][14919] Updated weights for policy 0, policy_version 57755 (0.0016) [2024-03-21 11:21:57,805][14687] Fps is (10 sec: 52428.4, 60 sec: 49698.0, 300 sec: 46430.6). Total num frames: 1892515840. Throughput: 0: 44308.5. Samples: 97528900. 
Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 11:21:57,806][14687] Avg episode reward: [(0, '1.460')] [2024-03-21 11:22:02,805][14687] Fps is (10 sec: 55705.6, 60 sec: 47513.6, 300 sec: 46319.5). Total num frames: 1892679680. Throughput: 0: 47029.0. Samples: 97797200. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 11:22:02,806][14687] Avg episode reward: [(0, '0.821')] [2024-03-21 11:22:04,572][14919] Updated weights for policy 0, policy_version 57765 (0.0020) [2024-03-21 11:22:07,805][14687] Fps is (10 sec: 42599.0, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1892941824. Throughput: 0: 46448.8. Samples: 98079700. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 11:22:07,806][14687] Avg episode reward: [(0, '1.428')] [2024-03-21 11:22:10,946][14919] Updated weights for policy 0, policy_version 57775 (0.0011) [2024-03-21 11:22:12,805][14687] Fps is (10 sec: 58982.5, 60 sec: 47513.4, 300 sec: 46319.5). Total num frames: 1893269504. Throughput: 0: 43246.5. Samples: 98224500. Policy #0 lag: (min: 0.0, avg: 38.0, max: 76.0) [2024-03-21 11:22:12,806][14687] Avg episode reward: [(0, '1.088')] [2024-03-21 11:22:17,805][14687] Fps is (10 sec: 49151.8, 60 sec: 45875.3, 300 sec: 45653.1). Total num frames: 1893433344. Throughput: 0: 46311.0. Samples: 98494000. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:17,806][14687] Avg episode reward: [(0, '0.660')] [2024-03-21 11:22:18,340][14919] Updated weights for policy 0, policy_version 57785 (0.0011) [2024-03-21 11:22:22,805][14687] Fps is (10 sec: 36045.0, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1893629952. Throughput: 0: 46800.0. Samples: 98782400. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:22,806][14687] Avg episode reward: [(0, '0.660')] [2024-03-21 11:22:26,162][14919] Updated weights for policy 0, policy_version 57795 (0.0013) [2024-03-21 11:22:27,805][14687] Fps is (10 sec: 39321.2, 60 sec: 45875.0, 300 sec: 44986.6). 
Total num frames: 1893826560. Throughput: 0: 46142.0. Samples: 98893400. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:27,806][14687] Avg episode reward: [(0, '1.169')] [2024-03-21 11:22:27,869][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057796_1893859328.pth... [2024-03-21 11:22:27,990][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057457_1882750976.pth [2024-03-21 11:22:32,805][14687] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1894055936. Throughput: 0: 45264.5. Samples: 99145700. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:32,806][14687] Avg episode reward: [(0, '1.289')] [2024-03-21 11:22:34,468][14919] Updated weights for policy 0, policy_version 57805 (0.0031) [2024-03-21 11:22:37,805][14687] Fps is (10 sec: 58982.7, 60 sec: 46421.3, 300 sec: 45541.9). Total num frames: 1894416384. Throughput: 0: 43820.0. Samples: 99373500. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:37,806][14687] Avg episode reward: [(0, '1.326')] [2024-03-21 11:22:38,616][14919] Updated weights for policy 0, policy_version 57815 (0.0016) [2024-03-21 11:22:42,805][14687] Fps is (10 sec: 52429.5, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1894580224. Throughput: 0: 47129.2. Samples: 99649700. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:42,805][14687] Avg episode reward: [(0, '1.379')] [2024-03-21 11:22:46,436][14898] Signal inference workers to stop experience collection... (1950 times) [2024-03-21 11:22:46,496][14898] Signal inference workers to resume experience collection... 
(1950 times) [2024-03-21 11:22:46,535][14919] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-03-21 11:22:46,536][14919] Updated weights for policy 0, policy_version 57825 (0.0012) [2024-03-21 11:22:46,571][14919] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-03-21 11:22:47,805][14687] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 46430.6). Total num frames: 1894907904. Throughput: 0: 44380.0. Samples: 99794300. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:47,806][14687] Avg episode reward: [(0, '1.189')] [2024-03-21 11:22:52,805][14687] Fps is (10 sec: 42597.6, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1895006208. Throughput: 0: 44475.5. Samples: 100081100. Policy #0 lag: (min: 0.0, avg: 34.2, max: 94.0) [2024-03-21 11:22:52,806][14687] Avg episode reward: [(0, '0.728')] [2024-03-21 11:22:57,062][14919] Updated weights for policy 0, policy_version 57835 (0.0016) [2024-03-21 11:22:57,805][14687] Fps is (10 sec: 22937.7, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1895137280. Throughput: 0: 44277.8. Samples: 100217000. Policy #0 lag: (min: 0.0, avg: 33.3, max: 85.0) [2024-03-21 11:22:57,806][14687] Avg episode reward: [(0, '0.720')] [2024-03-21 11:23:02,805][14687] Fps is (10 sec: 36045.1, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1895366656. Throughput: 0: 43609.0. Samples: 100456400. Policy #0 lag: (min: 0.0, avg: 33.3, max: 85.0) [2024-03-21 11:23:02,805][14687] Avg episode reward: [(0, '1.170')] [2024-03-21 11:23:07,805][14687] Fps is (10 sec: 29491.0, 60 sec: 41506.1, 300 sec: 44542.3). Total num frames: 1895432192. Throughput: 0: 43239.9. Samples: 100728200. 
Policy #0 lag: (min: 0.0, avg: 33.3, max: 85.0) [2024-03-21 11:23:07,806][14687] Avg episode reward: [(0, '1.560')] [2024-03-21 11:23:07,992][14919] Updated weights for policy 0, policy_version 57845 (0.0012) [2024-03-21 11:23:12,805][14687] Fps is (10 sec: 36044.8, 60 sec: 40960.0, 300 sec: 45208.7). Total num frames: 1895727104. Throughput: 0: 43673.5. Samples: 100858700. Policy #0 lag: (min: 0.0, avg: 33.3, max: 85.0) [2024-03-21 11:23:12,805][14687] Avg episode reward: [(0, '0.592')] [2024-03-21 11:23:14,887][14919] Updated weights for policy 0, policy_version 57855 (0.0013) [2024-03-21 11:23:17,805][14687] Fps is (10 sec: 49152.1, 60 sec: 41506.1, 300 sec: 45097.6). Total num frames: 1895923712. Throughput: 0: 44055.5. Samples: 101128200. Policy #0 lag: (min: 0.0, avg: 33.3, max: 85.0) [2024-03-21 11:23:17,806][14687] Avg episode reward: [(0, '0.862')] [2024-03-21 11:23:20,483][14919] Updated weights for policy 0, policy_version 57865 (0.0016) [2024-03-21 11:23:22,805][14687] Fps is (10 sec: 55705.3, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1896284160. Throughput: 0: 44831.2. Samples: 101390900. Policy #0 lag: (min: 0.0, avg: 33.3, max: 85.0) [2024-03-21 11:23:22,806][14687] Avg episode reward: [(0, '1.359')] [2024-03-21 11:23:24,516][14919] Updated weights for policy 0, policy_version 57875 (0.0015) [2024-03-21 11:23:27,805][14687] Fps is (10 sec: 81919.6, 60 sec: 48605.9, 300 sec: 46541.7). Total num frames: 1896742912. Throughput: 0: 41357.5. Samples: 101510800. Policy #0 lag: (min: 0.0, avg: 33.3, max: 85.0) [2024-03-21 11:23:27,806][14687] Avg episode reward: [(0, '1.396')] [2024-03-21 11:23:28,101][14919] Updated weights for policy 0, policy_version 57885 (0.0026) [2024-03-21 11:23:32,805][14687] Fps is (10 sec: 65536.2, 60 sec: 48059.7, 300 sec: 46319.5). Total num frames: 1896939520. Throughput: 0: 44177.8. Samples: 101782300. 
Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:23:32,806][14687] Avg episode reward: [(0, '0.665')] [2024-03-21 11:23:37,805][14687] Fps is (10 sec: 26214.7, 60 sec: 43144.6, 300 sec: 44764.4). Total num frames: 1897005056. Throughput: 0: 44217.8. Samples: 102070900. Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:23:37,806][14687] Avg episode reward: [(0, '0.665')] [2024-03-21 11:23:39,532][14919] Updated weights for policy 0, policy_version 57895 (0.0011) [2024-03-21 11:23:42,805][14687] Fps is (10 sec: 26214.5, 60 sec: 43690.6, 300 sec: 44986.6). Total num frames: 1897201664. Throughput: 0: 44102.3. Samples: 102201600. Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:23:42,806][14687] Avg episode reward: [(0, '1.059')] [2024-03-21 11:23:47,805][14687] Fps is (10 sec: 32768.1, 60 sec: 40413.9, 300 sec: 44764.4). Total num frames: 1897332736. Throughput: 0: 43875.5. Samples: 102430800. Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:23:47,806][14687] Avg episode reward: [(0, '1.299')] [2024-03-21 11:23:51,073][14898] Signal inference workers to stop experience collection... (2000 times) [2024-03-21 11:23:51,074][14898] Signal inference workers to resume experience collection... (2000 times) [2024-03-21 11:23:51,149][14919] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-03-21 11:23:51,149][14919] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-03-21 11:23:51,485][14919] Updated weights for policy 0, policy_version 57905 (0.0012) [2024-03-21 11:23:52,805][14687] Fps is (10 sec: 32767.7, 60 sec: 42052.2, 300 sec: 44653.3). Total num frames: 1897529344. Throughput: 0: 43146.7. Samples: 102669800. 
Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:23:52,806][14687] Avg episode reward: [(0, '1.387')] [2024-03-21 11:23:55,263][14919] Updated weights for policy 0, policy_version 57915 (0.0020) [2024-03-21 11:23:57,805][14687] Fps is (10 sec: 58982.3, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1897922560. Throughput: 0: 42988.9. Samples: 102793200. Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:23:57,806][14687] Avg episode reward: [(0, '1.268')] [2024-03-21 11:24:02,805][14687] Fps is (10 sec: 52428.8, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1898053632. Throughput: 0: 43280.0. Samples: 103075800. Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:24:02,806][14687] Avg episode reward: [(0, '0.661')] [2024-03-21 11:24:02,901][14919] Updated weights for policy 0, policy_version 57925 (0.0012) [2024-03-21 11:24:07,805][14687] Fps is (10 sec: 36044.7, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1898283008. Throughput: 0: 43853.3. Samples: 103364300. Policy #0 lag: (min: 0.0, avg: 39.9, max: 72.0) [2024-03-21 11:24:07,806][14687] Avg episode reward: [(0, '0.661')] [2024-03-21 11:24:09,434][14919] Updated weights for policy 0, policy_version 57935 (0.0011) [2024-03-21 11:24:12,805][14687] Fps is (10 sec: 42598.3, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 1898479616. Throughput: 0: 44464.5. Samples: 103511700. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 11:24:12,806][14687] Avg episode reward: [(0, '1.430')] [2024-03-21 11:24:17,805][14687] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1898643456. Throughput: 0: 44704.4. Samples: 103794000. 
Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 11:24:17,806][14687] Avg episode reward: [(0, '0.644')] [2024-03-21 11:24:18,959][14919] Updated weights for policy 0, policy_version 57945 (0.0016) [2024-03-21 11:24:22,805][14687] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 45319.8). Total num frames: 1898840064. Throughput: 0: 43517.7. Samples: 104029200. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 11:24:22,806][14687] Avg episode reward: [(0, '0.891')] [2024-03-21 11:24:27,805][14687] Fps is (10 sec: 36044.7, 60 sec: 37683.3, 300 sec: 44875.5). Total num frames: 1899003904. Throughput: 0: 46093.3. Samples: 104275800. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 11:24:27,806][14687] Avg episode reward: [(0, '1.257')] [2024-03-21 11:24:27,817][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057953_1899003904.pth... [2024-03-21 11:24:27,932][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057624_1888223232.pth [2024-03-21 11:24:31,567][14919] Updated weights for policy 0, policy_version 57955 (0.0028) [2024-03-21 11:24:32,805][14687] Fps is (10 sec: 29491.2, 60 sec: 36590.9, 300 sec: 44542.2). Total num frames: 1899134976. Throughput: 0: 44444.4. Samples: 104430800. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 11:24:32,806][14687] Avg episode reward: [(0, '1.229')] [2024-03-21 11:24:36,875][14919] Updated weights for policy 0, policy_version 57965 (0.0017) [2024-03-21 11:24:37,805][14687] Fps is (10 sec: 49151.6, 60 sec: 41506.1, 300 sec: 45430.9). Total num frames: 1899495424. Throughput: 0: 44660.0. Samples: 104679500. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 11:24:37,806][14687] Avg episode reward: [(0, '1.382')] [2024-03-21 11:24:39,947][14898] Signal inference workers to stop experience collection... (2050 times) [2024-03-21 11:24:40,012][14898] Signal inference workers to resume experience collection... 
(2050 times) [2024-03-21 11:24:40,014][14919] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-03-21 11:24:40,073][14919] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-03-21 11:24:40,348][14919] Updated weights for policy 0, policy_version 57975 (0.0017) [2024-03-21 11:24:42,805][14687] Fps is (10 sec: 81920.6, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 1899954176. Throughput: 0: 46906.7. Samples: 104904000. Policy #0 lag: (min: 0.0, avg: 31.4, max: 113.0) [2024-03-21 11:24:42,806][14687] Avg episode reward: [(0, '1.382')] [2024-03-21 11:24:45,804][14919] Updated weights for policy 0, policy_version 57985 (0.0013) [2024-03-21 11:24:47,805][14687] Fps is (10 sec: 72090.6, 60 sec: 48059.7, 300 sec: 45986.3). Total num frames: 1900216320. Throughput: 0: 44006.7. Samples: 105056100. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 11:24:47,806][14687] Avg episode reward: [(0, '1.420')] [2024-03-21 11:24:51,272][14919] Updated weights for policy 0, policy_version 57995 (0.0010) [2024-03-21 11:24:52,805][14687] Fps is (10 sec: 42598.2, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1900380160. Throughput: 0: 43651.1. Samples: 105328600. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 11:24:52,806][14687] Avg episode reward: [(0, '1.420')] [2024-03-21 11:24:57,805][14687] Fps is (10 sec: 26214.3, 60 sec: 42598.4, 300 sec: 44875.5). Total num frames: 1900478464. Throughput: 0: 43653.4. Samples: 105476100. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 11:24:57,806][14687] Avg episode reward: [(0, '0.888')] [2024-03-21 11:25:02,805][14687] Fps is (10 sec: 26214.6, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 1900642304. Throughput: 0: 43564.5. Samples: 105754400. 
Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 11:25:02,805][14687] Avg episode reward: [(0, '0.786')] [2024-03-21 11:25:07,805][14687] Fps is (10 sec: 19660.9, 60 sec: 39867.8, 300 sec: 44097.9). Total num frames: 1900675072. Throughput: 0: 44953.4. Samples: 106052100. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 11:25:07,806][14687] Avg episode reward: [(0, '1.326')] [2024-03-21 11:25:08,841][14919] Updated weights for policy 0, policy_version 58005 (0.0011) [2024-03-21 11:25:12,805][14687] Fps is (10 sec: 26214.4, 60 sec: 40413.9, 300 sec: 44542.3). Total num frames: 1900904448. Throughput: 0: 42493.4. Samples: 106188000. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 11:25:12,805][14687] Avg episode reward: [(0, '0.928')] [2024-03-21 11:25:13,858][14919] Updated weights for policy 0, policy_version 58015 (0.0012) [2024-03-21 11:25:17,581][14919] Updated weights for policy 0, policy_version 58025 (0.0016) [2024-03-21 11:25:17,805][14687] Fps is (10 sec: 68812.0, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1901363200. Throughput: 0: 44893.3. Samples: 106451000. Policy #0 lag: (min: 0.0, avg: 55.0, max: 110.0) [2024-03-21 11:25:17,806][14687] Avg episode reward: [(0, '1.231')] [2024-03-21 11:25:22,805][14687] Fps is (10 sec: 75366.8, 60 sec: 46967.6, 300 sec: 45653.1). Total num frames: 1901658112. Throughput: 0: 45229.1. Samples: 106714800. Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:22,805][14687] Avg episode reward: [(0, '1.215')] [2024-03-21 11:25:25,203][14919] Updated weights for policy 0, policy_version 58035 (0.0015) [2024-03-21 11:25:27,805][14687] Fps is (10 sec: 39321.9, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 1901756416. Throughput: 0: 43635.5. Samples: 106867600. 
Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:27,806][14687] Avg episode reward: [(0, '1.523')] [2024-03-21 11:25:32,805][14687] Fps is (10 sec: 26213.7, 60 sec: 46421.3, 300 sec: 44653.3). Total num frames: 1901920256. Throughput: 0: 46393.1. Samples: 107143800. Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:32,806][14687] Avg episode reward: [(0, '1.535')] [2024-03-21 11:25:33,694][14919] Updated weights for policy 0, policy_version 58045 (0.0010) [2024-03-21 11:25:36,012][14898] Signal inference workers to stop experience collection... (2100 times) [2024-03-21 11:25:36,108][14919] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-03-21 11:25:36,268][14898] Signal inference workers to resume experience collection... (2100 times) [2024-03-21 11:25:36,268][14919] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-03-21 11:25:37,805][14687] Fps is (10 sec: 45875.5, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 1902215168. Throughput: 0: 46460.1. Samples: 107419300. Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:37,805][14687] Avg episode reward: [(0, '0.507')] [2024-03-21 11:25:40,207][14919] Updated weights for policy 0, policy_version 58056 (0.0012) [2024-03-21 11:25:42,805][14687] Fps is (10 sec: 55706.8, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 1902477312. Throughput: 0: 45842.3. Samples: 107539000. Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:42,805][14687] Avg episode reward: [(0, '1.467')] [2024-03-21 11:25:45,296][14919] Updated weights for policy 0, policy_version 58066 (0.0012) [2024-03-21 11:25:47,805][14687] Fps is (10 sec: 65535.6, 60 sec: 44236.8, 300 sec: 45653.0). Total num frames: 1902870528. Throughput: 0: 45591.0. Samples: 107806000. 
Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:47,806][14687] Avg episode reward: [(0, '1.244')] [2024-03-21 11:25:52,647][14919] Updated weights for policy 0, policy_version 58076 (0.0020) [2024-03-21 11:25:52,805][14687] Fps is (10 sec: 55705.5, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 1903034368. Throughput: 0: 45455.6. Samples: 108097600. Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:52,806][14687] Avg episode reward: [(0, '1.140')] [2024-03-21 11:25:57,805][14687] Fps is (10 sec: 26214.5, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 1903132672. Throughput: 0: 45786.6. Samples: 108248400. Policy #0 lag: (min: 1.0, avg: 36.1, max: 74.0) [2024-03-21 11:25:57,806][14687] Avg episode reward: [(0, '1.140')] [2024-03-21 11:26:02,805][14687] Fps is (10 sec: 26214.3, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1903296512. Throughput: 0: 46329.0. Samples: 108535800. Policy #0 lag: (min: 1.0, avg: 49.8, max: 119.0) [2024-03-21 11:26:02,806][14687] Avg episode reward: [(0, '1.177')] [2024-03-21 11:26:03,544][14919] Updated weights for policy 0, policy_version 58086 (0.0011) [2024-03-21 11:26:07,805][14687] Fps is (10 sec: 52427.8, 60 sec: 49698.0, 300 sec: 44875.4). Total num frames: 1903656960. Throughput: 0: 46706.4. Samples: 108816600. Policy #0 lag: (min: 1.0, avg: 49.8, max: 119.0) [2024-03-21 11:26:07,806][14687] Avg episode reward: [(0, '1.311')] [2024-03-21 11:26:08,100][14919] Updated weights for policy 0, policy_version 58096 (0.0011) [2024-03-21 11:26:12,805][14687] Fps is (10 sec: 49151.8, 60 sec: 48059.7, 300 sec: 44431.2). Total num frames: 1903788032. Throughput: 0: 46420.0. Samples: 108956500. 
Policy #0 lag: (min: 1.0, avg: 49.8, max: 119.0) [2024-03-21 11:26:12,806][14687] Avg episode reward: [(0, '1.247')] [2024-03-21 11:26:15,048][14919] Updated weights for policy 0, policy_version 58106 (0.0016) [2024-03-21 11:26:17,805][14687] Fps is (10 sec: 45875.8, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1904115712. Throughput: 0: 46675.7. Samples: 109244200. Policy #0 lag: (min: 1.0, avg: 49.8, max: 119.0) [2024-03-21 11:26:17,806][14687] Avg episode reward: [(0, '0.796')] [2024-03-21 11:26:22,140][14919] Updated weights for policy 0, policy_version 58116 (0.0010) [2024-03-21 11:26:22,805][14687] Fps is (10 sec: 62259.9, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1904410624. Throughput: 0: 46875.6. Samples: 109528700. Policy #0 lag: (min: 1.0, avg: 49.8, max: 119.0) [2024-03-21 11:26:22,806][14687] Avg episode reward: [(0, '0.830')] [2024-03-21 11:26:27,805][14687] Fps is (10 sec: 49151.6, 60 sec: 47513.5, 300 sec: 44986.5). Total num frames: 1904607232. Throughput: 0: 47193.2. Samples: 109662700. Policy #0 lag: (min: 1.0, avg: 49.8, max: 119.0) [2024-03-21 11:26:27,806][14687] Avg episode reward: [(0, '1.032')] [2024-03-21 11:26:27,818][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058124_1904607232.pth... [2024-03-21 11:26:27,991][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057796_1893859328.pth [2024-03-21 11:26:28,856][14919] Updated weights for policy 0, policy_version 58126 (0.0011) [2024-03-21 11:26:32,805][14687] Fps is (10 sec: 32767.6, 60 sec: 46967.5, 300 sec: 44431.2). Total num frames: 1904738304. Throughput: 0: 47684.4. Samples: 109951800. Policy #0 lag: (min: 1.0, avg: 49.8, max: 119.0) [2024-03-21 11:26:32,806][14687] Avg episode reward: [(0, '1.466')] [2024-03-21 11:26:37,234][14898] Signal inference workers to stop experience collection... (2150 times) [2024-03-21 11:26:37,235][14898] Signal inference workers to resume experience collection... 
(2150 times) [2024-03-21 11:26:37,292][14919] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-03-21 11:26:37,293][14919] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-03-21 11:26:37,805][14687] Fps is (10 sec: 32768.3, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 1904934912. Throughput: 0: 47408.8. Samples: 110231000. Policy #0 lag: (min: 1.0, avg: 29.8, max: 57.0) [2024-03-21 11:26:37,806][14687] Avg episode reward: [(0, '0.960')] [2024-03-21 11:26:39,267][14919] Updated weights for policy 0, policy_version 58136 (0.0015) [2024-03-21 11:26:42,805][14687] Fps is (10 sec: 49152.5, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1905229824. Throughput: 0: 46922.2. Samples: 110359900. Policy #0 lag: (min: 1.0, avg: 29.8, max: 57.0) [2024-03-21 11:26:42,805][14687] Avg episode reward: [(0, '0.990')] [2024-03-21 11:26:43,695][14919] Updated weights for policy 0, policy_version 58146 (0.0014) [2024-03-21 11:26:47,385][14919] Updated weights for policy 0, policy_version 58156 (0.0017) [2024-03-21 11:26:47,805][14687] Fps is (10 sec: 72089.8, 60 sec: 46421.4, 300 sec: 45875.2). Total num frames: 1905655808. Throughput: 0: 46637.8. Samples: 110634500. Policy #0 lag: (min: 1.0, avg: 29.8, max: 57.0) [2024-03-21 11:26:47,806][14687] Avg episode reward: [(0, '0.742')] [2024-03-21 11:26:52,805][14687] Fps is (10 sec: 55705.0, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 1905786880. Throughput: 0: 46884.5. Samples: 110926400. Policy #0 lag: (min: 1.0, avg: 29.8, max: 57.0) [2024-03-21 11:26:52,806][14687] Avg episode reward: [(0, '0.742')] [2024-03-21 11:26:57,243][14919] Updated weights for policy 0, policy_version 58166 (0.0018) [2024-03-21 11:26:57,805][14687] Fps is (10 sec: 36044.7, 60 sec: 48059.7, 300 sec: 45208.7). Total num frames: 1906016256. Throughput: 0: 49411.1. Samples: 111180000. 
Policy #0 lag: (min: 1.0, avg: 29.8, max: 57.0) [2024-03-21 11:26:57,806][14687] Avg episode reward: [(0, '1.242')] [2024-03-21 11:27:02,805][14687] Fps is (10 sec: 36045.2, 60 sec: 47513.6, 300 sec: 44764.4). Total num frames: 1906147328. Throughput: 0: 45915.6. Samples: 111310400. Policy #0 lag: (min: 1.0, avg: 29.8, max: 57.0) [2024-03-21 11:27:02,805][14687] Avg episode reward: [(0, '0.723')] [2024-03-21 11:27:04,937][14919] Updated weights for policy 0, policy_version 58176 (0.0017) [2024-03-21 11:27:07,805][14687] Fps is (10 sec: 55706.7, 60 sec: 48606.1, 300 sec: 45097.7). Total num frames: 1906573312. Throughput: 0: 44786.8. Samples: 111544100. Policy #0 lag: (min: 1.0, avg: 29.8, max: 57.0) [2024-03-21 11:27:07,806][14687] Avg episode reward: [(0, '1.256')] [2024-03-21 11:27:11,169][14919] Updated weights for policy 0, policy_version 58186 (0.0023) [2024-03-21 11:27:12,805][14687] Fps is (10 sec: 58981.8, 60 sec: 49152.0, 300 sec: 45097.7). Total num frames: 1906737152. Throughput: 0: 45020.1. Samples: 111688600. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:12,806][14687] Avg episode reward: [(0, '0.847')] [2024-03-21 11:27:16,741][14919] Updated weights for policy 0, policy_version 58196 (0.0010) [2024-03-21 11:27:17,805][14687] Fps is (10 sec: 39321.2, 60 sec: 47513.7, 300 sec: 45208.7). Total num frames: 1906966528. Throughput: 0: 44697.9. Samples: 111963200. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:17,805][14687] Avg episode reward: [(0, '0.790')] [2024-03-21 11:27:22,805][14687] Fps is (10 sec: 49152.3, 60 sec: 46967.4, 300 sec: 45430.9). Total num frames: 1907228672. Throughput: 0: 44642.2. Samples: 112239900. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:22,806][14687] Avg episode reward: [(0, '0.755')] [2024-03-21 11:27:23,667][14919] Updated weights for policy 0, policy_version 58206 (0.0018) [2024-03-21 11:27:27,500][14898] Signal inference workers to stop experience collection... 
(2200 times) [2024-03-21 11:27:27,501][14898] Signal inference workers to resume experience collection... (2200 times) [2024-03-21 11:27:27,573][14919] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-03-21 11:27:27,573][14919] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-03-21 11:27:27,805][14687] Fps is (10 sec: 39321.2, 60 sec: 45875.3, 300 sec: 45097.6). Total num frames: 1907359744. Throughput: 0: 45002.1. Samples: 112385000. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:27,806][14687] Avg episode reward: [(0, '1.535')] [2024-03-21 11:27:30,948][14919] Updated weights for policy 0, policy_version 58216 (0.0023) [2024-03-21 11:27:32,805][14687] Fps is (10 sec: 42598.4, 60 sec: 48605.9, 300 sec: 44875.5). Total num frames: 1907654656. Throughput: 0: 45124.4. Samples: 112665100. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:32,806][14687] Avg episode reward: [(0, '1.305')] [2024-03-21 11:27:37,805][14687] Fps is (10 sec: 49152.0, 60 sec: 48605.9, 300 sec: 44986.5). Total num frames: 1907851264. Throughput: 0: 44500.0. Samples: 112928900. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:37,806][14687] Avg episode reward: [(0, '1.273')] [2024-03-21 11:27:42,337][14919] Updated weights for policy 0, policy_version 58226 (0.0015) [2024-03-21 11:27:42,805][14687] Fps is (10 sec: 29491.0, 60 sec: 45329.0, 300 sec: 44209.0). Total num frames: 1907949568. Throughput: 0: 42028.9. Samples: 113071300. Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:42,806][14687] Avg episode reward: [(0, '1.459')] [2024-03-21 11:27:47,805][14687] Fps is (10 sec: 19660.8, 60 sec: 39867.7, 300 sec: 44209.0). Total num frames: 1908047872. Throughput: 0: 45019.9. Samples: 113336300. 
Policy #0 lag: (min: 0.0, avg: 44.7, max: 84.0) [2024-03-21 11:27:47,806][14687] Avg episode reward: [(0, '1.417')] [2024-03-21 11:27:52,805][14687] Fps is (10 sec: 16383.8, 60 sec: 38775.4, 300 sec: 43986.8). Total num frames: 1908113408. Throughput: 0: 45835.2. Samples: 113606700. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:27:52,806][14687] Avg episode reward: [(0, '1.284')] [2024-03-21 11:27:54,811][14919] Updated weights for policy 0, policy_version 58236 (0.0016) [2024-03-21 11:27:57,805][14687] Fps is (10 sec: 42598.0, 60 sec: 40959.9, 300 sec: 44431.2). Total num frames: 1908473856. Throughput: 0: 47728.8. Samples: 113836400. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:27:57,806][14687] Avg episode reward: [(0, '0.856')] [2024-03-21 11:27:59,094][14919] Updated weights for policy 0, policy_version 58246 (0.0013) [2024-03-21 11:28:02,354][14919] Updated weights for policy 0, policy_version 58256 (0.0012) [2024-03-21 11:28:02,805][14687] Fps is (10 sec: 81922.3, 60 sec: 46421.4, 300 sec: 45764.1). Total num frames: 1908932608. Throughput: 0: 43946.7. Samples: 113940800. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:28:02,805][14687] Avg episode reward: [(0, '1.238')] [2024-03-21 11:28:06,570][14919] Updated weights for policy 0, policy_version 58266 (0.0019) [2024-03-21 11:28:07,805][14687] Fps is (10 sec: 85198.0, 60 sec: 45875.1, 300 sec: 46097.4). Total num frames: 1909325824. Throughput: 0: 42840.0. Samples: 114167700. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:28:07,806][14687] Avg episode reward: [(0, '0.960')] [2024-03-21 11:28:12,805][14687] Fps is (10 sec: 55705.3, 60 sec: 45875.3, 300 sec: 45986.3). Total num frames: 1909489664. Throughput: 0: 46046.8. Samples: 114457100. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:28:12,806][14687] Avg episode reward: [(0, '0.960')] [2024-03-21 11:28:13,903][14898] Signal inference workers to stop experience collection... 
(2250 times) [2024-03-21 11:28:13,936][14919] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-03-21 11:28:14,174][14898] Signal inference workers to resume experience collection... (2250 times) [2024-03-21 11:28:14,174][14919] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-03-21 11:28:14,498][14919] Updated weights for policy 0, policy_version 58276 (0.0010) [2024-03-21 11:28:17,805][14687] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1909686272. Throughput: 0: 42617.8. Samples: 114582900. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:28:17,806][14687] Avg episode reward: [(0, '1.653')] [2024-03-21 11:28:22,673][14919] Updated weights for policy 0, policy_version 58286 (0.0013) [2024-03-21 11:28:22,805][14687] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 1909915648. Throughput: 0: 42260.0. Samples: 114830600. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:28:22,806][14687] Avg episode reward: [(0, '1.373')] [2024-03-21 11:28:27,805][14687] Fps is (10 sec: 26214.4, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 1909948416. Throughput: 0: 42075.6. Samples: 114964700. Policy #0 lag: (min: 0.0, avg: 35.6, max: 78.0) [2024-03-21 11:28:27,806][14687] Avg episode reward: [(0, '1.373')] [2024-03-21 11:28:28,087][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058288_1909981184.pth... [2024-03-21 11:28:28,202][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000057953_1899003904.pth [2024-03-21 11:28:32,805][14687] Fps is (10 sec: 19660.9, 60 sec: 40960.0, 300 sec: 44431.2). Total num frames: 1910112256. Throughput: 0: 41895.6. Samples: 115221600. 
Policy #0 lag: (min: 1.0, avg: 47.4, max: 101.0) [2024-03-21 11:28:32,806][14687] Avg episode reward: [(0, '1.188')] [2024-03-21 11:28:37,805][14687] Fps is (10 sec: 26214.3, 60 sec: 39321.6, 300 sec: 44097.9). Total num frames: 1910210560. Throughput: 0: 41526.8. Samples: 115475400. Policy #0 lag: (min: 1.0, avg: 47.4, max: 101.0) [2024-03-21 11:28:37,806][14687] Avg episode reward: [(0, '0.770')] [2024-03-21 11:28:39,315][14919] Updated weights for policy 0, policy_version 58296 (0.0017) [2024-03-21 11:28:42,805][14687] Fps is (10 sec: 29491.4, 60 sec: 40960.1, 300 sec: 44320.1). Total num frames: 1910407168. Throughput: 0: 42478.0. Samples: 115747900. Policy #0 lag: (min: 1.0, avg: 47.4, max: 101.0) [2024-03-21 11:28:42,805][14687] Avg episode reward: [(0, '1.271')] [2024-03-21 11:28:46,416][14919] Updated weights for policy 0, policy_version 58306 (0.0012) [2024-03-21 11:28:47,805][14687] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 1910636544. Throughput: 0: 43040.0. Samples: 115877600. Policy #0 lag: (min: 1.0, avg: 47.4, max: 101.0) [2024-03-21 11:28:47,805][14687] Avg episode reward: [(0, '1.283')] [2024-03-21 11:28:51,514][14919] Updated weights for policy 0, policy_version 58316 (0.0013) [2024-03-21 11:28:52,805][14687] Fps is (10 sec: 52428.0, 60 sec: 46967.6, 300 sec: 44097.9). Total num frames: 1910931456. Throughput: 0: 43071.0. Samples: 116105900. Policy #0 lag: (min: 1.0, avg: 47.4, max: 101.0) [2024-03-21 11:28:52,806][14687] Avg episode reward: [(0, '1.332')] [2024-03-21 11:28:57,541][14919] Updated weights for policy 0, policy_version 58326 (0.0013) [2024-03-21 11:28:57,806][14687] Fps is (10 sec: 58974.8, 60 sec: 45874.3, 300 sec: 44653.2). Total num frames: 1911226368. Throughput: 0: 41738.8. Samples: 116335400. 
Policy #0 lag: (min: 1.0, avg: 47.4, max: 101.0) [2024-03-21 11:28:57,807][14687] Avg episode reward: [(0, '1.037')] [2024-03-21 11:29:02,805][14687] Fps is (10 sec: 49152.5, 60 sec: 41506.1, 300 sec: 44542.3). Total num frames: 1911422976. Throughput: 0: 41851.1. Samples: 116466200. Policy #0 lag: (min: 1.0, avg: 47.4, max: 101.0) [2024-03-21 11:29:02,805][14687] Avg episode reward: [(0, '0.777')] [2024-03-21 11:29:03,958][14919] Updated weights for policy 0, policy_version 58336 (0.0014) [2024-03-21 11:29:07,805][14687] Fps is (10 sec: 52435.1, 60 sec: 40413.8, 300 sec: 44986.6). Total num frames: 1911750656. Throughput: 0: 41944.4. Samples: 116718100. Policy #0 lag: (min: 3.0, avg: 50.1, max: 94.0) [2024-03-21 11:29:07,806][14687] Avg episode reward: [(0, '1.166')] [2024-03-21 11:29:12,805][14687] Fps is (10 sec: 42598.5, 60 sec: 39321.6, 300 sec: 44764.4). Total num frames: 1911848960. Throughput: 0: 41980.0. Samples: 116853800. Policy #0 lag: (min: 3.0, avg: 50.1, max: 94.0) [2024-03-21 11:29:12,806][14687] Avg episode reward: [(0, '1.571')] [2024-03-21 11:29:14,080][14919] Updated weights for policy 0, policy_version 58346 (0.0011) [2024-03-21 11:29:17,805][14687] Fps is (10 sec: 26214.4, 60 sec: 38775.4, 300 sec: 44653.3). Total num frames: 1912012800. Throughput: 0: 42628.9. Samples: 117139900. Policy #0 lag: (min: 3.0, avg: 50.1, max: 94.0) [2024-03-21 11:29:17,805][14687] Avg episode reward: [(0, '1.350')] [2024-03-21 11:29:20,628][14919] Updated weights for policy 0, policy_version 58356 (0.0012) [2024-03-21 11:29:22,805][14687] Fps is (10 sec: 36044.7, 60 sec: 38229.3, 300 sec: 44764.4). Total num frames: 1912209408. Throughput: 0: 43233.3. Samples: 117420900. Policy #0 lag: (min: 3.0, avg: 50.1, max: 94.0) [2024-03-21 11:29:22,806][14687] Avg episode reward: [(0, '1.265')] [2024-03-21 11:29:26,962][14898] Signal inference workers to stop experience collection... 
(2300 times) [2024-03-21 11:29:27,007][14919] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-03-21 11:29:27,271][14898] Signal inference workers to resume experience collection... (2300 times) [2024-03-21 11:29:27,271][14919] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-03-21 11:29:27,805][14687] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 44986.6). Total num frames: 1912406016. Throughput: 0: 43582.2. Samples: 117709100. Policy #0 lag: (min: 3.0, avg: 50.1, max: 94.0) [2024-03-21 11:29:27,805][14687] Avg episode reward: [(0, '1.265')] [2024-03-21 11:29:28,842][14919] Updated weights for policy 0, policy_version 58366 (0.0023) [2024-03-21 11:29:32,805][14687] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1912733696. Throughput: 0: 43424.5. Samples: 117831700. Policy #0 lag: (min: 3.0, avg: 50.1, max: 94.0) [2024-03-21 11:29:32,805][14687] Avg episode reward: [(0, '1.265')] [2024-03-21 11:29:35,726][14919] Updated weights for policy 0, policy_version 58376 (0.0012) [2024-03-21 11:29:37,805][14687] Fps is (10 sec: 65535.7, 60 sec: 47513.6, 300 sec: 44431.2). Total num frames: 1913061376. Throughput: 0: 44422.3. Samples: 118104900. Policy #0 lag: (min: 3.0, avg: 50.1, max: 94.0) [2024-03-21 11:29:37,806][14687] Avg episode reward: [(0, '1.217')] [2024-03-21 11:29:42,805][14687] Fps is (10 sec: 42598.0, 60 sec: 45875.1, 300 sec: 43875.8). Total num frames: 1913159680. Throughput: 0: 42167.8. Samples: 118232900. Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:29:42,806][14687] Avg episode reward: [(0, '1.340')] [2024-03-21 11:29:43,116][14919] Updated weights for policy 0, policy_version 58386 (0.0017) [2024-03-21 11:29:47,805][14687] Fps is (10 sec: 26214.7, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 1913323520. Throughput: 0: 45586.8. Samples: 118517600. 
Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:29:47,805][14687] Avg episode reward: [(0, '1.473')] [2024-03-21 11:29:51,128][14919] Updated weights for policy 0, policy_version 58396 (0.0012) [2024-03-21 11:29:52,806][14687] Fps is (10 sec: 42595.5, 60 sec: 44236.3, 300 sec: 44431.1). Total num frames: 1913585664. Throughput: 0: 45701.5. Samples: 118774700. Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:29:52,807][14687] Avg episode reward: [(0, '1.278')] [2024-03-21 11:29:56,839][14919] Updated weights for policy 0, policy_version 58406 (0.0022) [2024-03-21 11:29:57,805][14687] Fps is (10 sec: 55704.4, 60 sec: 44237.6, 300 sec: 44875.5). Total num frames: 1913880576. Throughput: 0: 48682.1. Samples: 119044500. Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:29:57,806][14687] Avg episode reward: [(0, '1.278')] [2024-03-21 11:30:02,805][14687] Fps is (10 sec: 52432.9, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 1914109952. Throughput: 0: 45264.5. Samples: 119176800. Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:30:02,805][14687] Avg episode reward: [(0, '1.236')] [2024-03-21 11:30:05,212][14919] Updated weights for policy 0, policy_version 58416 (0.0019) [2024-03-21 11:30:07,805][14687] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 1914306560. Throughput: 0: 45117.8. Samples: 119451200. Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:30:07,806][14687] Avg episode reward: [(0, '1.043')] [2024-03-21 11:30:11,723][14919] Updated weights for policy 0, policy_version 58426 (0.0017) [2024-03-21 11:30:12,805][14687] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1914568704. Throughput: 0: 42111.1. Samples: 119604100. 
Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:30:12,806][14687] Avg episode reward: [(0, '0.919')] [2024-03-21 11:30:17,805][14687] Fps is (10 sec: 49151.6, 60 sec: 46421.3, 300 sec: 44542.2). Total num frames: 1914798080. Throughput: 0: 45373.2. Samples: 119873500. Policy #0 lag: (min: 0.0, avg: 33.5, max: 80.0) [2024-03-21 11:30:17,806][14687] Avg episode reward: [(0, '1.415')] [2024-03-21 11:30:18,710][14919] Updated weights for policy 0, policy_version 58436 (0.0012) [2024-03-21 11:30:19,047][14898] Signal inference workers to stop experience collection... (2350 times) [2024-03-21 11:30:19,130][14919] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-03-21 11:30:19,280][14898] Signal inference workers to resume experience collection... (2350 times) [2024-03-21 11:30:19,281][14919] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-03-21 11:30:22,805][14687] Fps is (10 sec: 49152.0, 60 sec: 47513.6, 300 sec: 45097.7). Total num frames: 1915060224. Throughput: 0: 45548.9. Samples: 120154600. Policy #0 lag: (min: 0.0, avg: 36.6, max: 71.0) [2024-03-21 11:30:22,806][14687] Avg episode reward: [(0, '1.415')] [2024-03-21 11:30:24,820][14919] Updated weights for policy 0, policy_version 58446 (0.0013) [2024-03-21 11:30:27,805][14687] Fps is (10 sec: 42598.9, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 1915224064. Throughput: 0: 49228.9. Samples: 120448200. Policy #0 lag: (min: 0.0, avg: 36.6, max: 71.0) [2024-03-21 11:30:27,806][14687] Avg episode reward: [(0, '1.415')] [2024-03-21 11:30:27,972][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058449_1915256832.pth... 
[2024-03-21 11:30:28,095][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058124_1904607232.pth [2024-03-21 11:30:30,443][14919] Updated weights for policy 0, policy_version 58456 (0.0027) [2024-03-21 11:30:32,805][14687] Fps is (10 sec: 52429.2, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 1915584512. Throughput: 0: 45784.4. Samples: 120577900. Policy #0 lag: (min: 0.0, avg: 36.6, max: 71.0) [2024-03-21 11:30:32,805][14687] Avg episode reward: [(0, '1.580')] [2024-03-21 11:30:37,805][14687] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1915682816. Throughput: 0: 46578.6. Samples: 120870700. Policy #0 lag: (min: 0.0, avg: 36.6, max: 71.0) [2024-03-21 11:30:37,806][14687] Avg episode reward: [(0, '1.178')] [2024-03-21 11:30:42,805][14687] Fps is (10 sec: 19660.6, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1915781120. Throughput: 0: 43706.7. Samples: 121011300. Policy #0 lag: (min: 0.0, avg: 36.6, max: 71.0) [2024-03-21 11:30:42,806][14687] Avg episode reward: [(0, '0.559')] [2024-03-21 11:30:43,239][14919] Updated weights for policy 0, policy_version 58466 (0.0016) [2024-03-21 11:30:47,805][14687] Fps is (10 sec: 32767.7, 60 sec: 44782.8, 300 sec: 43986.9). Total num frames: 1916010496. Throughput: 0: 46913.2. Samples: 121287900. Policy #0 lag: (min: 0.0, avg: 36.6, max: 71.0) [2024-03-21 11:30:47,806][14687] Avg episode reward: [(0, '1.277')] [2024-03-21 11:30:49,506][14919] Updated weights for policy 0, policy_version 58476 (0.0019) [2024-03-21 11:30:52,805][14687] Fps is (10 sec: 62259.2, 60 sec: 46968.0, 300 sec: 44986.6). Total num frames: 1916403712. Throughput: 0: 46680.0. Samples: 121551800. 
Policy #0 lag: (min: 0.0, avg: 36.6, max: 71.0) [2024-03-21 11:30:52,806][14687] Avg episode reward: [(0, '1.234')] [2024-03-21 11:30:53,410][14919] Updated weights for policy 0, policy_version 58486 (0.0011) [2024-03-21 11:30:57,805][14687] Fps is (10 sec: 68813.5, 60 sec: 46967.6, 300 sec: 45430.9). Total num frames: 1916698624. Throughput: 0: 46400.1. Samples: 121692100. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:30:57,805][14687] Avg episode reward: [(0, '1.234')] [2024-03-21 11:30:58,590][14919] Updated weights for policy 0, policy_version 58496 (0.0014) [2024-03-21 11:31:02,805][14687] Fps is (10 sec: 55705.8, 60 sec: 47513.5, 300 sec: 45097.7). Total num frames: 1916960768. Throughput: 0: 46735.6. Samples: 121976600. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:31:02,806][14687] Avg episode reward: [(0, '0.784')] [2024-03-21 11:31:05,922][14919] Updated weights for policy 0, policy_version 58506 (0.0015) [2024-03-21 11:31:07,805][14687] Fps is (10 sec: 45875.0, 60 sec: 47513.6, 300 sec: 45319.8). Total num frames: 1917157376. Throughput: 0: 46266.7. Samples: 122236600. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:31:07,805][14687] Avg episode reward: [(0, '1.169')] [2024-03-21 11:31:12,805][14687] Fps is (10 sec: 32768.2, 60 sec: 45329.1, 300 sec: 44653.4). Total num frames: 1917288448. Throughput: 0: 46422.3. Samples: 122537200. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:31:12,806][14687] Avg episode reward: [(0, '1.087')] [2024-03-21 11:31:14,984][14898] Signal inference workers to stop experience collection... (2400 times) [2024-03-21 11:31:15,022][14919] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-03-21 11:31:15,271][14898] Signal inference workers to resume experience collection... 
(2400 times) [2024-03-21 11:31:15,271][14919] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-03-21 11:31:15,569][14919] Updated weights for policy 0, policy_version 58516 (0.0012) [2024-03-21 11:31:17,805][14687] Fps is (10 sec: 45875.2, 60 sec: 46967.6, 300 sec: 44764.4). Total num frames: 1917616128. Throughput: 0: 46404.4. Samples: 122666100. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:31:17,806][14687] Avg episode reward: [(0, '1.154')] [2024-03-21 11:31:21,828][14919] Updated weights for policy 0, policy_version 58526 (0.0014) [2024-03-21 11:31:22,805][14687] Fps is (10 sec: 52428.3, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1917812736. Throughput: 0: 46473.2. Samples: 122962000. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:31:22,806][14687] Avg episode reward: [(0, '1.154')] [2024-03-21 11:31:27,805][14687] Fps is (10 sec: 39321.4, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 1918009344. Throughput: 0: 46780.0. Samples: 123116400. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:31:27,806][14687] Avg episode reward: [(0, '1.177')] [2024-03-21 11:31:29,218][14919] Updated weights for policy 0, policy_version 58536 (0.0016) [2024-03-21 11:31:32,805][14687] Fps is (10 sec: 49152.6, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1918304256. Throughput: 0: 46520.1. Samples: 123381300. Policy #0 lag: (min: 0.0, avg: 36.4, max: 69.0) [2024-03-21 11:31:32,806][14687] Avg episode reward: [(0, '1.177')] [2024-03-21 11:31:35,701][14919] Updated weights for policy 0, policy_version 58546 (0.0014) [2024-03-21 11:31:37,805][14687] Fps is (10 sec: 42598.4, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 1918435328. Throughput: 0: 46337.8. Samples: 123637000. 
Policy #0 lag: (min: 0.0, avg: 34.9, max: 93.0) [2024-03-21 11:31:37,806][14687] Avg episode reward: [(0, '1.756')] [2024-03-21 11:31:41,916][14919] Updated weights for policy 0, policy_version 58556 (0.0022) [2024-03-21 11:31:42,805][14687] Fps is (10 sec: 49151.0, 60 sec: 50244.2, 300 sec: 44542.2). Total num frames: 1918795776. Throughput: 0: 45413.1. Samples: 123735700. Policy #0 lag: (min: 0.0, avg: 34.9, max: 93.0) [2024-03-21 11:31:42,806][14687] Avg episode reward: [(0, '1.342')] [2024-03-21 11:31:47,805][14687] Fps is (10 sec: 42598.6, 60 sec: 47513.6, 300 sec: 44320.1). Total num frames: 1918861312. Throughput: 0: 45080.0. Samples: 124005200. Policy #0 lag: (min: 0.0, avg: 34.9, max: 93.0) [2024-03-21 11:31:47,806][14687] Avg episode reward: [(0, '1.502')] [2024-03-21 11:31:52,741][14919] Updated weights for policy 0, policy_version 58566 (0.0012) [2024-03-21 11:31:52,805][14687] Fps is (10 sec: 29491.6, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 1919090688. Throughput: 0: 45793.3. Samples: 124297300. Policy #0 lag: (min: 0.0, avg: 34.9, max: 93.0) [2024-03-21 11:31:52,805][14687] Avg episode reward: [(0, '1.502')] [2024-03-21 11:31:57,667][14919] Updated weights for policy 0, policy_version 58576 (0.0020) [2024-03-21 11:31:57,805][14687] Fps is (10 sec: 55705.5, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 1919418368. Throughput: 0: 44891.1. Samples: 124557300. Policy #0 lag: (min: 0.0, avg: 34.9, max: 93.0) [2024-03-21 11:31:57,806][14687] Avg episode reward: [(0, '1.371')] [2024-03-21 11:32:02,051][14898] Signal inference workers to stop experience collection... (2450 times) [2024-03-21 11:32:02,133][14919] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-03-21 11:32:02,261][14898] Signal inference workers to resume experience collection... 
(2450 times) [2024-03-21 11:32:02,261][14919] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-03-21 11:32:02,805][14687] Fps is (10 sec: 62259.8, 60 sec: 45875.3, 300 sec: 44542.3). Total num frames: 1919713280. Throughput: 0: 45184.5. Samples: 124699400. Policy #0 lag: (min: 0.0, avg: 34.9, max: 93.0) [2024-03-21 11:32:02,805][14687] Avg episode reward: [(0, '0.482')] [2024-03-21 11:32:02,895][14919] Updated weights for policy 0, policy_version 58586 (0.0024) [2024-03-21 11:32:07,805][14687] Fps is (10 sec: 52429.0, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1919942656. Throughput: 0: 43962.3. Samples: 124940300. Policy #0 lag: (min: 0.0, avg: 34.9, max: 93.0) [2024-03-21 11:32:07,805][14687] Avg episode reward: [(0, '1.281')] [2024-03-21 11:32:11,619][14919] Updated weights for policy 0, policy_version 58596 (0.0011) [2024-03-21 11:32:12,805][14687] Fps is (10 sec: 36044.4, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 1920073728. Throughput: 0: 43215.5. Samples: 125061100. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:12,806][14687] Avg episode reward: [(0, '1.409')] [2024-03-21 11:32:17,805][14687] Fps is (10 sec: 22937.4, 60 sec: 42598.4, 300 sec: 43875.8). Total num frames: 1920172032. Throughput: 0: 43633.2. Samples: 125344800. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:17,806][14687] Avg episode reward: [(0, '0.886')] [2024-03-21 11:32:20,123][14919] Updated weights for policy 0, policy_version 58606 (0.0023) [2024-03-21 11:32:22,805][14687] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1920466944. Throughput: 0: 43942.3. Samples: 125614400. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:22,805][14687] Avg episode reward: [(0, '0.836')] [2024-03-21 11:32:26,098][14919] Updated weights for policy 0, policy_version 58616 (0.0023) [2024-03-21 11:32:27,805][14687] Fps is (10 sec: 58982.3, 60 sec: 45875.2, 300 sec: 44431.2). 
Total num frames: 1920761856. Throughput: 0: 44760.1. Samples: 125749900. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:27,806][14687] Avg episode reward: [(0, '1.821')] [2024-03-21 11:32:27,887][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058618_1920794624.pth... [2024-03-21 11:32:28,018][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058288_1909981184.pth [2024-03-21 11:32:32,805][14687] Fps is (10 sec: 52428.3, 60 sec: 44782.8, 300 sec: 44542.3). Total num frames: 1920991232. Throughput: 0: 44324.4. Samples: 125999800. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:32,806][14687] Avg episode reward: [(0, '1.821')] [2024-03-21 11:32:37,260][14919] Updated weights for policy 0, policy_version 58626 (0.0011) [2024-03-21 11:32:37,805][14687] Fps is (10 sec: 32768.2, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1921089536. Throughput: 0: 44220.0. Samples: 126287200. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:37,806][14687] Avg episode reward: [(0, '1.052')] [2024-03-21 11:32:42,805][14687] Fps is (10 sec: 32768.1, 60 sec: 42052.4, 300 sec: 44986.6). Total num frames: 1921318912. Throughput: 0: 44097.8. Samples: 126541700. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:42,806][14687] Avg episode reward: [(0, '1.213')] [2024-03-21 11:32:44,289][14919] Updated weights for policy 0, policy_version 58636 (0.0011) [2024-03-21 11:32:47,805][14687] Fps is (10 sec: 55705.8, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 1921646592. Throughput: 0: 43864.4. Samples: 126673300. Policy #0 lag: (min: 0.0, avg: 37.7, max: 78.0) [2024-03-21 11:32:47,806][14687] Avg episode reward: [(0, '0.916')] [2024-03-21 11:32:48,250][14919] Updated weights for policy 0, policy_version 58646 (0.0011) [2024-03-21 11:32:52,805][14687] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1921810432. 
Throughput: 0: 44437.7. Samples: 126940000. Policy #0 lag: (min: 4.0, avg: 29.7, max: 71.0) [2024-03-21 11:32:52,806][14687] Avg episode reward: [(0, '0.754')] [2024-03-21 11:32:56,119][14919] Updated weights for policy 0, policy_version 58656 (0.0013) [2024-03-21 11:32:57,805][14687] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1922072576. Throughput: 0: 44500.1. Samples: 127063600. Policy #0 lag: (min: 4.0, avg: 29.7, max: 71.0) [2024-03-21 11:32:57,805][14687] Avg episode reward: [(0, '1.317')] [2024-03-21 11:32:58,437][14898] Signal inference workers to stop experience collection... (2500 times) [2024-03-21 11:32:58,510][14919] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-03-21 11:32:58,515][14898] Signal inference workers to resume experience collection... (2500 times) [2024-03-21 11:32:58,554][14919] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-03-21 11:33:02,805][14687] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 1922334720. Throughput: 0: 43140.0. Samples: 127286100. Policy #0 lag: (min: 4.0, avg: 29.7, max: 71.0) [2024-03-21 11:33:02,806][14687] Avg episode reward: [(0, '0.925')] [2024-03-21 11:33:06,794][14919] Updated weights for policy 0, policy_version 58666 (0.0012) [2024-03-21 11:33:07,805][14687] Fps is (10 sec: 32767.8, 60 sec: 40960.0, 300 sec: 43764.7). Total num frames: 1922400256. Throughput: 0: 42835.5. Samples: 127542000. Policy #0 lag: (min: 4.0, avg: 29.7, max: 71.0) [2024-03-21 11:33:07,806][14687] Avg episode reward: [(0, '1.203')] [2024-03-21 11:33:12,805][14687] Fps is (10 sec: 26214.6, 60 sec: 42052.4, 300 sec: 43764.7). Total num frames: 1922596864. Throughput: 0: 45226.8. Samples: 127785100. 
Policy #0 lag: (min: 4.0, avg: 29.7, max: 71.0) [2024-03-21 11:33:12,805][14687] Avg episode reward: [(0, '1.160')] [2024-03-21 11:33:14,384][14919] Updated weights for policy 0, policy_version 58676 (0.0012) [2024-03-21 11:33:17,805][14687] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1922891776. Throughput: 0: 42486.7. Samples: 127911700. Policy #0 lag: (min: 4.0, avg: 29.7, max: 71.0) [2024-03-21 11:33:17,806][14687] Avg episode reward: [(0, '0.952')] [2024-03-21 11:33:19,628][14919] Updated weights for policy 0, policy_version 58686 (0.0011) [2024-03-21 11:33:22,805][14687] Fps is (10 sec: 55705.5, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1923153920. Throughput: 0: 41949.0. Samples: 128174900. Policy #0 lag: (min: 4.0, avg: 29.7, max: 71.0) [2024-03-21 11:33:22,805][14687] Avg episode reward: [(0, '0.771')] [2024-03-21 11:33:26,200][14919] Updated weights for policy 0, policy_version 58696 (0.0011) [2024-03-21 11:33:27,805][14687] Fps is (10 sec: 55705.6, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 1923448832. Throughput: 0: 39395.6. Samples: 128314500. Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:33:27,805][14687] Avg episode reward: [(0, '1.401')] [2024-03-21 11:33:32,805][14687] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1923612672. Throughput: 0: 42004.4. Samples: 128563500. Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:33:32,806][14687] Avg episode reward: [(0, '0.908')] [2024-03-21 11:33:34,496][14919] Updated weights for policy 0, policy_version 58706 (0.0017) [2024-03-21 11:33:37,805][14687] Fps is (10 sec: 32767.8, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1923776512. Throughput: 0: 42153.3. Samples: 128836900. 
Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:33:37,806][14687] Avg episode reward: [(0, '1.652')] [2024-03-21 11:33:40,974][14919] Updated weights for policy 0, policy_version 58716 (0.0014) [2024-03-21 11:33:42,805][14687] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1924071424. Throughput: 0: 45222.2. Samples: 129098600. Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:33:42,806][14687] Avg episode reward: [(0, '1.304')] [2024-03-21 11:33:47,805][14687] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 44875.5). Total num frames: 1924169728. Throughput: 0: 43602.3. Samples: 129248200. Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:33:47,805][14687] Avg episode reward: [(0, '1.304')] [2024-03-21 11:33:51,894][14919] Updated weights for policy 0, policy_version 58726 (0.0010) [2024-03-21 11:33:52,805][14687] Fps is (10 sec: 26214.6, 60 sec: 42052.3, 300 sec: 44431.4). Total num frames: 1924333568. Throughput: 0: 44144.5. Samples: 129528500. Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:33:52,805][14687] Avg episode reward: [(0, '1.304')] [2024-03-21 11:33:57,805][14687] Fps is (10 sec: 32767.8, 60 sec: 40413.8, 300 sec: 44320.1). Total num frames: 1924497408. Throughput: 0: 41646.6. Samples: 129659200. Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:33:57,806][14687] Avg episode reward: [(0, '1.331')] [2024-03-21 11:34:02,805][14687] Fps is (10 sec: 26214.2, 60 sec: 37683.2, 300 sec: 43542.6). Total num frames: 1924595712. Throughput: 0: 44046.6. Samples: 129893800. Policy #0 lag: (min: 1.0, avg: 40.8, max: 78.0) [2024-03-21 11:34:02,806][14687] Avg episode reward: [(0, '1.029')] [2024-03-21 11:34:03,545][14919] Updated weights for policy 0, policy_version 58736 (0.0012) [2024-03-21 11:34:06,960][14898] Signal inference workers to stop experience collection... 
(2550 times) [2024-03-21 11:34:07,025][14919] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-03-21 11:34:07,087][14898] Signal inference workers to resume experience collection... (2550 times) [2024-03-21 11:34:07,087][14919] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-03-21 11:34:07,805][14687] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 44209.0). Total num frames: 1924890624. Throughput: 0: 43857.7. Samples: 130148500. Policy #0 lag: (min: 0.0, avg: 27.4, max: 110.0) [2024-03-21 11:34:07,806][14687] Avg episode reward: [(0, '1.055')] [2024-03-21 11:34:10,753][14919] Updated weights for policy 0, policy_version 58746 (0.0010) [2024-03-21 11:34:12,805][14687] Fps is (10 sec: 39321.5, 60 sec: 39867.6, 300 sec: 43986.9). Total num frames: 1924988928. Throughput: 0: 43708.8. Samples: 130281400. Policy #0 lag: (min: 0.0, avg: 27.4, max: 110.0) [2024-03-21 11:34:12,806][14687] Avg episode reward: [(0, '1.142')] [2024-03-21 11:34:16,101][14919] Updated weights for policy 0, policy_version 58756 (0.0016) [2024-03-21 11:34:17,805][14687] Fps is (10 sec: 58982.3, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 1925480448. Throughput: 0: 44168.9. Samples: 130551100. Policy #0 lag: (min: 0.0, avg: 27.4, max: 110.0) [2024-03-21 11:34:17,806][14687] Avg episode reward: [(0, '0.650')] [2024-03-21 11:34:19,090][14919] Updated weights for policy 0, policy_version 58766 (0.0012) [2024-03-21 11:34:22,805][14687] Fps is (10 sec: 91750.0, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 1925906432. Throughput: 0: 43293.3. Samples: 130785100. Policy #0 lag: (min: 0.0, avg: 27.4, max: 110.0) [2024-03-21 11:34:22,806][14687] Avg episode reward: [(0, '1.389')] [2024-03-21 11:34:23,386][14919] Updated weights for policy 0, policy_version 58776 (0.0011) [2024-03-21 11:34:27,805][14687] Fps is (10 sec: 58982.7, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 1926070272. Throughput: 0: 40888.9. Samples: 130938600. 
Policy #0 lag: (min: 0.0, avg: 27.4, max: 110.0) [2024-03-21 11:34:27,806][14687] Avg episode reward: [(0, '0.854')] [2024-03-21 11:34:27,819][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058779_1926070272.pth... [2024-03-21 11:34:27,964][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058449_1915256832.pth [2024-03-21 11:34:32,805][14687] Fps is (10 sec: 32768.4, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1926234112. Throughput: 0: 43713.3. Samples: 131215300. Policy #0 lag: (min: 0.0, avg: 27.4, max: 110.0) [2024-03-21 11:34:32,806][14687] Avg episode reward: [(0, '1.535')] [2024-03-21 11:34:34,294][14919] Updated weights for policy 0, policy_version 58786 (0.0018) [2024-03-21 11:34:37,805][14687] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1926529024. Throughput: 0: 42706.6. Samples: 131450300. Policy #0 lag: (min: 0.0, avg: 27.4, max: 110.0) [2024-03-21 11:34:37,806][14687] Avg episode reward: [(0, '1.773')] [2024-03-21 11:34:40,126][14919] Updated weights for policy 0, policy_version 58796 (0.0011) [2024-03-21 11:34:42,805][14687] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 45097.6). Total num frames: 1926627328. Throughput: 0: 42920.1. Samples: 131590600. Policy #0 lag: (min: 0.0, avg: 45.2, max: 114.0) [2024-03-21 11:34:42,806][14687] Avg episode reward: [(0, '1.483')] [2024-03-21 11:34:47,805][14687] Fps is (10 sec: 9830.4, 60 sec: 40960.0, 300 sec: 44209.1). Total num frames: 1926627328. Throughput: 0: 44677.8. Samples: 131904300. Policy #0 lag: (min: 0.0, avg: 45.2, max: 114.0) [2024-03-21 11:34:47,806][14687] Avg episode reward: [(0, '1.554')] [2024-03-21 11:34:52,425][14919] Updated weights for policy 0, policy_version 58806 (0.0015) [2024-03-21 11:34:52,805][14687] Fps is (10 sec: 32767.8, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1926955008. Throughput: 0: 45188.9. Samples: 132182000. 
Policy #0 lag: (min: 0.0, avg: 45.2, max: 114.0) [2024-03-21 11:34:52,806][14687] Avg episode reward: [(0, '1.392')] [2024-03-21 11:34:57,805][14687] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1927086080. Throughput: 0: 48744.5. Samples: 132474900. Policy #0 lag: (min: 0.0, avg: 45.2, max: 114.0) [2024-03-21 11:34:57,806][14687] Avg episode reward: [(0, '1.392')] [2024-03-21 11:34:58,007][14898] Signal inference workers to stop experience collection... (2600 times) [2024-03-21 11:34:58,008][14898] Signal inference workers to resume experience collection... (2600 times) [2024-03-21 11:34:58,071][14919] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-03-21 11:34:58,071][14919] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-03-21 11:35:00,419][14919] Updated weights for policy 0, policy_version 58816 (0.0011) [2024-03-21 11:35:02,805][14687] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 1927315456. Throughput: 0: 45802.3. Samples: 132612200. Policy #0 lag: (min: 0.0, avg: 45.2, max: 114.0) [2024-03-21 11:35:02,806][14687] Avg episode reward: [(0, '1.413')] [2024-03-21 11:35:07,805][14687] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1927544832. Throughput: 0: 46764.4. Samples: 132889500. Policy #0 lag: (min: 0.0, avg: 45.2, max: 114.0) [2024-03-21 11:35:07,806][14687] Avg episode reward: [(0, '1.595')] [2024-03-21 11:35:08,519][14919] Updated weights for policy 0, policy_version 58826 (0.0010) [2024-03-21 11:35:11,577][14919] Updated weights for policy 0, policy_version 58836 (0.0017) [2024-03-21 11:35:12,805][14687] Fps is (10 sec: 72089.1, 60 sec: 50790.4, 300 sec: 44875.5). Total num frames: 1928036352. Throughput: 0: 46006.6. Samples: 133008900. 
Policy #0 lag: (min: 0.0, avg: 45.2, max: 114.0) [2024-03-21 11:35:12,806][14687] Avg episode reward: [(0, '1.154')] [2024-03-21 11:35:17,805][14687] Fps is (10 sec: 68812.9, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 1928232960. Throughput: 0: 45513.2. Samples: 133263400. Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:17,806][14687] Avg episode reward: [(0, '1.568')] [2024-03-21 11:35:18,698][14919] Updated weights for policy 0, policy_version 58846 (0.0024) [2024-03-21 11:35:22,805][14687] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1928560640. Throughput: 0: 46671.0. Samples: 133550500. Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:22,806][14687] Avg episode reward: [(0, '1.291')] [2024-03-21 11:35:23,093][14919] Updated weights for policy 0, policy_version 58856 (0.0019) [2024-03-21 11:35:27,805][14687] Fps is (10 sec: 49151.9, 60 sec: 44236.7, 300 sec: 44542.2). Total num frames: 1928724480. Throughput: 0: 49744.3. Samples: 133829100. Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:27,806][14687] Avg episode reward: [(0, '1.334')] [2024-03-21 11:35:32,805][14687] Fps is (10 sec: 32768.2, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1928888320. Throughput: 0: 45971.1. Samples: 133973000. Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:32,806][14687] Avg episode reward: [(0, '0.564')] [2024-03-21 11:35:34,366][14919] Updated weights for policy 0, policy_version 58866 (0.0009) [2024-03-21 11:35:37,805][14687] Fps is (10 sec: 36045.3, 60 sec: 42598.4, 300 sec: 45097.7). Total num frames: 1929084928. Throughput: 0: 45811.2. Samples: 134243500. 
Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:37,805][14687] Avg episode reward: [(0, '1.013')] [2024-03-21 11:35:40,260][14919] Updated weights for policy 0, policy_version 58876 (0.0011) [2024-03-21 11:35:42,805][14687] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 1929347072. Throughput: 0: 42546.6. Samples: 134389500. Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:42,806][14687] Avg episode reward: [(0, '1.780')] [2024-03-21 11:35:44,275][14898] Signal inference workers to stop experience collection... (2650 times) [2024-03-21 11:35:44,276][14898] Signal inference workers to resume experience collection... (2650 times) [2024-03-21 11:35:44,344][14919] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-03-21 11:35:44,344][14919] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-03-21 11:35:47,011][14919] Updated weights for policy 0, policy_version 58886 (0.0011) [2024-03-21 11:35:47,805][14687] Fps is (10 sec: 55704.7, 60 sec: 50244.2, 300 sec: 44875.5). Total num frames: 1929641984. Throughput: 0: 45428.7. Samples: 134656500. Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:47,806][14687] Avg episode reward: [(0, '1.499')] [2024-03-21 11:35:52,805][14687] Fps is (10 sec: 36045.0, 60 sec: 45875.2, 300 sec: 44097.9). Total num frames: 1929707520. Throughput: 0: 46115.6. Samples: 134964700. Policy #0 lag: (min: 1.0, avg: 46.0, max: 98.0) [2024-03-21 11:35:52,806][14687] Avg episode reward: [(0, '1.499')] [2024-03-21 11:35:55,164][14919] Updated weights for policy 0, policy_version 58896 (0.0014) [2024-03-21 11:35:57,805][14687] Fps is (10 sec: 36045.0, 60 sec: 48605.8, 300 sec: 44209.0). Total num frames: 1930002432. Throughput: 0: 49246.6. Samples: 135225000. 
Policy #0 lag: (min: 0.0, avg: 27.1, max: 66.0) [2024-03-21 11:35:57,806][14687] Avg episode reward: [(0, '0.980')] [2024-03-21 11:36:02,805][14687] Fps is (10 sec: 42598.4, 60 sec: 46967.4, 300 sec: 43986.9). Total num frames: 1930133504. Throughput: 0: 46715.6. Samples: 135365600. Policy #0 lag: (min: 0.0, avg: 27.1, max: 66.0) [2024-03-21 11:36:02,806][14687] Avg episode reward: [(0, '1.001')] [2024-03-21 11:36:04,637][14919] Updated weights for policy 0, policy_version 58906 (0.0014) [2024-03-21 11:36:07,805][14687] Fps is (10 sec: 36044.9, 60 sec: 46967.5, 300 sec: 44320.1). Total num frames: 1930362880. Throughput: 0: 46577.8. Samples: 135646500. Policy #0 lag: (min: 0.0, avg: 27.1, max: 66.0) [2024-03-21 11:36:07,806][14687] Avg episode reward: [(0, '1.528')] [2024-03-21 11:36:12,280][14919] Updated weights for policy 0, policy_version 58916 (0.0014) [2024-03-21 11:36:12,805][14687] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 43986.9). Total num frames: 1930592256. Throughput: 0: 43186.8. Samples: 135772500. Policy #0 lag: (min: 0.0, avg: 27.1, max: 66.0) [2024-03-21 11:36:12,805][14687] Avg episode reward: [(0, '1.297')] [2024-03-21 11:36:15,716][14919] Updated weights for policy 0, policy_version 58926 (0.0012) [2024-03-21 11:36:17,805][14687] Fps is (10 sec: 52429.0, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1930887168. Throughput: 0: 45966.7. Samples: 136041500. Policy #0 lag: (min: 0.0, avg: 27.1, max: 66.0) [2024-03-21 11:36:17,806][14687] Avg episode reward: [(0, '0.614')] [2024-03-21 11:36:20,851][14919] Updated weights for policy 0, policy_version 58936 (0.0028) [2024-03-21 11:36:22,805][14687] Fps is (10 sec: 75366.3, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 1931345920. Throughput: 0: 44602.2. Samples: 136250600. 
Policy #0 lag: (min: 0.0, avg: 27.1, max: 66.0) [2024-03-21 11:36:22,806][14687] Avg episode reward: [(0, '0.957')] [2024-03-21 11:36:26,594][14919] Updated weights for policy 0, policy_version 58946 (0.0023) [2024-03-21 11:36:27,805][14687] Fps is (10 sec: 65535.5, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1931542528. Throughput: 0: 47380.0. Samples: 136521600. Policy #0 lag: (min: 0.0, avg: 27.1, max: 66.0) [2024-03-21 11:36:27,806][14687] Avg episode reward: [(0, '1.729')] [2024-03-21 11:36:27,820][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058946_1931542528.pth... [2024-03-21 11:36:27,929][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058618_1920794624.pth [2024-03-21 11:36:32,805][14687] Fps is (10 sec: 36044.5, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 1931706368. Throughput: 0: 44644.5. Samples: 136665500. Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:36:32,806][14687] Avg episode reward: [(0, '1.185')] [2024-03-21 11:36:37,495][14898] Signal inference workers to stop experience collection... (2700 times) [2024-03-21 11:36:37,601][14919] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-03-21 11:36:37,686][14898] Signal inference workers to resume experience collection... (2700 times) [2024-03-21 11:36:37,687][14919] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-03-21 11:36:37,689][14919] Updated weights for policy 0, policy_version 58956 (0.0017) [2024-03-21 11:36:37,805][14687] Fps is (10 sec: 32767.9, 60 sec: 46421.2, 300 sec: 44320.1). Total num frames: 1931870208. Throughput: 0: 44284.4. Samples: 136957500. 
Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:36:37,806][14687] Avg episode reward: [(0, '1.084')] [2024-03-21 11:36:42,526][14919] Updated weights for policy 0, policy_version 58966 (0.0011) [2024-03-21 11:36:42,805][14687] Fps is (10 sec: 49152.2, 60 sec: 47513.7, 300 sec: 45208.7). Total num frames: 1932197888. Throughput: 0: 41262.3. Samples: 137081800. Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:36:42,806][14687] Avg episode reward: [(0, '0.612')] [2024-03-21 11:36:47,805][14687] Fps is (10 sec: 49152.4, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1932361728. Throughput: 0: 43868.9. Samples: 137339700. Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:36:47,806][14687] Avg episode reward: [(0, '1.507')] [2024-03-21 11:36:50,998][14919] Updated weights for policy 0, policy_version 58976 (0.0012) [2024-03-21 11:36:52,805][14687] Fps is (10 sec: 32767.6, 60 sec: 46967.4, 300 sec: 44431.2). Total num frames: 1932525568. Throughput: 0: 44031.1. Samples: 137627900. Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:36:52,806][14687] Avg episode reward: [(0, '1.311')] [2024-03-21 11:36:57,805][14687] Fps is (10 sec: 32767.9, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1932689408. Throughput: 0: 44295.4. Samples: 137765800. Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:36:57,806][14687] Avg episode reward: [(0, '1.435')] [2024-03-21 11:37:01,436][14919] Updated weights for policy 0, policy_version 58986 (0.0015) [2024-03-21 11:37:02,805][14687] Fps is (10 sec: 36045.4, 60 sec: 45875.3, 300 sec: 43875.8). Total num frames: 1932886016. Throughput: 0: 43826.8. Samples: 138013700. Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:37:02,805][14687] Avg episode reward: [(0, '0.671')] [2024-03-21 11:37:07,805][14687] Fps is (10 sec: 45875.2, 60 sec: 46421.3, 300 sec: 44320.1). Total num frames: 1933148160. Throughput: 0: 45033.2. Samples: 138277100. 
Policy #0 lag: (min: 0.0, avg: 44.3, max: 84.0) [2024-03-21 11:37:07,806][14687] Avg episode reward: [(0, '0.795')] [2024-03-21 11:37:08,773][14919] Updated weights for policy 0, policy_version 58996 (0.0011) [2024-03-21 11:37:12,805][14687] Fps is (10 sec: 42597.8, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 1933312000. Throughput: 0: 45042.2. Samples: 138548500. Policy #0 lag: (min: 0.0, avg: 29.3, max: 69.0) [2024-03-21 11:37:12,806][14687] Avg episode reward: [(0, '1.346')] [2024-03-21 11:37:17,805][14687] Fps is (10 sec: 32768.3, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 1933475840. Throughput: 0: 44893.4. Samples: 138685700. Policy #0 lag: (min: 0.0, avg: 29.3, max: 69.0) [2024-03-21 11:37:17,805][14687] Avg episode reward: [(0, '0.770')] [2024-03-21 11:37:17,908][14919] Updated weights for policy 0, policy_version 59006 (0.0017) [2024-03-21 11:37:22,805][14687] Fps is (10 sec: 49152.5, 60 sec: 40960.0, 300 sec: 44209.0). Total num frames: 1933803520. Throughput: 0: 44086.8. Samples: 138941400. Policy #0 lag: (min: 0.0, avg: 29.3, max: 69.0) [2024-03-21 11:37:22,806][14687] Avg episode reward: [(0, '1.282')] [2024-03-21 11:37:22,854][14919] Updated weights for policy 0, policy_version 59016 (0.0013) [2024-03-21 11:37:27,805][14687] Fps is (10 sec: 62258.2, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 1934098432. Throughput: 0: 44166.5. Samples: 139069300. Policy #0 lag: (min: 0.0, avg: 29.3, max: 69.0) [2024-03-21 11:37:27,806][14687] Avg episode reward: [(0, '0.606')] [2024-03-21 11:37:28,432][14919] Updated weights for policy 0, policy_version 59026 (0.0018) [2024-03-21 11:37:32,805][14687] Fps is (10 sec: 55705.7, 60 sec: 44236.9, 300 sec: 44986.6). Total num frames: 1934360576. Throughput: 0: 43817.9. Samples: 139311500. Policy #0 lag: (min: 0.0, avg: 29.3, max: 69.0) [2024-03-21 11:37:32,805][14687] Avg episode reward: [(0, '1.436')] [2024-03-21 11:37:34,523][14898] Signal inference workers to stop experience collection... 
(2750 times) [2024-03-21 11:37:34,585][14898] Signal inference workers to resume experience collection... (2750 times) [2024-03-21 11:37:34,725][14919] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-03-21 11:37:34,815][14919] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-03-21 11:37:37,524][14919] Updated weights for policy 0, policy_version 59036 (0.0011) [2024-03-21 11:37:37,805][14687] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1934491648. Throughput: 0: 43180.0. Samples: 139571000. Policy #0 lag: (min: 0.0, avg: 29.3, max: 69.0) [2024-03-21 11:37:37,806][14687] Avg episode reward: [(0, '1.077')] [2024-03-21 11:37:42,500][14919] Updated weights for policy 0, policy_version 59046 (0.0016) [2024-03-21 11:37:42,805][14687] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1934819328. Throughput: 0: 42786.8. Samples: 139691200. Policy #0 lag: (min: 0.0, avg: 29.3, max: 69.0) [2024-03-21 11:37:42,805][14687] Avg episode reward: [(0, '1.382')] [2024-03-21 11:37:47,805][14687] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 44431.2). Total num frames: 1934917632. Throughput: 0: 42751.1. Samples: 139937500. Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:37:47,805][14687] Avg episode reward: [(0, '1.648')] [2024-03-21 11:37:52,389][14919] Updated weights for policy 0, policy_version 59056 (0.0014) [2024-03-21 11:37:52,805][14687] Fps is (10 sec: 36044.5, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1935179776. Throughput: 0: 42626.7. Samples: 140195300. Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:37:52,806][14687] Avg episode reward: [(0, '0.931')] [2024-03-21 11:37:57,805][14687] Fps is (10 sec: 42597.8, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 1935343616. Throughput: 0: 39233.3. Samples: 140314000. 
Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:37:57,806][14687] Avg episode reward: [(0, '0.768')] [2024-03-21 11:37:59,876][14919] Updated weights for policy 0, policy_version 59066 (0.0018) [2024-03-21 11:38:02,805][14687] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1935605760. Throughput: 0: 40866.7. Samples: 140524700. Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:38:02,805][14687] Avg episode reward: [(0, '1.072')] [2024-03-21 11:38:07,805][14687] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 44542.3). Total num frames: 1935736832. Throughput: 0: 41375.5. Samples: 140803300. Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:38:07,806][14687] Avg episode reward: [(0, '1.131')] [2024-03-21 11:38:09,657][14919] Updated weights for policy 0, policy_version 59076 (0.0011) [2024-03-21 11:38:12,805][14687] Fps is (10 sec: 36044.5, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1935966208. Throughput: 0: 41493.4. Samples: 140936500. Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:38:12,806][14687] Avg episode reward: [(0, '1.205')] [2024-03-21 11:38:15,500][14919] Updated weights for policy 0, policy_version 59086 (0.0011) [2024-03-21 11:38:17,805][14687] Fps is (10 sec: 49151.5, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 1936228352. Throughput: 0: 41411.0. Samples: 141175000. Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:38:17,806][14687] Avg episode reward: [(0, '1.307')] [2024-03-21 11:38:22,805][14687] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1936424960. Throughput: 0: 41584.6. Samples: 141442300. 
Policy #0 lag: (min: 0.0, avg: 31.1, max: 68.0) [2024-03-21 11:38:22,805][14687] Avg episode reward: [(0, '1.723')] [2024-03-21 11:38:22,946][14919] Updated weights for policy 0, policy_version 59096 (0.0020) [2024-03-21 11:38:27,805][14687] Fps is (10 sec: 32768.3, 60 sec: 40960.1, 300 sec: 43875.8). Total num frames: 1936556032. Throughput: 0: 42111.0. Samples: 141586200. Policy #0 lag: (min: 0.0, avg: 35.7, max: 70.0) [2024-03-21 11:38:27,806][14687] Avg episode reward: [(0, '1.723')] [2024-03-21 11:38:27,819][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059099_1936556032.pth... [2024-03-21 11:38:27,926][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058779_1926070272.pth [2024-03-21 11:38:32,805][14687] Fps is (10 sec: 29491.0, 60 sec: 39321.5, 300 sec: 43875.8). Total num frames: 1936719872. Throughput: 0: 42877.7. Samples: 141867000. Policy #0 lag: (min: 0.0, avg: 35.7, max: 70.0) [2024-03-21 11:38:32,806][14687] Avg episode reward: [(0, '1.027')] [2024-03-21 11:38:33,436][14919] Updated weights for policy 0, policy_version 59106 (0.0012) [2024-03-21 11:38:37,805][14687] Fps is (10 sec: 49152.0, 60 sec: 42598.5, 300 sec: 43986.9). Total num frames: 1937047552. Throughput: 0: 42471.1. Samples: 142106500. Policy #0 lag: (min: 0.0, avg: 35.7, max: 70.0) [2024-03-21 11:38:37,806][14687] Avg episode reward: [(0, '1.306')] [2024-03-21 11:38:38,826][14919] Updated weights for policy 0, policy_version 59116 (0.0013) [2024-03-21 11:38:40,067][14898] Signal inference workers to stop experience collection... (2800 times) [2024-03-21 11:38:40,067][14898] Signal inference workers to resume experience collection... 
(2800 times) [2024-03-21 11:38:40,113][14919] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-03-21 11:38:40,119][14919] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-03-21 11:38:42,805][14687] Fps is (10 sec: 52428.9, 60 sec: 40413.8, 300 sec: 44320.1). Total num frames: 1937244160. Throughput: 0: 42733.4. Samples: 142237000. Policy #0 lag: (min: 0.0, avg: 35.7, max: 70.0) [2024-03-21 11:38:42,806][14687] Avg episode reward: [(0, '0.795')] [2024-03-21 11:38:47,805][14687] Fps is (10 sec: 22937.5, 60 sec: 39321.5, 300 sec: 43875.8). Total num frames: 1937276928. Throughput: 0: 44008.8. Samples: 142505100. Policy #0 lag: (min: 0.0, avg: 35.7, max: 70.0) [2024-03-21 11:38:47,806][14687] Avg episode reward: [(0, '1.370')] [2024-03-21 11:38:50,841][14919] Updated weights for policy 0, policy_version 59126 (0.0015) [2024-03-21 11:38:52,805][14687] Fps is (10 sec: 32767.9, 60 sec: 39867.7, 300 sec: 44320.1). Total num frames: 1937571840. Throughput: 0: 43486.6. Samples: 142760200. Policy #0 lag: (min: 0.0, avg: 35.7, max: 70.0) [2024-03-21 11:38:52,806][14687] Avg episode reward: [(0, '1.534')] [2024-03-21 11:38:57,603][14919] Updated weights for policy 0, policy_version 59136 (0.0033) [2024-03-21 11:38:57,805][14687] Fps is (10 sec: 49152.3, 60 sec: 40413.9, 300 sec: 44653.3). Total num frames: 1937768448. Throughput: 0: 43555.6. Samples: 142896500. Policy #0 lag: (min: 0.0, avg: 35.7, max: 70.0) [2024-03-21 11:38:57,806][14687] Avg episode reward: [(0, '0.828')] [2024-03-21 11:39:02,706][14919] Updated weights for policy 0, policy_version 59146 (0.0013) [2024-03-21 11:39:02,805][14687] Fps is (10 sec: 52428.6, 60 sec: 41506.0, 300 sec: 44764.4). Total num frames: 1938096128. Throughput: 0: 43693.4. Samples: 143141200. 
Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:02,806][14687] Avg episode reward: [(0, '1.066')] [2024-03-21 11:39:07,805][14687] Fps is (10 sec: 58982.5, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1938358272. Throughput: 0: 43326.6. Samples: 143392000. Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:07,806][14687] Avg episode reward: [(0, '0.956')] [2024-03-21 11:39:08,743][14919] Updated weights for policy 0, policy_version 59156 (0.0012) [2024-03-21 11:39:12,805][14687] Fps is (10 sec: 49152.9, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1938587648. Throughput: 0: 46046.8. Samples: 143658300. Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:12,805][14687] Avg episode reward: [(0, '0.866')] [2024-03-21 11:39:16,054][14919] Updated weights for policy 0, policy_version 59166 (0.0017) [2024-03-21 11:39:17,805][14687] Fps is (10 sec: 49151.8, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1938849792. Throughput: 0: 42806.7. Samples: 143793300. Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:17,806][14687] Avg episode reward: [(0, '1.532')] [2024-03-21 11:39:22,805][14687] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 1939013632. Throughput: 0: 43622.2. Samples: 144069500. Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:22,806][14687] Avg episode reward: [(0, '0.971')] [2024-03-21 11:39:23,832][14919] Updated weights for policy 0, policy_version 59176 (0.0011) [2024-03-21 11:39:27,805][14687] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1939210240. Throughput: 0: 43984.4. Samples: 144216300. 
Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:27,806][14687] Avg episode reward: [(0, '0.971')] [2024-03-21 11:39:31,476][14919] Updated weights for policy 0, policy_version 59186 (0.0012) [2024-03-21 11:39:32,805][14687] Fps is (10 sec: 42598.1, 60 sec: 45329.0, 300 sec: 43764.7). Total num frames: 1939439616. Throughput: 0: 43922.2. Samples: 144481600. Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:32,806][14687] Avg episode reward: [(0, '1.370')] [2024-03-21 11:39:37,805][14687] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 1939636224. Throughput: 0: 44475.5. Samples: 144761600. Policy #0 lag: (min: 0.0, avg: 48.6, max: 110.0) [2024-03-21 11:39:37,806][14687] Avg episode reward: [(0, '1.079')] [2024-03-21 11:39:39,793][14898] Signal inference workers to stop experience collection... (2850 times) [2024-03-21 11:39:39,794][14898] Signal inference workers to resume experience collection... (2850 times) [2024-03-21 11:39:39,873][14919] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-03-21 11:39:39,873][14919] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-03-21 11:39:40,288][14919] Updated weights for policy 0, policy_version 59196 (0.0012) [2024-03-21 11:39:42,805][14687] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 1939832832. Throughput: 0: 44755.5. Samples: 144910500. Policy #0 lag: (min: 0.0, avg: 41.5, max: 114.0) [2024-03-21 11:39:42,806][14687] Avg episode reward: [(0, '1.079')] [2024-03-21 11:39:47,805][14687] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 1939996672. Throughput: 0: 45431.2. Samples: 145185600. 
Policy #0 lag: (min: 0.0, avg: 41.5, max: 114.0) [2024-03-21 11:39:47,805][14687] Avg episode reward: [(0, '1.117')] [2024-03-21 11:39:48,959][14919] Updated weights for policy 0, policy_version 59206 (0.0011) [2024-03-21 11:39:52,805][14687] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1940193280. Throughput: 0: 45835.6. Samples: 145454600. Policy #0 lag: (min: 0.0, avg: 41.5, max: 114.0) [2024-03-21 11:39:52,806][14687] Avg episode reward: [(0, '1.443')] [2024-03-21 11:39:56,742][14919] Updated weights for policy 0, policy_version 59216 (0.0011) [2024-03-21 11:39:57,805][14687] Fps is (10 sec: 49152.8, 60 sec: 45329.2, 300 sec: 44653.4). Total num frames: 1940488192. Throughput: 0: 45555.6. Samples: 145708300. Policy #0 lag: (min: 0.0, avg: 41.5, max: 114.0) [2024-03-21 11:39:57,806][14687] Avg episode reward: [(0, '1.056')] [2024-03-21 11:40:02,805][14687] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 44320.1). Total num frames: 1940619264. Throughput: 0: 45791.1. Samples: 145853900. Policy #0 lag: (min: 0.0, avg: 41.5, max: 114.0) [2024-03-21 11:40:02,806][14687] Avg episode reward: [(0, '1.050')] [2024-03-21 11:40:04,046][14919] Updated weights for policy 0, policy_version 59226 (0.0016) [2024-03-21 11:40:07,703][14919] Updated weights for policy 0, policy_version 59236 (0.0015) [2024-03-21 11:40:07,805][14687] Fps is (10 sec: 55704.1, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 1941045248. Throughput: 0: 45048.8. Samples: 146096700. Policy #0 lag: (min: 0.0, avg: 41.5, max: 114.0) [2024-03-21 11:40:07,806][14687] Avg episode reward: [(0, '1.044')] [2024-03-21 11:40:12,805][14687] Fps is (10 sec: 68812.7, 60 sec: 45328.9, 300 sec: 44320.1). Total num frames: 1941307392. Throughput: 0: 44897.8. Samples: 146236700. 
Policy #0 lag: (min: 0.0, avg: 41.5, max: 114.0) [2024-03-21 11:40:12,806][14687] Avg episode reward: [(0, '0.827')] [2024-03-21 11:40:13,337][14919] Updated weights for policy 0, policy_version 59246 (0.0020) [2024-03-21 11:40:17,805][14687] Fps is (10 sec: 49151.9, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1941536768. Throughput: 0: 45073.3. Samples: 146509900. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:17,806][14687] Avg episode reward: [(0, '1.469')] [2024-03-21 11:40:20,951][14919] Updated weights for policy 0, policy_version 59256 (0.0021) [2024-03-21 11:40:22,805][14687] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 1941733376. Throughput: 0: 45042.3. Samples: 146788500. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:22,806][14687] Avg episode reward: [(0, '1.212')] [2024-03-21 11:40:27,805][14687] Fps is (10 sec: 39321.6, 60 sec: 45329.0, 300 sec: 44209.0). Total num frames: 1941929984. Throughput: 0: 44786.6. Samples: 146925900. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:27,806][14687] Avg episode reward: [(0, '0.748')] [2024-03-21 11:40:28,094][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059264_1941962752.pth... [2024-03-21 11:40:28,150][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000058946_1931542528.pth [2024-03-21 11:40:30,670][14919] Updated weights for policy 0, policy_version 59266 (0.0015) [2024-03-21 11:40:32,805][14687] Fps is (10 sec: 42598.4, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 1942159360. Throughput: 0: 44993.4. Samples: 147210300. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:32,806][14687] Avg episode reward: [(0, '1.360')] [2024-03-21 11:40:35,363][14898] Signal inference workers to stop experience collection... 
(2900 times) [2024-03-21 11:40:35,408][14919] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-03-21 11:40:35,598][14898] Signal inference workers to resume experience collection... (2900 times) [2024-03-21 11:40:35,599][14919] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-03-21 11:40:36,211][14919] Updated weights for policy 0, policy_version 59276 (0.0011) [2024-03-21 11:40:37,805][14687] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 1942355968. Throughput: 0: 45233.3. Samples: 147490100. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:37,806][14687] Avg episode reward: [(0, '1.360')] [2024-03-21 11:40:42,805][14687] Fps is (10 sec: 39321.2, 60 sec: 45329.1, 300 sec: 43764.7). Total num frames: 1942552576. Throughput: 0: 45706.4. Samples: 147765100. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:42,806][14687] Avg episode reward: [(0, '0.953')] [2024-03-21 11:40:46,394][14919] Updated weights for policy 0, policy_version 59286 (0.0021) [2024-03-21 11:40:47,805][14687] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 44098.0). Total num frames: 1942716416. Throughput: 0: 45708.9. Samples: 147910800. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:47,806][14687] Avg episode reward: [(0, '1.030')] [2024-03-21 11:40:52,610][14919] Updated weights for policy 0, policy_version 59296 (0.0012) [2024-03-21 11:40:52,805][14687] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 44098.0). Total num frames: 1943011328. Throughput: 0: 46011.2. Samples: 148167200. Policy #0 lag: (min: 0.0, avg: 46.7, max: 85.0) [2024-03-21 11:40:52,806][14687] Avg episode reward: [(0, '1.243')] [2024-03-21 11:40:57,805][14687] Fps is (10 sec: 39321.2, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1943109632. Throughput: 0: 45748.8. Samples: 148295400. 
Policy #0 lag: (min: 0.0, avg: 29.3, max: 72.0) [2024-03-21 11:40:57,806][14687] Avg episode reward: [(0, '1.080')] [2024-03-21 11:41:01,842][14919] Updated weights for policy 0, policy_version 59306 (0.0013) [2024-03-21 11:41:02,805][14687] Fps is (10 sec: 36044.9, 60 sec: 45875.2, 300 sec: 44098.0). Total num frames: 1943371776. Throughput: 0: 45411.2. Samples: 148553400. Policy #0 lag: (min: 0.0, avg: 29.3, max: 72.0) [2024-03-21 11:41:02,806][14687] Avg episode reward: [(0, '0.889')] [2024-03-21 11:41:07,805][14687] Fps is (10 sec: 36044.9, 60 sec: 40413.9, 300 sec: 43653.6). Total num frames: 1943470080. Throughput: 0: 44977.7. Samples: 148812500. Policy #0 lag: (min: 0.0, avg: 29.3, max: 72.0) [2024-03-21 11:41:07,806][14687] Avg episode reward: [(0, '1.111')] [2024-03-21 11:41:10,672][14919] Updated weights for policy 0, policy_version 59316 (0.0017) [2024-03-21 11:41:12,805][14687] Fps is (10 sec: 32768.1, 60 sec: 39867.8, 300 sec: 43431.5). Total num frames: 1943699456. Throughput: 0: 44364.6. Samples: 148922300. Policy #0 lag: (min: 0.0, avg: 29.3, max: 72.0) [2024-03-21 11:41:12,806][14687] Avg episode reward: [(0, '1.002')] [2024-03-21 11:41:17,805][14687] Fps is (10 sec: 42598.8, 60 sec: 39321.7, 300 sec: 42542.9). Total num frames: 1943896064. Throughput: 0: 43733.3. Samples: 149178300. Policy #0 lag: (min: 0.0, avg: 29.3, max: 72.0) [2024-03-21 11:41:17,806][14687] Avg episode reward: [(0, '1.286')] [2024-03-21 11:41:18,991][14919] Updated weights for policy 0, policy_version 59326 (0.0011) [2024-03-21 11:41:22,805][14687] Fps is (10 sec: 49151.9, 60 sec: 40960.0, 300 sec: 42876.1). Total num frames: 1944190976. Throughput: 0: 42851.1. Samples: 149418400. 
Policy #0 lag: (min: 0.0, avg: 29.3, max: 72.0) [2024-03-21 11:41:22,805][14687] Avg episode reward: [(0, '1.055')] [2024-03-21 11:41:24,696][14919] Updated weights for policy 0, policy_version 59336 (0.0016) [2024-03-21 11:41:27,805][14687] Fps is (10 sec: 72089.4, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 1944616960. Throughput: 0: 39326.7. Samples: 149534800. Policy #0 lag: (min: 0.0, avg: 29.3, max: 72.0) [2024-03-21 11:41:27,806][14687] Avg episode reward: [(0, '1.378')] [2024-03-21 11:41:27,970][14919] Updated weights for policy 0, policy_version 59346 (0.0012) [2024-03-21 11:41:32,502][14898] Signal inference workers to stop experience collection... (2950 times) [2024-03-21 11:41:32,502][14898] Signal inference workers to resume experience collection... (2950 times) [2024-03-21 11:41:32,572][14919] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-03-21 11:41:32,579][14919] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-03-21 11:41:32,823][14687] Fps is (10 sec: 65419.6, 60 sec: 44769.6, 300 sec: 43984.2). Total num frames: 1944846336. Throughput: 0: 41534.7. Samples: 149780600. Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:41:32,823][14687] Avg episode reward: [(0, '1.312')] [2024-03-21 11:41:37,035][14919] Updated weights for policy 0, policy_version 59356 (0.0014) [2024-03-21 11:41:37,805][14687] Fps is (10 sec: 36044.5, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 1944977408. Throughput: 0: 42042.2. Samples: 150059100. Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:41:37,806][14687] Avg episode reward: [(0, '1.327')] [2024-03-21 11:41:42,805][14687] Fps is (10 sec: 29544.3, 60 sec: 43144.7, 300 sec: 43320.4). Total num frames: 1945141248. Throughput: 0: 45346.9. Samples: 150336000. 
Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:41:42,805][14687] Avg episode reward: [(0, '0.434')] [2024-03-21 11:41:45,483][14919] Updated weights for policy 0, policy_version 59366 (0.0016) [2024-03-21 11:41:47,805][14687] Fps is (10 sec: 42598.9, 60 sec: 44782.9, 300 sec: 43653.7). Total num frames: 1945403392. Throughput: 0: 42540.0. Samples: 150467700. Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:41:47,806][14687] Avg episode reward: [(0, '0.434')] [2024-03-21 11:41:52,805][14687] Fps is (10 sec: 39320.7, 60 sec: 42052.3, 300 sec: 43542.6). Total num frames: 1945534464. Throughput: 0: 42737.8. Samples: 150735700. Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:41:52,806][14687] Avg episode reward: [(0, '1.494')] [2024-03-21 11:41:54,823][14919] Updated weights for policy 0, policy_version 59376 (0.0011) [2024-03-21 11:41:57,805][14687] Fps is (10 sec: 32767.9, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1945731072. Throughput: 0: 43115.5. Samples: 150862500. Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:41:57,806][14687] Avg episode reward: [(0, '1.531')] [2024-03-21 11:42:01,691][14919] Updated weights for policy 0, policy_version 59386 (0.0016) [2024-03-21 11:42:02,805][14687] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 1946025984. Throughput: 0: 43455.5. Samples: 151133800. Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:42:02,806][14687] Avg episode reward: [(0, '1.127')] [2024-03-21 11:42:07,805][14687] Fps is (10 sec: 52428.8, 60 sec: 46421.4, 300 sec: 43875.8). Total num frames: 1946255360. Throughput: 0: 43242.2. Samples: 151364300. 
Policy #0 lag: (min: 1.0, avg: 42.6, max: 79.0) [2024-03-21 11:42:07,806][14687] Avg episode reward: [(0, '1.288')] [2024-03-21 11:42:08,665][14919] Updated weights for policy 0, policy_version 59396 (0.0012) [2024-03-21 11:42:12,805][14687] Fps is (10 sec: 26214.2, 60 sec: 43144.4, 300 sec: 43431.5). Total num frames: 1946288128. Throughput: 0: 43766.6. Samples: 151504300. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 11:42:12,806][14687] Avg episode reward: [(0, '1.268')] [2024-03-21 11:42:17,805][14687] Fps is (10 sec: 19660.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1946451968. Throughput: 0: 44024.1. Samples: 151760900. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 11:42:17,806][14687] Avg episode reward: [(0, '0.988')] [2024-03-21 11:42:21,763][14919] Updated weights for policy 0, policy_version 59406 (0.0016) [2024-03-21 11:42:22,805][14687] Fps is (10 sec: 32768.4, 60 sec: 40413.9, 300 sec: 42431.8). Total num frames: 1946615808. Throughput: 0: 43346.8. Samples: 152009700. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 11:42:22,806][14687] Avg episode reward: [(0, '1.412')] [2024-03-21 11:42:27,805][14687] Fps is (10 sec: 42598.4, 60 sec: 37683.2, 300 sec: 42431.8). Total num frames: 1946877952. Throughput: 0: 39695.4. Samples: 152122300. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 11:42:27,806][14687] Avg episode reward: [(0, '0.479')] [2024-03-21 11:42:27,819][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059414_1946877952.pth... [2024-03-21 11:42:27,928][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059099_1936556032.pth [2024-03-21 11:42:28,803][14919] Updated weights for policy 0, policy_version 59416 (0.0010) [2024-03-21 11:42:32,805][14687] Fps is (10 sec: 49151.9, 60 sec: 37694.4, 300 sec: 42765.0). Total num frames: 1947107328. Throughput: 0: 41971.1. Samples: 152356400. 
Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 11:42:32,806][14687] Avg episode reward: [(0, '0.912')] [2024-03-21 11:42:37,805][14687] Fps is (10 sec: 36044.8, 60 sec: 37683.3, 300 sec: 42098.5). Total num frames: 1947238400. Throughput: 0: 41349.0. Samples: 152596400. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 11:42:37,805][14687] Avg episode reward: [(0, '1.584')] [2024-03-21 11:42:38,056][14919] Updated weights for policy 0, policy_version 59426 (0.0013) [2024-03-21 11:42:41,904][14898] Signal inference workers to stop experience collection... (3000 times) [2024-03-21 11:42:41,984][14919] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-03-21 11:42:42,140][14898] Signal inference workers to resume experience collection... (3000 times) [2024-03-21 11:42:42,141][14919] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-03-21 11:42:42,805][14687] Fps is (10 sec: 36044.8, 60 sec: 38775.3, 300 sec: 42542.9). Total num frames: 1947467776. Throughput: 0: 41360.0. Samples: 152723700. Policy #0 lag: (min: 0.0, avg: 33.6, max: 75.0) [2024-03-21 11:42:42,806][14687] Avg episode reward: [(0, '1.617')] [2024-03-21 11:42:44,439][14919] Updated weights for policy 0, policy_version 59436 (0.0021) [2024-03-21 11:42:47,805][14687] Fps is (10 sec: 58982.3, 60 sec: 40413.9, 300 sec: 42876.1). Total num frames: 1947828224. Throughput: 0: 40768.9. Samples: 152968400. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:42:47,805][14687] Avg episode reward: [(0, '0.708')] [2024-03-21 11:42:48,481][14919] Updated weights for policy 0, policy_version 59446 (0.0017) [2024-03-21 11:42:52,286][14919] Updated weights for policy 0, policy_version 59456 (0.0015) [2024-03-21 11:42:52,805][14687] Fps is (10 sec: 78643.3, 60 sec: 45329.1, 300 sec: 43764.7). Total num frames: 1948254208. Throughput: 0: 40677.8. Samples: 153194800. 
Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:42:52,806][14687] Avg episode reward: [(0, '0.653')] [2024-03-21 11:42:57,805][14687] Fps is (10 sec: 68813.8, 60 sec: 46421.5, 300 sec: 43764.7). Total num frames: 1948516352. Throughput: 0: 43460.3. Samples: 153460000. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:42:57,805][14687] Avg episode reward: [(0, '0.895')] [2024-03-21 11:42:59,215][14919] Updated weights for policy 0, policy_version 59466 (0.0019) [2024-03-21 11:43:02,805][14687] Fps is (10 sec: 55705.6, 60 sec: 46421.4, 300 sec: 44320.1). Total num frames: 1948811264. Throughput: 0: 40793.3. Samples: 153596600. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:43:02,806][14687] Avg episode reward: [(0, '1.434')] [2024-03-21 11:43:07,805][14687] Fps is (10 sec: 32767.4, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 1948844032. Throughput: 0: 41386.7. Samples: 153872100. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:43:07,806][14687] Avg episode reward: [(0, '1.676')] [2024-03-21 11:43:11,372][14919] Updated weights for policy 0, policy_version 59476 (0.0017) [2024-03-21 11:43:12,805][14687] Fps is (10 sec: 16384.1, 60 sec: 44783.1, 300 sec: 43209.3). Total num frames: 1948975104. Throughput: 0: 44360.0. Samples: 154118500. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:43:12,806][14687] Avg episode reward: [(0, '1.404')] [2024-03-21 11:43:17,805][14687] Fps is (10 sec: 16384.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1949007872. Throughput: 0: 42280.1. Samples: 154259000. Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:43:17,806][14687] Avg episode reward: [(0, '1.498')] [2024-03-21 11:43:21,164][14919] Updated weights for policy 0, policy_version 59486 (0.0016) [2024-03-21 11:43:22,805][14687] Fps is (10 sec: 29491.2, 60 sec: 44236.9, 300 sec: 43098.3). Total num frames: 1949270016. Throughput: 0: 42806.7. Samples: 154522700. 
Policy #0 lag: (min: 3.0, avg: 39.1, max: 73.0) [2024-03-21 11:43:22,805][14687] Avg episode reward: [(0, '1.499')] [2024-03-21 11:43:27,805][14687] Fps is (10 sec: 36044.9, 60 sec: 41506.2, 300 sec: 42876.1). Total num frames: 1949368320. Throughput: 0: 43322.3. Samples: 154673200. Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:43:27,806][14687] Avg episode reward: [(0, '0.688')] [2024-03-21 11:43:31,900][14919] Updated weights for policy 0, policy_version 59496 (0.0015) [2024-03-21 11:43:32,805][14687] Fps is (10 sec: 36044.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1949630464. Throughput: 0: 43980.0. Samples: 154947500. Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:43:32,806][14687] Avg episode reward: [(0, '0.688')] [2024-03-21 11:43:36,497][14898] Signal inference workers to stop experience collection... (3050 times) [2024-03-21 11:43:36,548][14919] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-03-21 11:43:36,780][14898] Signal inference workers to resume experience collection... (3050 times) [2024-03-21 11:43:36,780][14919] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-03-21 11:43:37,419][14919] Updated weights for policy 0, policy_version 59506 (0.0015) [2024-03-21 11:43:37,805][14687] Fps is (10 sec: 52428.1, 60 sec: 44236.7, 300 sec: 42876.1). Total num frames: 1949892608. Throughput: 0: 44173.3. Samples: 155182600. Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:43:37,806][14687] Avg episode reward: [(0, '1.347')] [2024-03-21 11:43:42,805][14687] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 1950089216. Throughput: 0: 41306.6. Samples: 155318800. 
Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:43:42,805][14687] Avg episode reward: [(0, '1.347')] [2024-03-21 11:43:45,229][14919] Updated weights for policy 0, policy_version 59516 (0.0012) [2024-03-21 11:43:47,805][14687] Fps is (10 sec: 49152.2, 60 sec: 42598.4, 300 sec: 43431.5). Total num frames: 1950384128. Throughput: 0: 44604.4. Samples: 155603800. Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:43:47,806][14687] Avg episode reward: [(0, '0.737')] [2024-03-21 11:43:49,821][14919] Updated weights for policy 0, policy_version 59526 (0.0012) [2024-03-21 11:43:52,805][14687] Fps is (10 sec: 72088.8, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1950810112. Throughput: 0: 43960.0. Samples: 155850300. Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:43:52,806][14687] Avg episode reward: [(0, '0.737')] [2024-03-21 11:43:54,786][14919] Updated weights for policy 0, policy_version 59536 (0.0026) [2024-03-21 11:43:57,805][14687] Fps is (10 sec: 65535.9, 60 sec: 42052.1, 300 sec: 43875.8). Total num frames: 1951039488. Throughput: 0: 44062.1. Samples: 156101300. Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:43:57,806][14687] Avg episode reward: [(0, '1.593')] [2024-03-21 11:44:00,654][14919] Updated weights for policy 0, policy_version 59546 (0.0024) [2024-03-21 11:44:02,805][14687] Fps is (10 sec: 39321.7, 60 sec: 39867.7, 300 sec: 43542.6). Total num frames: 1951203328. Throughput: 0: 43946.6. Samples: 156236600. Policy #0 lag: (min: 0.0, avg: 29.2, max: 73.0) [2024-03-21 11:44:02,806][14687] Avg episode reward: [(0, '0.750')] [2024-03-21 11:44:07,805][14687] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 43542.5). Total num frames: 1951432704. Throughput: 0: 43533.3. Samples: 156481700. 
Policy #0 lag: (min: 0.0, avg: 52.7, max: 106.0) [2024-03-21 11:44:07,806][14687] Avg episode reward: [(0, '1.085')] [2024-03-21 11:44:09,548][14919] Updated weights for policy 0, policy_version 59556 (0.0021) [2024-03-21 11:44:12,805][14687] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 1951662080. Throughput: 0: 46128.9. Samples: 156749000. Policy #0 lag: (min: 0.0, avg: 52.7, max: 106.0) [2024-03-21 11:44:12,806][14687] Avg episode reward: [(0, '0.789')] [2024-03-21 11:44:17,805][14687] Fps is (10 sec: 39321.5, 60 sec: 46967.4, 300 sec: 43431.5). Total num frames: 1951825920. Throughput: 0: 43197.7. Samples: 156891400. Policy #0 lag: (min: 0.0, avg: 52.7, max: 106.0) [2024-03-21 11:44:17,806][14687] Avg episode reward: [(0, '0.789')] [2024-03-21 11:44:18,102][14919] Updated weights for policy 0, policy_version 59566 (0.0010) [2024-03-21 11:44:22,805][14687] Fps is (10 sec: 36044.7, 60 sec: 45875.2, 300 sec: 43431.5). Total num frames: 1952022528. Throughput: 0: 43975.6. Samples: 157161500. Policy #0 lag: (min: 0.0, avg: 52.7, max: 106.0) [2024-03-21 11:44:22,806][14687] Avg episode reward: [(0, '0.866')] [2024-03-21 11:44:27,805][14687] Fps is (10 sec: 26214.4, 60 sec: 45329.0, 300 sec: 42876.1). Total num frames: 1952088064. Throughput: 0: 44297.7. Samples: 157312200. Policy #0 lag: (min: 0.0, avg: 52.7, max: 106.0) [2024-03-21 11:44:27,806][14687] Avg episode reward: [(0, '0.943')] [2024-03-21 11:44:27,819][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059573_1952088064.pth... [2024-03-21 11:44:27,944][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059264_1941962752.pth [2024-03-21 11:44:30,949][14919] Updated weights for policy 0, policy_version 59576 (0.0011) [2024-03-21 11:44:32,805][14687] Fps is (10 sec: 32767.6, 60 sec: 45329.0, 300 sec: 43098.2). Total num frames: 1952350208. Throughput: 0: 43857.7. Samples: 157577400. 
Policy #0 lag: (min: 0.0, avg: 52.7, max: 106.0) [2024-03-21 11:44:32,806][14687] Avg episode reward: [(0, '0.869')] [2024-03-21 11:44:33,100][14898] Signal inference workers to stop experience collection... (3100 times) [2024-03-21 11:44:33,101][14898] Signal inference workers to resume experience collection... (3100 times) [2024-03-21 11:44:33,166][14919] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-03-21 11:44:33,167][14919] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-03-21 11:44:34,564][14919] Updated weights for policy 0, policy_version 59586 (0.0013) [2024-03-21 11:44:37,805][14687] Fps is (10 sec: 65536.1, 60 sec: 47513.6, 300 sec: 43764.7). Total num frames: 1952743424. Throughput: 0: 43728.9. Samples: 157818100. Policy #0 lag: (min: 0.0, avg: 52.7, max: 106.0) [2024-03-21 11:44:37,806][14687] Avg episode reward: [(0, '1.021')] [2024-03-21 11:44:38,584][14919] Updated weights for policy 0, policy_version 59596 (0.0013) [2024-03-21 11:44:42,805][14687] Fps is (10 sec: 72090.6, 60 sec: 49698.1, 300 sec: 44320.1). Total num frames: 1953071104. Throughput: 0: 43862.3. Samples: 158075100. Policy #0 lag: (min: 0.0, avg: 50.5, max: 103.0) [2024-03-21 11:44:42,806][14687] Avg episode reward: [(0, '1.021')] [2024-03-21 11:44:47,815][14687] Fps is (10 sec: 36009.2, 60 sec: 45321.6, 300 sec: 43763.2). Total num frames: 1953103872. Throughput: 0: 44221.4. Samples: 158227000. Policy #0 lag: (min: 0.0, avg: 50.5, max: 103.0) [2024-03-21 11:44:47,816][14687] Avg episode reward: [(0, '1.567')] [2024-03-21 11:44:48,475][14919] Updated weights for policy 0, policy_version 59606 (0.0014) [2024-03-21 11:44:52,805][14687] Fps is (10 sec: 29490.8, 60 sec: 42598.4, 300 sec: 43653.6). Total num frames: 1953366016. Throughput: 0: 44959.9. Samples: 158504900. 
Policy #0 lag: (min: 0.0, avg: 50.5, max: 103.0) [2024-03-21 11:44:52,806][14687] Avg episode reward: [(0, '1.567')] [2024-03-21 11:44:56,668][14919] Updated weights for policy 0, policy_version 59616 (0.0011) [2024-03-21 11:44:57,805][14687] Fps is (10 sec: 42640.5, 60 sec: 41506.2, 300 sec: 43764.7). Total num frames: 1953529856. Throughput: 0: 44731.0. Samples: 158761900. Policy #0 lag: (min: 0.0, avg: 50.5, max: 103.0) [2024-03-21 11:44:57,806][14687] Avg episode reward: [(0, '0.608')] [2024-03-21 11:45:02,805][14687] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 43098.3). Total num frames: 1953759232. Throughput: 0: 44468.9. Samples: 158892500. Policy #0 lag: (min: 0.0, avg: 50.5, max: 103.0) [2024-03-21 11:45:02,806][14687] Avg episode reward: [(0, '0.997')] [2024-03-21 11:45:03,267][14919] Updated weights for policy 0, policy_version 59626 (0.0017) [2024-03-21 11:45:07,805][14687] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 1954054144. Throughput: 0: 44373.2. Samples: 159158300. Policy #0 lag: (min: 0.0, avg: 50.5, max: 103.0) [2024-03-21 11:45:07,806][14687] Avg episode reward: [(0, '1.187')] [2024-03-21 11:45:10,099][14919] Updated weights for policy 0, policy_version 59636 (0.0013) [2024-03-21 11:45:12,805][14687] Fps is (10 sec: 49152.5, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 1954250752. Throughput: 0: 47233.5. Samples: 159437700. Policy #0 lag: (min: 0.0, avg: 50.5, max: 103.0) [2024-03-21 11:45:12,805][14687] Avg episode reward: [(0, '1.187')] [2024-03-21 11:45:17,805][14687] Fps is (10 sec: 36045.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1954414592. Throughput: 0: 44360.0. Samples: 159573600. 
Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:17,806][14687] Avg episode reward: [(0, '1.542')] [2024-03-21 11:45:18,468][14919] Updated weights for policy 0, policy_version 59646 (0.0012) [2024-03-21 11:45:22,805][14687] Fps is (10 sec: 45874.4, 60 sec: 44782.9, 300 sec: 43320.4). Total num frames: 1954709504. Throughput: 0: 45179.9. Samples: 159851200. Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:22,806][14687] Avg episode reward: [(0, '1.507')] [2024-03-21 11:45:24,733][14919] Updated weights for policy 0, policy_version 59656 (0.0021) [2024-03-21 11:45:25,396][14898] Signal inference workers to stop experience collection... (3150 times) [2024-03-21 11:45:25,471][14898] Signal inference workers to resume experience collection... (3150 times) [2024-03-21 11:45:25,475][14919] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-03-21 11:45:25,528][14919] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-03-21 11:45:27,805][14687] Fps is (10 sec: 65536.2, 60 sec: 49698.1, 300 sec: 43764.7). Total num frames: 1955069952. Throughput: 0: 42433.2. Samples: 159984600. Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:27,806][14687] Avg episode reward: [(0, '1.507')] [2024-03-21 11:45:29,664][14919] Updated weights for policy 0, policy_version 59666 (0.0018) [2024-03-21 11:45:32,805][14687] Fps is (10 sec: 55705.8, 60 sec: 48605.9, 300 sec: 43764.7). Total num frames: 1955266560. Throughput: 0: 45014.3. Samples: 160252200. Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:32,806][14687] Avg episode reward: [(0, '1.690')] [2024-03-21 11:45:37,805][14687] Fps is (10 sec: 36045.2, 60 sec: 44783.0, 300 sec: 43653.7). Total num frames: 1955430400. Throughput: 0: 44835.7. Samples: 160522500. 
Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:37,805][14687] Avg episode reward: [(0, '1.630')] [2024-03-21 11:45:38,026][14919] Updated weights for policy 0, policy_version 59676 (0.0020) [2024-03-21 11:45:42,805][14687] Fps is (10 sec: 45875.2, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 1955725312. Throughput: 0: 42140.0. Samples: 160658200. Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:42,806][14687] Avg episode reward: [(0, '0.820')] [2024-03-21 11:45:47,404][14919] Updated weights for policy 0, policy_version 59686 (0.0016) [2024-03-21 11:45:47,805][14687] Fps is (10 sec: 39321.4, 60 sec: 45336.5, 300 sec: 43431.5). Total num frames: 1955823616. Throughput: 0: 45033.3. Samples: 160919000. Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:47,806][14687] Avg episode reward: [(0, '1.048')] [2024-03-21 11:45:52,805][14687] Fps is (10 sec: 19660.6, 60 sec: 42598.4, 300 sec: 43431.5). Total num frames: 1955921920. Throughput: 0: 45326.7. Samples: 161198000. Policy #0 lag: (min: 0.0, avg: 40.9, max: 81.0) [2024-03-21 11:45:52,806][14687] Avg episode reward: [(0, '1.395')] [2024-03-21 11:45:56,265][14919] Updated weights for policy 0, policy_version 59696 (0.0012) [2024-03-21 11:45:57,805][14687] Fps is (10 sec: 29491.3, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 1956118528. Throughput: 0: 41951.0. Samples: 161325500. Policy #0 lag: (min: 0.0, avg: 51.2, max: 97.0) [2024-03-21 11:45:57,805][14687] Avg episode reward: [(0, '1.311')] [2024-03-21 11:46:02,805][14687] Fps is (10 sec: 36045.2, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 1956282368. Throughput: 0: 44444.5. Samples: 161573600. 
Policy #0 lag: (min: 0.0, avg: 51.2, max: 97.0) [2024-03-21 11:46:02,806][14687] Avg episode reward: [(0, '0.568')] [2024-03-21 11:46:05,799][14919] Updated weights for policy 0, policy_version 59706 (0.0012) [2024-03-21 11:46:07,805][14687] Fps is (10 sec: 45875.5, 60 sec: 42052.4, 300 sec: 43653.6). Total num frames: 1956577280. Throughput: 0: 43640.1. Samples: 161815000. Policy #0 lag: (min: 0.0, avg: 51.2, max: 97.0) [2024-03-21 11:46:07,805][14687] Avg episode reward: [(0, '0.695')] [2024-03-21 11:46:11,510][14919] Updated weights for policy 0, policy_version 59716 (0.0011) [2024-03-21 11:46:12,805][14687] Fps is (10 sec: 52429.2, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 1956806656. Throughput: 0: 46273.5. Samples: 162066900. Policy #0 lag: (min: 0.0, avg: 51.2, max: 97.0) [2024-03-21 11:46:12,806][14687] Avg episode reward: [(0, '1.011')] [2024-03-21 11:46:17,805][14687] Fps is (10 sec: 45874.4, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1957036032. Throughput: 0: 43484.4. Samples: 162209000. Policy #0 lag: (min: 0.0, avg: 51.2, max: 97.0) [2024-03-21 11:46:17,806][14687] Avg episode reward: [(0, '1.011')] [2024-03-21 11:46:19,257][14919] Updated weights for policy 0, policy_version 59726 (0.0016) [2024-03-21 11:46:22,805][14687] Fps is (10 sec: 58982.0, 60 sec: 44783.0, 300 sec: 43320.4). Total num frames: 1957396480. Throughput: 0: 42964.4. Samples: 162455900. Policy #0 lag: (min: 0.0, avg: 51.2, max: 97.0) [2024-03-21 11:46:22,806][14687] Avg episode reward: [(0, '0.624')] [2024-03-21 11:46:22,978][14919] Updated weights for policy 0, policy_version 59736 (0.0014) [2024-03-21 11:46:25,843][14898] Signal inference workers to stop experience collection... (3200 times) [2024-03-21 11:46:25,843][14898] Signal inference workers to resume experience collection... 
(3200 times) [2024-03-21 11:46:25,924][14919] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-03-21 11:46:25,924][14919] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-03-21 11:46:27,805][14687] Fps is (10 sec: 52429.3, 60 sec: 41506.2, 300 sec: 43100.8). Total num frames: 1957560320. Throughput: 0: 45997.8. Samples: 162728100. Policy #0 lag: (min: 0.0, avg: 51.2, max: 97.0) [2024-03-21 11:46:27,806][14687] Avg episode reward: [(0, '0.609')] [2024-03-21 11:46:27,818][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059740_1957560320.pth... [2024-03-21 11:46:27,930][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059414_1946877952.pth [2024-03-21 11:46:31,298][14919] Updated weights for policy 0, policy_version 59746 (0.0015) [2024-03-21 11:46:32,805][14687] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 43653.7). Total num frames: 1957855232. Throughput: 0: 43135.6. Samples: 162860100. Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:46:32,805][14687] Avg episode reward: [(0, '1.597')] [2024-03-21 11:46:35,780][14919] Updated weights for policy 0, policy_version 59756 (0.0017) [2024-03-21 11:46:37,805][14687] Fps is (10 sec: 52428.6, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 1958084608. Throughput: 0: 42475.6. Samples: 163109400. Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:46:37,806][14687] Avg episode reward: [(0, '0.937')] [2024-03-21 11:46:42,805][14687] Fps is (10 sec: 36044.8, 60 sec: 41506.1, 300 sec: 43431.5). Total num frames: 1958215680. Throughput: 0: 45657.8. Samples: 163380100. Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:46:42,806][14687] Avg episode reward: [(0, '1.219')] [2024-03-21 11:46:47,805][14687] Fps is (10 sec: 29491.1, 60 sec: 42598.4, 300 sec: 43542.6). Total num frames: 1958379520. Throughput: 0: 42919.9. Samples: 163505000. 
Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:46:47,806][14687] Avg episode reward: [(0, '1.248')] [2024-03-21 11:46:47,987][14919] Updated weights for policy 0, policy_version 59766 (0.0011) [2024-03-21 11:46:52,805][14687] Fps is (10 sec: 42598.6, 60 sec: 45329.2, 300 sec: 43764.7). Total num frames: 1958641664. Throughput: 0: 43882.2. Samples: 163789700. Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:46:52,805][14687] Avg episode reward: [(0, '1.248')] [2024-03-21 11:46:54,023][14919] Updated weights for policy 0, policy_version 59776 (0.0014) [2024-03-21 11:46:57,805][14687] Fps is (10 sec: 52428.9, 60 sec: 46421.3, 300 sec: 43653.6). Total num frames: 1958903808. Throughput: 0: 43633.2. Samples: 164030400. Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:46:57,806][14687] Avg episode reward: [(0, '1.460')] [2024-03-21 11:47:02,805][14687] Fps is (10 sec: 36044.4, 60 sec: 45329.0, 300 sec: 43209.3). Total num frames: 1959002112. Throughput: 0: 43266.7. Samples: 164156000. Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:47:02,806][14687] Avg episode reward: [(0, '0.532')] [2024-03-21 11:47:04,665][14919] Updated weights for policy 0, policy_version 59786 (0.0013) [2024-03-21 11:47:07,805][14687] Fps is (10 sec: 19660.8, 60 sec: 42052.2, 300 sec: 43431.5). Total num frames: 1959100416. Throughput: 0: 43300.0. Samples: 164404400. Policy #0 lag: (min: 2.0, avg: 45.2, max: 83.0) [2024-03-21 11:47:07,806][14687] Avg episode reward: [(0, '1.251')] [2024-03-21 11:47:12,805][14687] Fps is (10 sec: 36045.1, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 1959362560. Throughput: 0: 40002.2. Samples: 164528200. 
Policy #0 lag: (min: 1.0, avg: 26.9, max: 64.0) [2024-03-21 11:47:12,806][14687] Avg episode reward: [(0, '1.208')] [2024-03-21 11:47:13,440][14919] Updated weights for policy 0, policy_version 59796 (0.0014) [2024-03-21 11:47:17,805][14687] Fps is (10 sec: 52428.8, 60 sec: 43144.6, 300 sec: 44097.9). Total num frames: 1959624704. Throughput: 0: 42671.1. Samples: 164780300. Policy #0 lag: (min: 1.0, avg: 26.9, max: 64.0) [2024-03-21 11:47:17,806][14687] Avg episode reward: [(0, '1.545')] [2024-03-21 11:47:20,706][14919] Updated weights for policy 0, policy_version 59806 (0.0016) [2024-03-21 11:47:22,805][14687] Fps is (10 sec: 42598.5, 60 sec: 39867.8, 300 sec: 43764.7). Total num frames: 1959788544. Throughput: 0: 42877.9. Samples: 165038900. Policy #0 lag: (min: 1.0, avg: 26.9, max: 64.0) [2024-03-21 11:47:22,806][14687] Avg episode reward: [(0, '1.660')] [2024-03-21 11:47:27,805][14687] Fps is (10 sec: 22937.7, 60 sec: 38229.3, 300 sec: 43209.3). Total num frames: 1959854080. Throughput: 0: 39715.6. Samples: 165167300. Policy #0 lag: (min: 1.0, avg: 26.9, max: 64.0) [2024-03-21 11:47:27,806][14687] Avg episode reward: [(0, '1.096')] [2024-03-21 11:47:30,800][14919] Updated weights for policy 0, policy_version 59816 (0.0017) [2024-03-21 11:47:32,693][14898] Signal inference workers to stop experience collection... (3250 times) [2024-03-21 11:47:32,784][14919] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-03-21 11:47:32,805][14687] Fps is (10 sec: 32767.6, 60 sec: 37683.2, 300 sec: 43653.6). Total num frames: 1960116224. Throughput: 0: 42233.3. Samples: 165405500. Policy #0 lag: (min: 1.0, avg: 26.9, max: 64.0) [2024-03-21 11:47:32,806][14687] Avg episode reward: [(0, '0.944')] [2024-03-21 11:47:32,822][14898] Signal inference workers to resume experience collection... 
(3250 times) [2024-03-21 11:47:32,824][14919] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-03-21 11:47:35,551][14919] Updated weights for policy 0, policy_version 59826 (0.0019) [2024-03-21 11:47:37,805][14687] Fps is (10 sec: 55705.5, 60 sec: 38775.5, 300 sec: 43875.8). Total num frames: 1960411136. Throughput: 0: 40686.6. Samples: 165620600. Policy #0 lag: (min: 1.0, avg: 26.9, max: 64.0) [2024-03-21 11:47:37,806][14687] Avg episode reward: [(0, '1.123')] [2024-03-21 11:47:42,805][14687] Fps is (10 sec: 55706.2, 60 sec: 40960.0, 300 sec: 43542.6). Total num frames: 1960673280. Throughput: 0: 41346.8. Samples: 165891000. Policy #0 lag: (min: 1.0, avg: 26.9, max: 64.0) [2024-03-21 11:47:42,806][14687] Avg episode reward: [(0, '1.123')] [2024-03-21 11:47:43,300][14919] Updated weights for policy 0, policy_version 59836 (0.0023) [2024-03-21 11:47:47,655][14919] Updated weights for policy 0, policy_version 59846 (0.0047) [2024-03-21 11:47:47,805][14687] Fps is (10 sec: 62259.1, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 1961033728. Throughput: 0: 41511.2. Samples: 166024000. Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:47:47,806][14687] Avg episode reward: [(0, '0.976')] [2024-03-21 11:47:52,805][14687] Fps is (10 sec: 62259.0, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 1961295872. Throughput: 0: 41340.0. Samples: 166264700. Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:47:52,806][14687] Avg episode reward: [(0, '0.778')] [2024-03-21 11:47:56,700][14919] Updated weights for policy 0, policy_version 59856 (0.0017) [2024-03-21 11:47:57,805][14687] Fps is (10 sec: 36045.1, 60 sec: 41506.2, 300 sec: 42653.9). Total num frames: 1961394176. Throughput: 0: 44346.7. Samples: 166523800. 
Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:47:57,806][14687] Avg episode reward: [(0, '0.882')] [2024-03-21 11:48:01,833][14919] Updated weights for policy 0, policy_version 59866 (0.0015) [2024-03-21 11:48:02,805][14687] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 43653.6). Total num frames: 1961721856. Throughput: 0: 41320.1. Samples: 166639700. Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:48:02,806][14687] Avg episode reward: [(0, '0.668')] [2024-03-21 11:48:07,805][14687] Fps is (10 sec: 52428.2, 60 sec: 46967.5, 300 sec: 43875.8). Total num frames: 1961918464. Throughput: 0: 41746.6. Samples: 166917500. Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:48:07,806][14687] Avg episode reward: [(0, '0.668')] [2024-03-21 11:48:11,651][14919] Updated weights for policy 0, policy_version 59876 (0.0011) [2024-03-21 11:48:12,805][14687] Fps is (10 sec: 29491.2, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1962016768. Throughput: 0: 42160.0. Samples: 167064500. Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:48:12,806][14687] Avg episode reward: [(0, '1.364')] [2024-03-21 11:48:17,805][14687] Fps is (10 sec: 22937.5, 60 sec: 42052.2, 300 sec: 43653.6). Total num frames: 1962147840. Throughput: 0: 42773.3. Samples: 167330300. Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:48:17,806][14687] Avg episode reward: [(0, '1.068')] [2024-03-21 11:48:22,217][14919] Updated weights for policy 0, policy_version 59886 (0.0011) [2024-03-21 11:48:22,805][14687] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 1962377216. Throughput: 0: 43986.7. Samples: 167600000. Policy #0 lag: (min: 4.0, avg: 62.2, max: 111.0) [2024-03-21 11:48:22,806][14687] Avg episode reward: [(0, '1.076')] [2024-03-21 11:48:25,953][14898] Signal inference workers to stop experience collection... 
(3300 times) [2024-03-21 11:48:26,042][14919] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-03-21 11:48:26,198][14898] Signal inference workers to resume experience collection... (3300 times) [2024-03-21 11:48:26,199][14919] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-03-21 11:48:27,805][14687] Fps is (10 sec: 42599.0, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 1962573824. Throughput: 0: 43044.4. Samples: 167828000. Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:48:27,806][14687] Avg episode reward: [(0, '1.294')] [2024-03-21 11:48:27,825][14898] Saving /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059893_1962573824.pth... [2024-03-21 11:48:27,979][14898] Removing /workspace/metta/train_dir/p2.objt_atn.4/checkpoint_p0/checkpoint_000059573_1952088064.pth [2024-03-21 11:48:31,571][14919] Updated weights for policy 0, policy_version 59896 (0.0018) [2024-03-21 11:48:32,805][14687] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1962737664. Throughput: 0: 42975.6. Samples: 167957900. Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:48:32,806][14687] Avg episode reward: [(0, '1.065')] [2024-03-21 11:48:37,805][14687] Fps is (10 sec: 36044.6, 60 sec: 42052.2, 300 sec: 43542.5). Total num frames: 1962934272. Throughput: 0: 43462.2. Samples: 168220500. Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:48:37,806][14687] Avg episode reward: [(0, '1.145')] [2024-03-21 11:48:39,884][14919] Updated weights for policy 0, policy_version 59906 (0.0013) [2024-03-21 11:48:42,805][14687] Fps is (10 sec: 36045.0, 60 sec: 40413.9, 300 sec: 43098.3). Total num frames: 1963098112. Throughput: 0: 40931.1. Samples: 168365700. 
Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:48:42,805][14687] Avg episode reward: [(0, '1.202')] [2024-03-21 11:48:44,985][14919] Updated weights for policy 0, policy_version 59916 (0.0021) [2024-03-21 11:48:47,805][14687] Fps is (10 sec: 52429.0, 60 sec: 40413.9, 300 sec: 42876.1). Total num frames: 1963458560. Throughput: 0: 43800.0. Samples: 168610700. Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:48:47,806][14687] Avg episode reward: [(0, '1.293')] [2024-03-21 11:48:50,811][14919] Updated weights for policy 0, policy_version 59926 (0.0016) [2024-03-21 11:48:52,805][14687] Fps is (10 sec: 68812.1, 60 sec: 41506.1, 300 sec: 43209.3). Total num frames: 1963786240. Throughput: 0: 43064.5. Samples: 168855400. Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:48:52,806][14687] Avg episode reward: [(0, '0.851')] [2024-03-21 11:48:56,382][14919] Updated weights for policy 0, policy_version 59936 (0.0010) [2024-03-21 11:48:57,805][14687] Fps is (10 sec: 65535.7, 60 sec: 45329.0, 300 sec: 43764.7). Total num frames: 1964113920. Throughput: 0: 45295.5. Samples: 169102800. Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:48:57,806][14687] Avg episode reward: [(0, '1.170')] [2024-03-21 11:49:00,810][14919] Updated weights for policy 0, policy_version 59946 (0.0022) [2024-03-21 11:49:02,805][14687] Fps is (10 sec: 52429.2, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 1964310528. Throughput: 0: 42442.3. Samples: 169240200. Policy #0 lag: (min: 0.0, avg: 24.0, max: 57.0) [2024-03-21 11:49:02,806][14687] Avg episode reward: [(0, '1.268')] [2024-03-21 11:49:07,805][14687] Fps is (10 sec: 52429.3, 60 sec: 45329.2, 300 sec: 43986.9). Total num frames: 1964638208. Throughput: 0: 42306.7. Samples: 169503800. 
Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:07,806][14687] Avg episode reward: [(0, '0.766')] [2024-03-21 11:49:07,814][14919] Updated weights for policy 0, policy_version 59956 (0.0012) [2024-03-21 11:49:12,805][14687] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 1964670976. Throughput: 0: 43782.2. Samples: 169798200. Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:12,806][14687] Avg episode reward: [(0, '0.766')] [2024-03-21 11:49:17,805][14687] Fps is (10 sec: 22937.4, 60 sec: 45329.1, 300 sec: 43542.6). Total num frames: 1964867584. Throughput: 0: 43906.6. Samples: 169933700. Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:17,806][14687] Avg episode reward: [(0, '0.766')] [2024-03-21 11:49:19,664][14898] Signal inference workers to stop experience collection... (3350 times) [2024-03-21 11:49:19,665][14898] Signal inference workers to resume experience collection... (3350 times) [2024-03-21 11:49:19,730][14919] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-03-21 11:49:19,730][14919] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-03-21 11:49:20,011][14919] Updated weights for policy 0, policy_version 59966 (0.0011) [2024-03-21 11:49:22,805][14687] Fps is (10 sec: 32767.7, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 1964998656. Throughput: 0: 44044.4. Samples: 170202500. Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:22,806][14687] Avg episode reward: [(0, '1.248')] [2024-03-21 11:49:27,805][14687] Fps is (10 sec: 22937.8, 60 sec: 42052.3, 300 sec: 43209.4). Total num frames: 1965096960. Throughput: 0: 47084.4. Samples: 170484500. 
Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:27,806][14687] Avg episode reward: [(0, '1.199')] [2024-03-21 11:49:30,456][14919] Updated weights for policy 0, policy_version 59976 (0.0012) [2024-03-21 11:49:32,805][14687] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 43098.2). Total num frames: 1965457408. Throughput: 0: 44406.6. Samples: 170609000. Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:32,806][14687] Avg episode reward: [(0, '1.529')] [2024-03-21 11:49:34,585][14919] Updated weights for policy 0, policy_version 59986 (0.0020) [2024-03-21 11:49:37,805][14687] Fps is (10 sec: 78642.3, 60 sec: 49152.0, 300 sec: 43431.5). Total num frames: 1965883392. Throughput: 0: 44544.4. Samples: 170859900. Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:37,806][14687] Avg episode reward: [(0, '1.310')] [2024-03-21 11:49:42,805][14687] Fps is (10 sec: 45876.3, 60 sec: 46967.5, 300 sec: 43433.0). Total num frames: 1965916160. Throughput: 0: 41782.4. Samples: 170983000. Policy #0 lag: (min: 0.0, avg: 67.3, max: 130.0) [2024-03-21 11:49:42,805][14687] Avg episode reward: [(0, '0.701')] [2024-03-21 11:49:44,041][14919] Updated weights for policy 0, policy_version 59996 (0.0015) [2024-03-21 11:49:47,805][14687] Fps is (10 sec: 22937.9, 60 sec: 44236.8, 300 sec: 43209.4). Total num frames: 1966112768. Throughput: 0: 43488.9. Samples: 171197200. Policy #0 lag: (min: 0.0, avg: 35.6, max: 72.0) [2024-03-21 11:49:47,805][14687] Avg episode reward: [(0, '1.646')]